The Sorting Application
The examples in this document all part of the Java Application Sorting. Download the source archive Sorting.zip and install it as a Java Application with Existing Sources. See the Using NetBeans document for details. Like others, the Sorting project has multiple Main Classes intended to illustrate various independent features. The simplest way to run the various classes is to right-click the file and select Run File either in the file itself or from the listing in the Projects window.Algorithmic comparisons
Sorting algorithms provide the ability for one to impress another computer scientist with his or her knowledge of algorithmic understanding. First of all, O(n*log n) acts as a lower bound to how quickly we can sort using a comparison-based sorting algorithm (we can sort faster than that in certain special cases).algorithm | ¿stable? | best time | average time | worst time | extra memory |
---|---|---|---|---|---|
selectionsort | no | O(n^{2}) | O(n^{2}) | O(n^{2}) | O(1) |
insertionsort | yes | O(n) | O(n^{2}) | O(n^{2}) | O(1) |
shellsort | no | O(n*log(n)) | O(n^{1.25})^{†} | O(n^{1.5}) | O(1) |
quicksort | no | O(n*log(n)) | O(n*log(n)) | O(n^{2}) | O(log(n)) |
mergesort | yes | O(n*log(n)) | O(n*log(n)) | O(n*log(n)) | O(n) ^{†} |
heapsort | no | O(n) | O(n*log(n)) | O(n*log(n)) | O(1) ^{†} |
The Shellsort algorithm discussed in this document uses the so-called Hibbard increments (see below). Although Shellsort is not theoretically as fast as others, it is still comparable in speed when arrays are not huge and enhanced sorting increments are employed.
Shellsort
Shellsort, named after its inventor, Donald Shell, relies upon the fact that insertion sort does very well if the array is nearly sorted. Another way of saying this, is that insertion sort does well if it does not have to move each item "too far". The idea is to repeatedly do insertion sort on all elements at fixed, decreasing distances apart: h_{k}, h_{k-1}, ..., h_{1}= 1. The choice of increments turns out to be crucial. It turns out that a good choice of increments are the so-called Hibbard increments:h_{1}= 1, h_{2}= 3, h_{3}= 7, ..., h_{k}= 2^{k}–1The original increments proposed by the algorithm's inventor were simple powers of 2, but the Hibbard increments do provably much better. To be able to use the h_{k} increment, you need an array of size at least h_{k}+1.
Psuedo-code
The psuedo-code for shellSort using the Hibbard increments is as follows:find k_{0} so that 2^{k0} - 1 < size for (k = k_{0}; k > 0; --k) { // from larger increments to smaller inc = 2^{k}- 1; // get array increment for (i = 0; i < inc; ++i) { insertionSort( [ a[i], a[i+inc], a[i+2*inc], ... ] ); } }The fact that the last increment in the sequence is 1 means that regular insertion sort is done at the last step and therefore the array is guaranteed to be sorted by this procedure. The point is that when the increments are larger, there are fewer elements and they will be moved further than simply interchanging adjacent elements. At the last step, we do regular insertion sort and hopefully the array is "nearly sorted" which makes insertion sort come close to its best case behavior of running in linear time. The notion that this is an speed improvement seems initially far-fetched. There are two enclosing for loops to get to an insertion sort, thus this algorithm has four enclosing loops.
Demo
The following is a demo of the sorting process of an array of size 11. Only 4 subarrays of 7 elements apart fit.0 1 2 3 4 5 6 7 8 9 10 - - - - - - - - - - -- [6, 7, 8, 6, 9, 7, 2, 2, 2, 9, 8] ^ ^ [2, 7, 8, 6, 9, 7, 2, 6, 2, 9, 8] ^ ^ [2, 2, 8, 6, 9, 7, 2, 6, 7, 9, 8] ^ ^ [2, 2, 8, 6, 9, 7, 2, 6, 7, 9, 8] ^ ^ [2, 2, 8, 6, 9, 7, 2, 6, 7, 9, 8] 7's now sorted ^ ^ ^ ^ ---- 4 elements 3 apart [2, 2, 8, 2, 9, 7, 6, 6, 7, 9, 8] ^ ^ ^ ^ ---- 4 elements 3 apart [2, 2, 8, 2, 6, 7, 6, 8, 7, 9, 9] ^ ^ ^ ---- 3 elements 3 apart [2, 2, 7, 2, 6, 7, 6, 8, 8, 9, 9] 3's now sorted [2, 2, 2, 6, 6, 7, 7, 8, 8, 9, 9] regular insertion sort
ShellSort Code
Here is the code:
Shell Sort: Show
To get a sense of how it works, test it in ShowSorts by setting
// permit us to specify the shell sort increments externally static LinkedList<Integer> increments = null; static void shellSort(int[] a, int from, int to) { int size = to - from; if (increments == null) { // set increments to the default increments = new LinkedList<>(); for (int inc = 1; inc < size; inc = 2 * (inc + 1) - 1) { increments.add(inc); } } java.util.Iterator<Integer> iter = increments.descendingIterator(); while (iter.hasNext()) { int gap = iter.next(); for (int k = 0; k < gap; ++k) { // insertionSort on ( a[fromIndex+k], a[fromIndex+k+gap], ... ) for (int i = from + k + gap; i < to; i += gap) { int insert_value = a[i]; int j = i; for (; j >= from + k + gap; j -= gap) { if (a[j - gap] > insert_value) { a[j] = a[j - gap]; } else { break; } } if (j != i) { a[j] = insert_value; } } } } }
choice = "shell"Our code permits the user to specify the increments which the algorithms uses. If none are specified, the Hibbard increments are used by default. You can compare the behavior using a different set of increments by creating your own LinkedList of increments. For example, to use Shell's original increments (1, 2, 4, 8, ...), add this code prior to the call:
//------------- add this (and Fix Imports): LinkedList<Integer> shellIncrements = new LinkedList<>(); for (int inc = 1; inc < size; inc = 2 * inc) { shellIncrements.add(inc); } ShowAlgorithms.setShellSortIncrements(shellIncrements); //------------- just before the actual sort is called choices.get(choice).sort(A); }
Empirical Comparisons
The two valuable empirical measures of an sorting algorithm are- how long it takes to execute (real time).
- home many operations, i.e. comparisons and data movements, that it requires (abstract time).
choices.put("select", A -> Algorithms.selectionSort(A)); choices.put("insert", A -> Algorithms.insertionSort(A)); choices.put("shell", A -> Algorithms.shellSort(A)); // choices.put("quick", A -> Algorithms.quickSort(A)); // choices.put("merge", A -> Algorithms.mergeSort(A)); // choices.put("heap", A -> Algorithms.heapSort(A)); // choices.put("java", A -> Arrays.sort(A));
ShellSort Analysis
Stability
Shellsort is not stable. It can be readily demonstrated with an array of size 4 (the smallest possible). Instability is to be expected because the increment-based sorts move elements distances without examining of elements in between.Shellsort has O(n*log(n)) best case time
The best case, like insertion sort, is when the array is already sorted. Then the number of comparisons for each of the increment-based insertion sorts is the length of the array. Therefore we can determine:
comparisons =
n, for 1 sort with elements 1-apart (last step)
+ 3 * n/3, for 3 sorts with elements 3-apart (next-to-last step)
+ 7 * n/7, for 7 sorts with elements 7-apart
+ 15 * n/15, for 15 sorts with elements 15-apart
+ ...
Each term is n. The question is how many terms are there?
The number of terms is the value k such that
n, for 1 sort with elements 1-apart (last step)
+ 3 * n/3, for 3 sorts with elements 3-apart (next-to-last step)
+ 7 * n/7, for 7 sorts with elements 7-apart
+ 15 * n/15, for 15 sorts with elements 15-apart
+ ...
2^{k} - l < nSo k < log(n+1), meaning that the sorting time in the best case is less than n * log(n+1) = O(n*log(n)).
Shellsort worst case time is no worse than quadratic
The argument is similar as previous, but with a different overall computation. In a very worst-case scenario (which doesn't exist), each sort would be quadratic time.
comparisons ≤
n^{2}, for 1 sort with elements 1-apart (last step)
+ 3 * (n/3)^{2}, for 3 sorts with elements 3-apart (next-to-last step)
+ 7 * (n/7)^{2}, for 7 sorts with elements 7-apart
+ 15 * (n/15)^{2}, for 15 sorts with elements 15-apart
+ ...
And so, with a bit of arithmetic, we can see that the
number of comparisons is bounded by:
n^{2}, for 1 sort with elements 1-apart (last step)
+ 3 * (n/3)^{2}, for 3 sorts with elements 3-apart (next-to-last step)
+ 7 * (n/7)^{2}, for 7 sorts with elements 7-apart
+ 15 * (n/15)^{2}, for 15 sorts with elements 15-apart
+ ...
n^{2} * (1 + 1/3 + 1/7 + 1/15 + 1/31 + ...) < n^{2} * (1 + 1/2 + 1/4 + 1/8 + 1/16 + ...) = n^{2} * 2The last step uses the sum of the geometric series.
Shellsort worst and average times
The point about this algorithm is that the initial sorts, acting on elements at larger increments apart involve fewer elements, even considering a worst-case scenario. At these larger increments, "far-away" elements are swapped so that the number of inversions is dramatically reduced. At the later sorts at smaller increments, the behavior then comes closer to optimal.It can be proved that the worst-case time is sub-quadratic at O(n^{3/2}) = O(n^{1.5}). As can be expected, the proof is quite difficult. The textbook remarks that the average case time is unknown although conjectured to be O(n^{5/4}) = O(n^{1.25}). The textbook also mentions other increment sequences which have been studied and seen to produce even better performance.