Shellsort & Algorithmic Comparisons

The Sorting Application

The examples in this document are all part of the Java application Sorting. Download the source archive Sorting.zip and install it as a Java Application with Existing Sources. See the Using NetBeans document for details.

Like the other projects, the Sorting project has multiple Main Classes intended to illustrate various independent features. The simplest way to run a class is to right-click its file, either in the editor or in the listing in the Projects window, and select Run File.

Algorithmic Comparisons

Sorting algorithms are a favorite way for one computer scientist to impress another with algorithmic knowledge. First of all, O(n*log(n)) is a lower bound on how quickly any comparison-based sorting algorithm can sort (we can sort faster than that only in certain special cases, e.g., when the keys are known to lie in a small range).
algorithm       stable?  best time     average time    worst time     extra memory
selectionsort   no       O(n^2)        O(n^2)          O(n^2)         O(1)
insertionsort   yes      O(n)          O(n^2)          O(n^2)         O(1)
shellsort       no       O(n*log(n))   O(n^1.25) *     O(n^1.5)       O(1)
quicksort       no       O(n*log(n))   O(n*log(n))     O(n^2)         O(log(n))
mergesort       yes      O(n*log(n))   O(n*log(n))     O(n*log(n))    O(n)
heapsort        no       O(n)          O(n*log(n))     O(n*log(n))    O(1)

* conjectured
Among algorithms that share the same order class, a second consideration is the value of the big-O order constant. Even though it can have quadratic worst-case time, Quicksort is often considered the fastest algorithm on random arrays; the implication is that it has a smaller order constant than, say, Heapsort.

The Shellsort algorithm discussed in this document uses the so-called Hibbard increments (see below). Although Shellsort is not asymptotically as fast as the others, it is still comparable in speed when arrays are not huge and good increment sequences are employed.

Shellsort

Shellsort, named after its inventor, Donald Shell, relies upon the fact that insertion sort does very well if the array is nearly sorted. Another way of saying this is that insertion sort does well if it does not have to move each item "too far". The idea is to repeatedly do insertion sort on all elements at fixed, decreasing distances apart: h_k, h_(k-1), ..., h_1 = 1. The choice of increments turns out to be crucial; a good choice is the following:
h_1 = 1, h_2 = 3, h_3 = 7, ..., h_k = 2^k - 1
These increments are termed the Hibbard increments. The original increments suggested by the algorithm's inventor were simple powers of 2, but the Hibbard increments provably do much better. To be able to use the increment h_k, the array must have size at least h_k + 1.
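For a concrete sense of the sequence, the following minimal sketch (our own helper, not part of the Sorting project) generates the Hibbard increments usable for a given array size; it mirrors the default-increment loop in the shellSort code shown later:

  // Sketch: generate the Hibbard increments 1, 3, 7, ..., 2^k - 1
  // that are usable for an array of the given size (increment < size)
  static java.util.List<Integer> hibbardIncrements(int size) {
    java.util.List<Integer> incs = new java.util.ArrayList<>();
    // the update 2 * (inc + 1) - 1 turns 2^k - 1 into 2^(k+1) - 1
    for (int inc = 1; inc < size; inc = 2 * (inc + 1) - 1) {
      incs.add(inc);
    }
    return incs;
  }

For example, hibbardIncrements(100) yields [1, 3, 7, 15, 31, 63]; the next increment, 127, would exceed the array size.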

Pseudo-code

The pseudo-code for shellSort using the Hibbard increments is as follows:
find the largest k0 such that 2^k0 - 1 < size
for (k = k0; k > 0; --k) {  // from larger increments to smaller
  inc = 2^k - 1
  for (i = 0; i < inc; ++i) {
    insertionSort( [ a[i], a[i+inc], a[i+2*inc], ... ] )
  }
}
The fact that the last increment in the sequence is 1 means that regular insertion sort is done at the last step, and therefore the array is guaranteed to be sorted by this procedure. The point is that when the increments are larger, the subarrays involve fewer elements, and each exchange moves elements farther than an interchange of adjacent elements would. At the last step we do regular insertion sort on an array that is, with luck, "nearly sorted", which brings insertion sort close to its best-case behavior of running in linear time.

The notion that this is a speed improvement seems far-fetched at first. Two enclosing for loops surround the insertion sort, which itself consists of two nested loops, so the algorithm has four levels of nested loops.

Demo

The following is a demo of the sorting process on an array of size 11. With increment 7, only 4 of the subarrays of elements 7 apart contain more than one element.
   0  1  2  3  4  5  6  7  8  9  10
   -  -  -  -  -  -  -  -  -  -  --
  [6, 7, 8, 6, 9, 7, 2, 2, 2, 9, 8]
   ^                    ^          
  [2, 7, 8, 6, 9, 7, 2, 6, 2, 9, 8]
      ^                    ^       
  [2, 2, 8, 6, 9, 7, 2, 6, 7, 9, 8]
         ^                    ^    
  [2, 2, 8, 6, 9, 7, 2, 6, 7, 9, 8]
            ^                    ^ 

  [2, 2, 8, 6, 9, 7, 2, 6, 7, 9, 8]  7's now sorted
   ^        ^        ^        ^      ---- 4 elements 3 apart
  [2, 2, 8, 2, 9, 7, 6, 6, 7, 9, 8]
      ^        ^        ^        ^   ---- 4 elements 3 apart
  [2, 2, 8, 2, 6, 7, 6, 8, 7, 9, 9] 
         ^        ^        ^         ---- 3 elements 3 apart

  [2, 2, 7, 2, 6, 7, 6, 8, 8, 9, 9]  3's now sorted

  [2, 2, 2, 6, 6, 7, 7, 8, 8, 9, 9]  regular insertion sort
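To reproduce this trace yourself, here is a standalone sketch (our own, not project code) that performs one gap pass at a time and prints the array after each gap. It uses the common single-loop formulation of a gap pass rather than the subarray-by-subarray loops of the pseudo-code; the array contents after each full gap pass are the same either way:

  public class ShellSortTrace {
    // one full pass: gap-insertion sort every subarray of elements gap apart
    static void gapPass(int[] a, int gap) {
      for (int i = gap; i < a.length; ++i) {
        int v = a[i];
        int j = i;
        while (j >= gap && a[j - gap] > v) {
          a[j] = a[j - gap];
          j -= gap;
        }
        a[j] = v;
      }
    }

    public static void main(String[] args) {
      int[] a = {6, 7, 8, 6, 9, 7, 2, 2, 2, 9, 8};
      for (int gap : new int[]{7, 3, 1}) {
        gapPass(a, gap);
        System.out.println("gap " + gap + ": " + java.util.Arrays.toString(a));
      }
    }
  }

The three lines printed match the "7's now sorted", "3's now sorted", and final sorted states shown above.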

Java ShellSort Code

Here is the code:
  // permit us to specify the shell sort increments externally
  static java.util.LinkedList<Integer> increments = null;
 
  static void shellSort(int[] a, int from, int to) {
    int size = to - from;
 
    if (increments == null) {
      // set increments to the default
      increments = new java.util.LinkedList<>();
      for (int inc = 1; inc < size; inc = 2 * (inc + 1) - 1) {
        increments.add(inc);
      }
    }
    java.util.Iterator<Integer> iter = increments.descendingIterator();
 
    while (iter.hasNext()) {
      int gap = iter.next();
      for (int k = 0; k < gap; ++k) {
        // insertionSort on ( a[from+k], a[from+k+gap], a[from+k+2*gap], ... )
        for (int i = from + k + gap; i < to; i += gap) {
          int insert_value = a[i];
          int j = i;
          for (; j >= from + k + gap; j -= gap) {
 
            if (a[j - gap] > insert_value) {
              a[j] = a[j - gap];
            }
            else {
              break;
            }
          }
          if (j != i) {
            a[j] = insert_value;
          }
        }
      }
    }
  }
To get a sense of how it works, test it in ShowSorts by setting
choice = "shell"
Our code permits the user to specify the increments which the algorithm uses. If none are specified, the Hibbard increments are used by default. You can compare the behavior using a different set of increments by creating your own LinkedList of increments. For example, to use Shell's original increments (1, 2, 4, 8, ...), add this code prior to the call:
// add this:
java.util.LinkedList<Integer> shellIncrements 
        = new java.util.LinkedList<>();
for (int inc = 1; inc < size; inc = 2 * inc) {
   shellIncrements.add(inc);
}
ShowAlgorithms.setShellSortIncrements(shellIncrements);
 
// just before the actual sort
    choices.get(choice).sort(A);
  }

Empirical Comparisons

The most telling empirical measure of a sorting algorithm is its running time. There are two timing test programs:
demo.TimeIntSorts  
and
demo.TimeGenericSorts  
You can compare against any of 7 algorithms, although you will see a striking difference between the first two (slow) sorts and the remaining 5.
    choices.put("select", A -> Algorithms.selectionSort(A));
    choices.put("insert", A -> Algorithms.insertionSort(A));
    choices.put("shell", A -> Algorithms.shellSort(A));
//    choices.put("quick", A -> Algorithms.quickSort(A));
//    choices.put("merge", A -> Algorithms.mergeSort(A));
//    choices.put("heap", A -> Algorithms.heapSort(A));
//    choices.put("java", A -> Arrays.sort(A));

ShellSort Analysis

Stability

Shellsort is not stable. This can readily be demonstrated with an array of size 4 (the smallest possible size, since smaller arrays use only the increment 1, i.e., plain insertion sort). Instability is to be expected, because the increment-based sorts move elements across distances without examining the elements in between.
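As a concrete illustration, the following standalone sketch (our own, not project code; it tags equal keys 'a' and 'b' to make their relative order visible, and uses a Java 16+ record) runs the two Hibbard gap passes for size 4 and shows equal keys trading places:

  public class ShellUnstable {
    // a key plus a tag so we can track the original order of equal keys
    record Item(int key, char tag) {
      public String toString() { return key + "" + tag; }
    }

    // one gap-insertion pass, comparing keys only
    static void gapPass(Item[] a, int gap) {
      for (int i = gap; i < a.length; ++i) {
        Item v = a[i];
        int j = i;
        while (j >= gap && a[j - gap].key() > v.key()) {
          a[j] = a[j - gap];
          j -= gap;
        }
        a[j] = v;
      }
    }

    public static void main(String[] args) {
      Item[] a = { new Item(2, 'a'), new Item(1, 'x'),
                   new Item(2, 'b'), new Item(0, 'y') };
      gapPass(a, 3);  // moves 2a past 2b without ever comparing them
      gapPass(a, 1);  // the final ordinary insertion sort leaves them as is
      System.out.println(java.util.Arrays.toString(a));
      // prints [0y, 1x, 2b, 2a]: 2b now precedes 2a, so the sort is not stable
    }
  }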

Shellsort has O(n*log(n)) best case time

The best case, as with insertion sort, is when the array is already sorted. In that case each round of increment-based insertion sorts makes, in total, about as many comparisons as the length of the array. Therefore we can determine:
comparisons =
   n, for 1 sort with elements 1-apart (last step)
+ 3 * n/3, for 3 sorts with elements 3-apart (next-to-last step)
+ 7 * n/7, for 7 sorts with elements 7-apart
+ 15 * n/15, for 15 sorts with elements 15-apart
+ ...
Each term is n. The question is how many terms there are. The number of terms is the largest k such that
2^k - 1 < n
So k < log2(n + 1), meaning that the sorting time in the best case is less than n * log2(n + 1) = O(n*log(n)).
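One can check this bound empirically: instrument the gap passes with a comparison counter and run on an already-sorted array. Each pass makes slightly fewer than n comparisons, so the total stays below n * log2(n). A rough sketch (our own instrumentation, not project code):

  public class BestCaseCount {
    static long comparisons = 0;

    static void gapPass(int[] a, int gap) {
      for (int i = gap; i < a.length; ++i) {
        int v = a[i];
        int j = i;
        while (j >= gap) {
          ++comparisons;
          if (a[j - gap] > v) { a[j] = a[j - gap]; j -= gap; }
          else break;
        }
        a[j] = v;
      }
    }

    public static void main(String[] args) {
      int n = 1 << 20;                       // n = 2^20, about a million
      int[] a = new int[n];
      for (int i = 0; i < n; ++i) a[i] = i;  // already sorted
      // Hibbard gaps 2^20 - 1 down to 1
      for (int gap = n - 1; gap >= 1; gap = (gap + 1) / 2 - 1) {
        gapPass(a, gap);
      }
      System.out.println("n*log2(n)   = " + (long) n * 20);
      System.out.println("comparisons = " + comparisons);
    }
  }

With a sorted input, the while loop exits after a single comparison for every i, so each of the 20 passes costs just under n comparisons.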

Shellsort worst case time is no worse than quadratic

The argument is similar to the previous one, but with a different overall computation.
comparisons ≤
   n^2, for 1 sort with elements 1-apart (last step)
+ 3 * (n/3)^2, for 3 sorts with elements 3-apart (next-to-last step)
+ 7 * (n/7)^2, for 7 sorts with elements 7-apart
+ 15 * (n/15)^2, for 15 sorts with elements 15-apart
+ ...
And so, with a bit of arithmetic, we can see that the number of comparisons is bounded by:
  n^2 * (1 + 1/3 + 1/7 + 1/15 + 1/31 + ...)
< n^2 * (1 + 1/2 + 1/4 + 1/8 + 1/16 + ...)
= n^2 * 2
The last step uses the sum of the geometric series.
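Spelled out, the two facts used in the last two lines are the term-by-term comparison and the geometric series sum:

\[
\frac{1}{2^{k+1} - 1} < \frac{1}{2^k} \quad (k \ge 1),
\qquad
\sum_{k=0}^{\infty} \frac{1}{2^k} \;=\; \frac{1}{1 - 1/2} \;=\; 2 .
\]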

Shellsort worst and average times

The point about this algorithm is that the initial sorts, acting on elements at larger increments apart, involve fewer elements, even considering a worst-case scenario. At these larger increments, "far-away" elements are swapped so that the number of inversions is dramatically reduced. The later sorts at smaller increments then come closer to the optimal, nearly-sorted behavior.

It can be proved that the worst-case time is sub-quadratic at O(n^(3/2)) = O(n^1.5). As might be expected, the proof is quite difficult. The textbook remarks that the average case time is unknown, although conjectured to be O(n^(5/4)) = O(n^1.25). The textbook also mentions other increment sequences which have been studied and shown to produce even better performance.


© Robert M. Kline