QuickSort + MergeSort
— print (last updated: Sep 16, 2009) print

Select font size:

Discussion

The new sorting algorithms explored here are mergesort and quicksort. They are of the "fastest" variety in the sense that they ave worst-case and/or average-case time O(n*log n). It can be proved that a comparison-based sorting algorithm cannot do better than that in the average case. The other well know "fast" algorithm discussed in the textbook is heapsort. The things which differentiate these fast algorithms are Although shell sort is not included in this theoretically fastest group, it is still comparable in speed when its enhanced versions are used. Here is a rundown of the qualities of each:
  1. shellSort: stable, worst case is O(n1+f) where f is a small fraction for optimal increment sequences
  2. mergeSort: stable, O(n*log n) in worst case, requires O(n) extra memory.
  3. quickSort: unstable, O(n*log n) on the average, worst case is quadratic (which are convoluted element sequences unlikely to occur in practice); it is generally considered the "fastest" sorting technique
  4. heapsort: unstable, O(n*log n) in worst case.
Get demos of each using the mainShowSorts function by simply setting the which variable to the respective value and setting an appropriate size.

Both timing and operation-counting comparisons between these algorithms are useful. The mainTimeSorts and mainCountSorts test functions are already set up to do this comparison. Simply comment out the top section defining the current Sorts (as well as NUM and NUM_TRIALS in the former case) and uncomment out the section below that with new values.

QuickSort

The idea behind the quick sort algorithm is to choose, among the array elements, one which will serve as a pivot element used to split the array into two parts:
i
pivot pivot
0 n
The second phase calls quicksort recursively on the two parts. We must ensure that the i index satisfies 0 < i < n, so that both parts are non-empty; otherwise, quicksort could fall into infinite recursion!

The code is below. It represents a slight deviation from the textbook's code, but my tests indicate that it is slightly less costly than the textbook's version which does two extra swaps in order to make the right side have one less element.

Here is the code: ( click to show )

In the quickSort splitting code, the indices i and j control the splitting. We ensure that, prior to the recursive calls, The main loop moves i up and j down in a way that maintains this property. An important point is that i and j are not moved when their respective elements are equal to the pivot. This seems inefficient; however, in reality, it makes it more likely that the splitting will be closer to optimal (equal halves).

Median of 3

The usage of the medianOf3 helper function attempts to determine the best possible choice of pivot in constant time. The closer these two parts are to being the same size, the faster the algorithm will be. The medianOf3 function sorts the left, center, and right elements and then pivot is chosen to be the value of the center element. In particular, in a common order pattern of being "nearly sorted," medianOf3 is likely to choose a good pivot element.

CUTOFF value

The quicksort splitting method, in its usage of the medianOf3 operation, cannot be used for arrays of size 2. It turns out to be best to avoid the recursive calls down to an array which is "too small." QuickSort imposes a "cutoff" value whereby all arrays of size less than or equal to this cutoff will be sorted by an alternative method, insertionSort being a good choice. The optimal cutoff value can be empirically determined to be 9 or 10 by executing a main function which does counting for a range of cutoff values.

Here is the code: ( click to show )

Demonstrations

The mainShowSorts function is useful for giving demonstrations of quickSort. The one algorithmic difference to keep in mind is that the quickSort algorithm in show.Algorithms uses a default cutoff value of 5 for demonstration purposes, whereas the count.Algorithms version uses the optimal cutoff value of 10.

You can demonstrate of the effectiveness of the quickSort's splitting method by altering the generated values in the test array, e.g., try these:
size = 20;
for (int i = 0; i < A.length; ++i) {
   A[i] = 10 + r.nextInt(3 * size);      // random
//or,
   A[i] = 10 + i;                        // sorted
//or,
   A[i] = 10 + size - i;                 // reverse sorted
//or
   A[i] = 10;                            // all equal
}

Timing

Quicksort's best case time is time O(n*log n). The best case occurs when the split separates the array into two "equal" halves. In this case the timing function would satisfy a recurrence equation like this:
T(n) ≤ 2 * T(n/2) + C * n    // C * n = time to split
We can make a quick argument about this result following the book's lead:
T(n)/n ≤ 2 * T(n/2)/n + C
       = T(n/2) / (n/2)  +  C
       ≤ T(n/4) / (n/4)  +  C  +  C
       ...
       ≤ T(1) / 1  +  C  +  ... +  C      ( C repeated log(n) times )
       ≤ T(1) + C * log(n)
Thus
T(n) ≤ T(1) * n  +  C * n * log(n) = O(n*log n)

Average

The average time of quicksort is also O(n*log n). The math is a bit more involved (see the textbook). One thing to point out is that the assumption that the book is making is that the split index, i, is equally likely to be any position 0 < i < n. This is definitely not true in the sense that it "favors" the average behavior. By virtue of the choice of splits based on median-of-three, the split index is much more likely to occur towards the middle. Nevertheless, the mathematical analysis becomes even more complicated if one takes this into consideration.

Worst

Quicksort's worst case time is quadratic, i.e., Ω(n*log n), which occurs when the splitting consistently leaves up to, say, K elements in one portion or the other. In that case the timing would satisfy something like this:
T(n) ≥ T(n-K) + n   // the "n" term means split must "look at" every element
It turns out that a "worst-case" array is quite convoluted. Here is one example which you can plug in and test:
show.Algorithms.setQuicksortCutoff(2);
A = new Integer[] { 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 
                    2, 5, 1, 6, 3, 7, 0, 8, 4, 9
};
Sorting with this array consistently leaves two equal elements in the right-side split all the way down to the cutoff.

Another example which is simpler to generalize to a family of arrays of varying sizes (n = 32 in this case) is this:
A = new Integer[] { 
   10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25,
   10, 18, 10, 19, 10, 20, 10, 21, 10, 22, 10, 23, 10, 24, 10, 25
};
Again, sorting with this array consistently leaves two equal elements in the right side, but, in this case, only until the array is ½ its original size. Doing the math,
T(m) ≥ T(m-2) + m, m ≥ n/2, 
and we prove our claim as follows:
T(n) ≥ T(n-2) + n
     ≥ T(n-4) + (n-2) + n
     ≥ T(n/2-2) + n/2 + ... + (n-2) + n
     ≥ n/2 + ... + (n-2) + n            (at least n/4 terms, all ≥ n/2)
     ≥ n/2 + n/2 + ... + n/2            (n/4 times)
     = n2/8 

The point to make is that these worst case arrays are fully "cooked" in the sense that they are not representative of some common pattern and that most random deviations from these examples will not cause worst-case behavior. The significance is that quickSort is very unlikely to ever achieve its worst-case behavior in practice.

MergeSort

As mentioned above, the significance of mergeSort is that: Although not as fast as quickSort, these are two key selling points for using this algorithm. The one detraction is that it requires a spare array for the merge operations; for a huge array, this may be considered too costly.

Our first approach, which we will soon refine, is to think about sort the source array A "into" the target array B as follows:
  1. If the size is 0 there is nothing to do. If the size is 1, copy A into B.
  2. If the size is greater than 1, consider A and B as two halves: (A1,A2) and (B1,B2), respectively. Recursively mergesort A1 into B1 and mergesort A2 into B2
  3. Copy B into A.
  4. Merge the sorted halves (A1,A2) into B.
Letting T(n) be the worst-case number of comparisons and data movements for an array (of size n), we see that T(1) = 1, and that for n > 1, T(n) is made up of these counts:
  1. mergesort two halves of A into two halves of B: 2 * T(n/2)
  2. copy B into A: n
  3. merge the halves of A into B: O(n)
We conclude that:
T(1) = 1
T(n) = 2 * T(n/2) + O(n)
We've seen this recurrence many times before and know the solution is O(n*log n).

Examples

A practical implementation of this algorithm does fewer data movements than indicated above. The sketch above suggests that B is always the "merge target", which requires the step (b) of copying back to A before merging.

In fact, you want to make A and B reverse roles in the recursive calls so as to avoid the copy step (b). In any case, we want the final target of the merge to end up in the A array. This requirement means that when we get to the base case with singleton arrays, we either


A =  17161514 13121110 (base of recursion)
(copy elements)
B = 17161514 13121110
(merge 1's to 2's)
A =  16171415 12131011
(merge 2's to 4's)
B = 14151617 10111213
(merge 4's to 8's)
A = 10111213 14151617


A = 252423222120 1918 1716 151413121110 (base of recursion
— do nothing)
(merge 1's to 2's)
B = 24252223 20211819 16171415 12131011
(merge 2's to 4's)
A = 22232425 18192021 14151617 10111213
(merge 4's to 8's)
B = 18192021 22232425 10111213 14151617
(merge 8's to 16's)
A = 10111213 14151617 18192021 22232425

The Code

In order to effect the desired merge order, we pass a boolean variable target_is_a which indicates what to do at the merge step:
target_is_a = true:   A is the merge target
            = false;  B is the merge target
The initial call sets target_is_a to true, indicating that we want the final merge to go into to the A array. For each recursive step the boolean variable is negated, toggling between true and false all the way down to the base case of a single element.

When we get to the base case, a singleton array, if target_is_a is true, i.e., the element must end up in A and there is nothing to do because it is already there; otherwise the element must be copied into B.

Here is the code: ( click to show )

Stability

In order to achieve stability, care must be taken in the element maintain of the merge operation;
 while (i < middle || j < toIndex) {
   if (i == middle) {                // done with left side
     to[k++] = from[j++];
   } 
   else if (j == toIndex) {          // done with right side
     to[k++] = from[i++];
   } 
   else {
     if (c.compare(from[i],from[j]) <= 0) { // "<=" gives stability
       to[k++] = from[i++];
     } else {
       to[k++] = from[j++];
     }
   }
 }
The left side of the merge is indexed by i and the right side by j. Any equal elements in the sorted halves will be grouped together in correct order (this is an inductive assumption). If i and j are pointing to equal elements, we ensure that the left-side element, i.e., the one with index i, goes into the target array first:
  if (c.compare(from[i],from[j]) <= 0) {  // NOT: < 0 
    to[k++] = from[i++];


© Robert M. Kline