Quicksort & Mergesort

The Sorting Application

The examples in this document are all part of the Java Application Sorting. Download the source archive Sorting.zip and install it as a Java Application with Existing Sources. See the Using NetBeans document for details.

Like the other projects, the Sorting project has multiple Main Classes intended to illustrate various independent features. The simplest way to run the various classes is to right-click the file, either within the file itself or in its listing in the Projects window, and select Run File.

Java sorting

The quickSort and mergeSort algorithms are both fast, although quickSort is generally considered the faster of the two. The mergeSort has the added advantage of being stable. Java uses variations of both in its standard library: a dual-pivot quickSort for primitive arrays and a mergeSort variant (TimSort) for object arrays. These variations attempt to gain speed in real-world situations where the data typically has many "sequential runs" as opposed to being purely random.
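For example, in recent JDKs (Java 7 and later) the two library entry points can be exercised like this (standard JDK calls, independent of the Sorting project):
  import java.util.Arrays;

  public class LibrarySortDemo {
    public static void main(String[] args) {
      int[] primitives = { 5, 2, 8, 1 };
      Arrays.sort(primitives);          // dual-pivot quicksort (not stable)

      Integer[] objects = { 5, 2, 8, 1 };
      Arrays.sort(objects);             // TimSort, a mergeSort variant (stable)

      System.out.println(Arrays.toString(primitives));  // [1, 2, 5, 8]
      System.out.println(Arrays.toString(objects));     // [1, 2, 5, 8]
    }
  }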

quickSort

The idea behind the quickSort algorithm is to choose, among the array elements, one which will serve as a pivot element used to split the array into two parts:
   0                    i                    n
  [   ... ≤ pivot ...   |   ... ≥ pivot ...  ]
The second phase calls quicksort recursively on the two parts. We must ensure that the i index satisfies 0 < i < n, so that both parts are non-empty; otherwise, quickSort would fall into infinite recursion! The essence of quickSort's speed is to get the partitioning split to be as close as possible to the middle.

Quicksort partitioning algorithm

A key step is how the pivot is chosen. We do so by sorting the elements at three positions:
a[from], a[center], a[to-1]     ( center = (from+to)/2 ) 
The pivot is set to a[center] and the starting indices are as follows:
a: [                                       ]
     ^                                   ^
     i                                   j
   from                                to-1
Having sorted the three key elements, we know before the loop starts that a[i] ≤ pivot and pivot ≤ a[j]. Here is the partitioning algorithm in pseudo-code:
// center = (from + to)/2;  
// sort the 3 elements { a[from], a[center], a[to-1] }
pivot = a[center];
i = from;
j = to-1;
while (true) {
  while (a[++i] < pivot) { }  // move i up and keep going if element is < pivot
 
  while (pivot < a[--j]) { }  // move j down and keep going if element is > pivot
 
  if (i < j) {
    // swap a[i] and a[j]
    int tmp = a[i];
    a[i] = a[j];
    a[j] = tmp;
  } 
  else {
    break;
  }
}
In the quickSort splitting code, the indices i and j control the splitting, and the i index ends up being the split marker. After the partitioning is complete, we call the algorithm recursively on both sides:
quickSort(a, from, i);  // include from, exclude i
quickSort(a, i, to);    // include i, exclude to
The key point of the partitioning which ensures that this actually works is that, after completion:
from < i < to
Without this guarantee, one of the recursive calls would be empty and the other would be on the entire array (possibly altered), leading to infinite recursion. This is the real importance of the seemingly inefficient strict "<" comparison against the pivot element: both scans stop at elements equal to the pivot, which is what guarantees that i ends up strictly between from and to.

Median of 3

The medianOf3 helper function attempts to determine the best possible choice of pivot in constant time. The closer the two parts of the split are to being the same size, the faster the algorithm will be. The medianOf3 function sorts the left, center, and right elements, and the pivot is then chosen to be the value of the center element. In particular, for the common pattern of "nearly sorted" data, medianOf3 is likely to choose a good pivot element.

Call on smaller subarrays first to minimize stack height

In order to keep the recursion stack height minimal, it is important to do the recursive call on the smaller sub-array first. Thus the recursive calls in our final version look like this:
if (i - from <= to - i) {
  quickSort(a, from, i);
  quickSort(a, i, to);
} 
else {
  quickSort(a, i, to);
  quickSort(a, from, i);
}
With this modification (and with the second, tail-position call turned into a loop, since Java does not eliminate tail calls on its own), the extra space usage (i.e., stack depth) is O(log(n)). Without this change, a worst-case scenario (see below) can cause an O(n) stack depth.

For demonstration purposes, we can ignore this code change. It has no effect on the total running time because all recursive calls must be made, regardless of the order in which they are done.

CUTOFF value

The quickSort splitting method, in its usage of the medianOf3 operation, cannot be used for arrays of size 1 or 2. In fact, it turns out to be best to avoid the recursive calls on an array which is "too small." Optimized quickSort implementations impose a CUTOFF value whereby all arrays of size less than or equal to this will be sorted by an alternative method, InsertionSort being a good choice.

The optimal CUTOFF value can be determined empirically by executing a main function that counts operations over a range of possible values; it typically comes out to be around 9 or 10.
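Here is a minimal sketch of such an experiment, using wall-clock time rather than operation counts. The class names CutoffTuning and Sorts are hypothetical; the setQuicksortCutoff setter is the one used in the Demonstrations section below:
  import java.util.Random;

  public class CutoffTuning {
    public static void main(String[] args) {
      Random r = new Random(1234);            // fixed seed: same data every run
      for (int cutoff = 2; cutoff <= 30; cutoff++) {
        ShowAlgorithms.setQuicksortCutoff(cutoff);
        int[] a = new int[500_000];
        for (int i = 0; i < a.length; i++) {
          a[i] = r.nextInt(a.length);
        }
        long start = System.nanoTime();
        Sorts.quickSort(a, 0, a.length);      // hypothetical home of the quickSort below
        System.out.printf("cutoff %2d: %8.2f ms%n",
            cutoff, (System.nanoTime() - start) / 1e6);
      }
    }
  }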

Quicksort Code

My version represents a slight deviation from the textbook's version. My tests indicate that my version is slightly less costly than the textbook's version, which does two extra swaps in order to make the right side have one less element.
quickSort code:
  private static final int QUICKSORT_CUTOFF = 10;
 
  static void quickSort(int[] a, int from, int to) {
    if (to - from <= QUICKSORT_CUTOFF) {
      insertionSort(a, from, to);
    }
    else {
      int i = partition(a, from, to);
      quickSort(a, from, i);
      quickSort(a, i, to);
    }
  }
 
  // quickSort support functions
  private static int medianOf3(int[] a, int from, int to) {
    int center = (from + to) / 2;
    int left = from;
    int right = to - 1;
 
    // do insertionsort on a[left], a[center], a[right]
    if (a[left] > a[center]) {
      int tmp = a[left];
      a[left] = a[center];
      a[center] = tmp;
    }
    // now a[left] <= a[center]
    int insert = a[right];
    if (a[center] > insert) {
      a[right] = a[center];
      if (a[left] > insert) {
        a[center] = a[left];
        a[left] = insert;
      }
      else {
        a[center] = insert;
      }
    }
    return a[center];
  }
 
  private static int partition(int[] a, int from, int to) {
    int pivot = medianOf3(a, from, to);
    int i = from, j = to - 1;
 
    while (true) {
      while (a[++i] < pivot) {
      }
      while (pivot < a[--j]) {
      }
 
      if (i < j) {
        int tmp = a[i];
        a[i] = a[j];
        a[j] = tmp;
      }
      else {              // the indices have crossed: partitioning is done
        break;
      }
    }
    return i;
  }
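The quickSort code above relies on an insertionSort(a, from, to) helper over the half-open range [from, to), which is not shown in this document. A minimal version consistent with that usage would be:
  // sketch of the insertionSort helper assumed by quickSort above;
  // sorts the half-open range [from, to) of a
  private static void insertionSort(int[] a, int from, int to) {
    for (int i = from + 1; i < to; i++) {
      int insert = a[i];
      int j = i;
      while (j > from && a[j - 1] > insert) {
        a[j] = a[j - 1];   // shift larger elements one slot up
        j--;
      }
      a[j] = insert;
    }
  }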

Demonstrations

The ShowSorts program is useful for giving demonstrations of quickSort; simply set:
choice = "quick"
When quickSort is used the cutoff value is shown. You can easily modify the value by using the setter method prior to the last statement:
ShowAlgorithms.setQuicksortCutoff(6);
You can demonstrate the effectiveness of the quickSort splitting method by altering the generated values in the test array; e.g., try these:
  public static int[] getIntArray(int size) {
    ...
    A[i] = 1 + r.nextInt(2 * size);   // random
//    A[i] = 10;                        // all equal
//    A[i] = 10 + i;                    // sorted
//    A[i] = 2*size - i;                // reverse sorted
    ...
  }
Here is an example with 11 elements, so from=0 and to=11. Let the quicksort cutoff be 6, making it so that partitioning is done only once. You can create this run by inserting the overwriting code:
    // choose the array of a certain size
    int[] A;
    int size = 15;
    A = getIntArray(size);
    
    // insert overwriting code
    A = new int[]{1, 4, 6, 8, 5, 7, 10, 4, 1, 6, 6};
    ShowAlgorithms.setQuicksortCutoff(6);
The run starts like this:
=> quickSort(0,11)
 0  1  2  3  4  5   6  7  8  9 10
[1, 4, 6, 8, 5, 7, 10, 4, 1, 6, 6]
 ^              ^               ^
[1, 4, 6, 8, 5, 6, 10, 4, 1, 6, 7] pivot = 6
 _                              _
[1, 4, 6, 8, 5, 6, 10, 4, 1, 6, 7]
       _                     _     i = 2, j = 9
[1, 4, 6, 8, 5, 6, 10, 4, 1, 6, 7]
[1, 4, 6, 8, 5, 6, 10, 4, 1, 6, 7]
          _               _        i = 3, j = 8
[1, 4, 6, 8, 5, 6, 10, 4, 1, 6, 7]
[1, 4, 6, 1, 5, 6, 10, 4, 8, 6, 7]
                _      _           i = 5, j = 7
[1, 4, 6, 1, 5, 6, 10, 4, 8, 6, 7]
[1, 4, 6, 1, 5, 4, 10, 6, 8, 6, 7]
                _  __              i = 6, j = 5
[1, 4, 6, 1, 5, 4, 10, 6, 8, 6, 7]

=> quickSort(0,6) = insertionSort
 0  1  2  3  4  5                
[1, 4, 6, 1, 5, 4]                
[1, 1, 4, 4, 5, 6]                

=> quickSort(6,11) = insertionSort
                    6  7  8  9 10
                  [10, 6, 8, 6, 7]
                  [6, 6, 7, 8, 10]

[1, 1, 4, 4, 5, 6, 6, 6, 7, 8, 10]

Best Time

Quicksort best case time is O(n*log n). The best case occurs when the partitioning always separates the array into two "equal" halves. In this case the timing function (counting comparisons only) would satisfy a recurrence equation like this:
T(n) = n             // compare each element against the pivot 
       + 2 * T(n/2)  // make 2 recursive calls, one on each half
We have solved this type of recurrence equation in the initial part of the course using induction. Unrolling also works: T(n) = n + 2*T(n/2) = 2n + 4*T(n/4) = ... = k*n + 2^k * T(n/2^k), and taking k = log2(n) gives n*log2(n) + n*T(1). Either way, it is direct to prove that:
T(n) = O(n * log(n))

Average Time

The average time of quicksort is also O(n*log n). The math is a bit more involved (see the textbook). An important thing to point out is that the proof assumes the partition split index, i, is equally likely to be any position 1 ≤ i ≤ n-1. This is not true for our optimized quickSort algorithm using the median-of-three pivot. In fact, median-of-three does better: the split will occur closer to the middle on average, although the mathematical analysis becomes quite involved.

Worst Time

QuickSort worst case time is quadratic, i.e., Ω(n²), which occurs when the splitting consistently leaves at most, say, K elements in one portion or the other. In that case the timing would satisfy something like this:
T(n) ≥ n         // compare every element to pivot
       + T(n-K)  // sort the remaining part with at least n-K elements
For the sake of the proof, simply ignore the sort on the "up-to-K" portion. Again, using basic induction techniques, it is not hard to argue that
T(n) = Ω(n²)
Here is an example which you can test in ShowSorts in the manner indicated by the example above:
A = new int[] { 1, 1, 2, 3, 4, 5, 6, 7, 3, 5, 2, 6, 4, 7 };
ShowAlgorithms.setQuicksortCutoff(2);
Sorting this array consistently leaves the 2 maximum equal elements in the right-side split all the way down.

mergeSort

The idea behind mergeSort is a simple application of recursion:
mergeSort(A) {
  if (A.length == 1) return;  // nothing to do
  split A into two equal halves: A = (A1, A2)
  mergeSort(A1)
  mergeSort(A2)
  // then, the merge step, where we actually do something
  // by merging the two sorted halves back into the full array:
  A = merge(A1,A2)
}

MergeSort vs QuickSort recursion

MergeSort exhibits end-order, or bottom-up, recursion in that all the significant actions take place after the recursive calls. This means that the array is split all the way down to singletons (of size 1) and that all the real work takes place on the way out of the recursion.

In contrast, quickSort is exactly the opposite, namely top-down recursion, in the sense that the main action of partitioning is done before the recursive calls.

Analysis

Let T(n) be the worst-case number of comparisons and data movements for an array of size n. For n > 1, T(n) is made up of these counts:
  1. recursive call on the two halves of A: 2 * T(n/2)
  2. merge the halves of A (in a stable manner): O(n)
and conclude that:
T(n) = 2 * T(n/2) + O(n)
The merge operation is like this:
     0   1                 n/2               n
A: [ x1, x2, x3, ... ]   [ y1, y2, y3, ... ]    ⇒ B [ ...     ]
     i                     j                           k
We proceed via a loop, terminating when both the i and j indices have reached the end, i.e.:
if (i == n/2 && j == n) {
  // you're done
}
Proceeding through the loop, we choose which element to copy into the target array like this:
if (A[i] <= A[j]) {
  B[k++] = A[i++];
}
else {               // A[i] > A[j]
  B[k++] = A[j++];
}
There are also boundary conditions which must be tested, as an if/else-if chain, prior to this comparison:
if (i == n/2) {       // only the left-side pointer reached the end
  B[k++] = A[j++];
}
else if (j == n) {    // only the right-side pointer reached the end
  B[k++] = A[i++];
}
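Putting these pieces together, a complete merge of the two sorted halves of A into B might look like this (a sketch using int arrays; the project's generic version appears later):
  // merge the sorted halves A[0 .. n/2) and A[n/2 .. n) into B
  static void merge(int[] A, int[] B) {
    int n = A.length;
    int i = 0, j = n / 2, k = 0;
    while (i < n / 2 || j < n) {
      if (i == n / 2) {            // left side exhausted
        B[k++] = A[j++];
      }
      else if (j == n) {           // right side exhausted
        B[k++] = A[i++];
      }
      else if (A[i] <= A[j]) {     // "<=" keeps the merge stable
        B[k++] = A[i++];
      }
      else {
        B[k++] = A[j++];
      }
    }
  }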
The array "halves" being merged are either of equal size, or the left side has one less element:
n even:  left and right both have n/2 elements
n odd:   left has n/2 and right has (n/2)+1 elements
The fewest comparisons occur if every element in the left side is ≤ every element in the right side, in which case only
n/2
comparisons are needed. The most comparisons occur when the two sides are "interleaved" like this, with n = 6:
[ 1, 3, 5 ]  [ 2, 4, 6 ]
In this case the merge proceeds like this:
[ 1, 3, 5 ]  [ 2, 4, 6 ] 
  *            *
[ 1, 3, 5 ]  [ 2, 4, 6 ]   ⇒  [ 1,
     *         *
[ 1, 3, 5 ]  [ 2, 4, 6 ]   ⇒  [ 1, 2,
     *            * 
[ 1, 3, 5 ]  [ 2, 4, 6 ]   ⇒  [ 1, 2, 3
        *         * 
[ 1, 3, 5 ]  [ 2, 4, 6 ]   ⇒  [ 1, 2, 3, 4
        *            *
[ 1, 3, 5 ]  [ 2, 4, 6 ]   ⇒  [ 1, 2, 3, 4, 5,
           *         *
[ 1, 3, 5 ]  [ 2, 4, 6 ]   ⇒  [ 1, 2, 3, 4, 5, 6 ]
           *            *
All but the last step required a comparison, so the most comparisons required is n-1. We conclude that:
T(n) ≤ 2*T(n/2) + n-1
We have seen this recurrence many times before and know the solution is O(n*log n).

Significance of MergeSort

Thus the significance of mergeSort is that it guarantees O(n*log n) time even in the worst case and that it is stable. The only possible downside of mergeSort is that it requires a spare array in order to do the merge operations.

Efficient usage of merge target

Our first, inefficient approach is to think of B as always being the merge target:
  1. If the size of A is 1, copy A into B.
  2. If the size is greater than 1, call mergeSort recursively to sort left half of A, A1, into the left half of B, B1. Same with the right half A2 into B2.
  3. Copy B into A. A now consists of two sorted halves.
  4. Merge the sorted halves (A1,A2) into B.
At the completion of the algorithm, we would have to copy B back to A.

An efficient implementation of this algorithm does fewer data movements by having A and B switch "roles", where one is the merge source and the other the merge target. In this way we completely avoid the copy in step 3.

In any case, we want the final merge target to be A. As we go into the recursion, the merge target alternates:
A ← B ← A ← B ← ... ← (base of recursion: A or B)
This requirement means that the target at the base case, arrays of size 1, may be either A or B, depending on the number of levels. For example, with 8 elements there are three merge levels, so the base of the recursion targets B and each element must first be copied:
A =  17161514 13121110 (base of recursion)
↓ (copy elements)
B = 17161514 13121110
↓ (merge 1's to 2's)
A =  16171415 12131011 (sorted 2's)
↓ (merge 2's to 4's)
B = 14151617 10111213 (sorted 4's)
↓ (merge 4's to 8's)
A = 10111213 14151617 THE GOAL (sorted 8's)

With 16 elements there are four merge levels, so the base of the recursion targets A and no copying is needed:
A = 25242322 21201918 17161514 13121110 (base of recursion — do nothing)
↓ (merge 1's to 2's)
B = 24252223 20211819 16171415 12131011
↓ (merge 2's to 4's)
A = 22232425 18192021 14151617 10111213
↓ (merge 4's to 8's)
B = 18192021 22232425 10111213 14151617
↓ (merge 8's to 16's)
A = 10111213 14151617 18192021 22232425 THE GOAL

The MergeSort Code

In order to manage the merge target array, a boolean variable merge_to_a is passed into the recursive calls. This boolean indicates what to do at the merge step:
merge_to_a = true:   A is the merge target, B is the source
           = false:  B is the merge target, A is the source
The initial call assigns merge_to_a to true, indicating that we want the final merge to go into the A array. For each recursive step the boolean is negated, toggling between true and false all the way down to the base case of a single element.

When we get to the base case, a singleton array: if merge_to_a is true, i.e., the element must end up in A, then there is nothing to do because it is already there; otherwise the element must be copied from A into B.

Here is the actual code used:
mergeSort code:
  @SuppressWarnings("unchecked")
  static <E> void mergeSort(
      E[] a, int from, int to, Comparator<? super E> c) {
 
    E[] b = (E[]) new Object[a.length];
 
    boolean merge_to_a = true;  // merge direction: final merge => a
 
    mergeSort(a, b, merge_to_a, from, to, c);
  }
 
  // mergeSort support functions
  private static <E> void mergeSort(
      E[] a, E[] b, boolean merge_to_a, int from, int to, 
      Comparator<? super E> c) 
  {
    if (to - from == 0) {
      return;
    }
    if (to - from == 1) {
      if (!merge_to_a) {   // must end up in b
        b[from] = a[from];
      }
      return;
    }
 
    // recursively sort halves merging in the opposite direction
    int middle = (to + from) / 2;
    mergeSort(a, b, !merge_to_a, from, middle, c);
    mergeSort(a, b, !merge_to_a, middle, to, c);
 
    // merge in the direction indicated by merge_to_a
    E[] source, target;
    if (merge_to_a) {
      source = b;
      target = a;
    }
    else {
      source = a;
      target = b;
    }
 
    merge(source, target, from, middle, to, c);
  }
 
  private static <E> void merge(
      E[] source, E[] target, int from, int middle, int to, 
      Comparator<? super E> c) 
  {
    int i = from, j = middle, // indices into the two sorted halves of source
        k = from;             // index into target
 
    while (i < middle || j < to) {
      if (i == middle) {
        target[k++] = source[j++];
      }
      else if (j == to) {
        target[k++] = source[i++];
      }
      else {
        if (c.compare(source[i], source[j]) <= 0) {
          target[k++] = source[i++];
        }
        else {
          target[k++] = source[j++];
        }
      }
    }
  }
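As a usage sketch (assuming the methods above are in scope; MergeSortDemo is a hypothetical class name), a call on an Integer array with the natural ordering might look like this:
  import java.util.Arrays;
  import java.util.Comparator;

  public class MergeSortDemo {
    public static void main(String[] args) {
      Integer[] data = { 5, 2, 8, 1, 9, 3 };
      mergeSort(data, 0, data.length, Comparator.naturalOrder());
      System.out.println(Arrays.toString(data));  // [1, 2, 3, 5, 8, 9]
    }
  }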

Stability

In order to achieve stability, care must be taken in the merge operation shown above.
All equal elements in the sorted halves will already be grouped together in their correct relative order (this is the inductive assumption). If i and j are pointing to equal elements, we ensure that the left-side element, i.e., the one with index i, goes into the target array first:
  if (c.compare(source[i], source[j]) <= 0) {  // not "<"
    target[k++] = source[i++];
  }
  else {
    target[k++] = source[j++];
  }
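A quick way to see stability in action is to sort records that carry a hidden tag (a hypothetical demo; it assumes the mergeSort above is in scope):
  import java.util.Arrays;
  import java.util.Comparator;

  public class StabilityDemo {
    public static void main(String[] args) {
      // the digit is the sort key; the letter records the original order
      String[] items = { "3a", "1a", "2a", "3b", "1b", "2b" };
      Comparator<String> byDigit = Comparator.comparing((String s) -> s.charAt(0));
      mergeSort(items, 0, items.length, byDigit);
      // stable: equal digits keep their a-before-b order
      System.out.println(Arrays.toString(items));  // [1a, 1b, 2a, 2b, 3a, 3b]
    }
  }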

Best/worst case number of comparisons

MergeSort will always perform the same number of data moves for arrays of equal size because the data is simply moved between source and target arrays at each of the "steps" of merge. In terms of comparisons, the best case for a single merge occurs when the largest element in one half is less than or equal to the smallest in the other half, as in this array of size 16:
18192021 22232425 10111213 14151617
10111213 14151617 18192021 22232425
With 8 comparisons, the "smaller" half (of size 8) goes first and then the "larger" half fills out the array with no additional comparisons necessary. Count roughly n/2 comparisons for merging the halves of an array of size n.

The worst case is when the array elements are shuffled like this:
10121416 18202224 11131517 19212325
The shuffled structure forces a comparison between the halves at every merge step except the last, i.e., 15 comparisons for these 16 elements. Count roughly n-1 comparisons to merge the halves of an array of size n, approximately twice as many comparisons as in the best case.

Working our way down with this shuffling strategy, a worst case array of size 16 would be:
10181422 12201624 11191523 13211725
— then proceeding with the sort —
10181422 12201624 11191523 13211725     8 * 1 = 8 comparisons (but no changes)
10141822 12162024 11151923 13172125     4 * 3 = 12 comparisons
10121416 18202224 11131517 19212325     2 * 7 = 14 comparisons
10111213 14151617 18192021 22232425     15 comparisons
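These counts can be checked with a small self-contained experiment; the counting merge sort below is a sketch written for this purpose, not the project's code:
  import java.util.Arrays;

  public class MergeCountDemo {
    static long comparisons = 0;

    // basic top-down merge sort over [from, to), counting comparisons
    static void sort(int[] a, int from, int to) {
      if (to - from <= 1) {
        return;
      }
      int middle = (from + to) / 2;
      sort(a, from, middle);
      sort(a, middle, to);
      int[] merged = new int[to - from];
      int i = from, j = middle, k = 0;
      while (i < middle || j < to) {
        if (i == middle) {
          merged[k++] = a[j++];
        }
        else if (j == to) {
          merged[k++] = a[i++];
        }
        else {
          comparisons++;
          if (a[i] <= a[j]) {
            merged[k++] = a[i++];
          }
          else {
            merged[k++] = a[j++];
          }
        }
      }
      System.arraycopy(merged, 0, a, from, merged.length);
    }

    public static void main(String[] args) {
      int[] worst = { 10, 18, 14, 22, 12, 20, 16, 24,
                      11, 19, 15, 23, 13, 21, 17, 25 };
      sort(worst, 0, worst.length);
      System.out.println(Arrays.toString(worst));
      System.out.println("comparisons = " + comparisons);  // 8 + 12 + 14 + 15 = 49
    }
  }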


© Robert M. Kline