The new sorting algorithms
explored here are mergesort and quicksort.
They are of the
"fastest" variety in the sense that they
ave worst-case and/or average-case
time O(n*log n).
It can be proved that a comparison-based
sorting algorithm cannot do better than that in the average case.
The other well know "fast" algorithm discussed in the textbook is heapsort.
The things which
differentiate these fast algorithms are
what are their "order constants"
what are the worst-case behavior scenarios
Although shell sort is not included in this theoretically
fastest group, it is still comparable in speed when
its enhanced versions are used.
Here is a rundown of the qualities of each:
shellSort: stable, worst case
is O(n1+f)
where f is a small fraction for optimal increment sequences
mergeSort: stable, O(n*log n) in
worst case, requires O(n) extra memory.
quickSort: unstable, O(n*log n)
on the average, worst case is quadratic (which are convoluted element
sequences unlikely to occur in practice);
it is generally considered the "fastest" sorting technique
heapsort: unstable, O(n*log n) in
worst case.
Get demos of
each using the
mainShowSorts
function
by simply setting the which variable to
the respective value and setting an appropriate size.
Both timing and operation-counting comparisons
between these algorithms are useful.
The
mainTimeSorts
and
mainCountSorts
test functions are already set up to do this comparison.
Simply comment out the top section defining the current
Sorts
(as well as NUM and NUM_TRIALS in the former case)
and uncomment out the section below that with new values.
QuickSort
The idea behind the quick sort algorithm
is to choose, among the array elements, one which
will serve as a pivot element used to split
the array into two parts:
i
≤ pivot
≥ pivot
0
n
The second phase calls quicksort recursively on the two parts.
We must ensure that the i index satisfies 0 < i < n,
so that both parts are non-empty; otherwise,
quicksort could fall into infinite recursion!
The essence of quicksort's speed is to get the partitioning split to be as close as possible
to the middle.
Quicksort partitioning algorithm
sort the 3 elements { a[from], a[center], a[to-1] },
where center = (from + (to-1))/2
pivot = a[center]
i = from
j = to-1
while (true) {
while (a[++i] < pivot) { } // move i up and keep going if element is < pivot
while (pivot < a[--j]) { } // move j down and keep going if element is > pivot
if (i < j) {
swap(a, i, j);
} else {
break;
}
}
In the quickSort splitting code,
the indices i and j control the
splitting. We ensure that, prior to the recursive calls,
for k < i: a[k] ≤ pivot
for k ≥ i: a[k] ≥ pivot
The main loop moves i up and j down in a way that maintains
this property.
An important point is that
i and j are not moved when their respective
elements are equal to the pivot. This seems inefficient; however,
in reality, it makes it more likely that the splitting will be closer
to optimal (equal halves).
Here is the actual code:
( click to show )
My version represents a slight deviation from the
textbook's, but my tests indicate that it is slightly
less costly than the textbook's version which does two
extra swaps in order to make the right side have one less element.
Median of 3
The usage of the medianOf3 helper function attempts to determine
the best possible choice of pivot in constant time.
The closer these two parts are to being the same size, the
faster the algorithm will be.
The medianOf3 function sorts the left, center, and right elements
and then pivot is chosen to be the value of the center element.
In particular, in a common order pattern of being "nearly sorted,"
medianOf3 is likely to choose a good pivot element.
CUTOFF value
The quicksort splitting method,
in its usage of the medianOf3 operation,
cannot be used for arrays of size 2.
It turns out to be best to avoid the recursive calls down to an array
which is "too small."
QuickSort imposes a "cutoff" value whereby all arrays of size less than
or equal to this cutoff will be sorted by an alternative method,
insertionSort being a good choice.
The optimal cutoff value can be empirically determined to be 9 or 10
by executing a main function which does counting for a range of
cutoff values.
Here is the code:
( click to show )
Demonstrations
The mainShowSorts function is useful for giving demonstrations
of quickSort; simply set:
which = show.Algorithms.QUICK
The one algorithmic difference to keep in mind is that the
quickSort algorithm in show.Algorithms
uses a default cutoff value of 5 for demonstration purposes, whereas the
count.Algorithms version uses the optimal cutoff value of 10.
You can demonstrate of the effectiveness of the quickSort's splitting method
by altering the generated values in the test array, e.g., try these:
size = 20;
for (int i = 0; i < A.length; ++i) {
A[i] = 10 + r.nextInt(3 * size); // random
//or,
A[i] = 10 + i; // sorted
//or,
A[i] = 10 + size - i; // reverse sorted
//or
A[i] = 10; // all equal
}
Timing
Quicksort's best case time is time
O(n*log n).
The best case occurs when the split
separates the array into two "equal" halves. In this case the timing
function would satisfy a recurrence equation like this:
T(n) ≤ 2 * T(n/2) + C * n // C * n = time to split
We can make a quick argument about this result following the book's lead:
T(n)/n ≤ 2 * T(n/2)/n + C
= T(n/2) / (n/2) + C ( make the 2 factor divide the divisor )
≤ T(n/4) / (n/4) + C + C ( replace n by n/2 in previous )
...
≤ T(1) / 1 + C + ... + C ( C repeated log(n) times )
T(n)/n ≤ T(1) + C * log(n)
Thus
T(n) ≤ T(1) * n + C * n * log(n) = O(n*log n)
Average
The average time of quicksort is also O(n*log n). The math
is a bit more involved (see the textbook). One important thing to point out is that the assumption
in the book is that the split index, i, is equally likely to be any position
0 < i < n. This is definitely not true for the quickSort algorithm above
the pivot based on median-of-three, makes it far more likely that the split occur
towards the middle.
Nevertheless, the mathematical analysis becomes very involved if one takes this
into consideration.
Worst
Quicksort's worst case time is quadratic,
i.e., Ω(n2),
which occurs when the
splitting consistently leaves up to, say, K elements in
one portion or the other.
In that case the timing would satisfy something like this:
T(n) ≥ T(n-K) + n // the "n" term means split must "look at" every element
It turns out that a "worst-case" array is quite convoluted.
Here is one example which you can plug in and test:
Sorting with this array consistently leaves
two equal elements in the right-side split all the way down to the cutoff.
Another example which is simpler to generalize to a family of
arrays of varying sizes (n = 32 in this case) is this:
Again, sorting with this array consistently
leaves two equal elements in the right side, but, in this case,
only until the array is ½ its original size. Doing the math,
T(m) ≥ T(m-2) + m, m ≥ n/2,
and we prove our claim as follows:
T(n) ≥ T(n-2) + n
≥ T(n-4) + (n-2) + n
≥ T(n/2-2) + n/2 + ... + (n-2) + n
≥ n/2 + ... + (n-2) + n (at least n/4 terms, all ≥ n/2)
≥ n/2 + n/2 + ... + n/2 (n/4 times)
= n2/8
The point to make is that these worst case arrays are fully
"cooked" in the sense that
they are not representative
of some common pattern (nearly sorted, almost all equal)
most random deviations
from these examples will not cause worst-case behavior.
The significance is that quickSort is very
unlikely to ever achieve its worst-case behavior in practice.
MergeSort
The idea behind mergeSort is a pretty simple application of recursion:
mergeSort(A) {
if (A.length == 1) return; // nothing to do
split A into two halves: A1, A2
mergeSort(A1)
mergeSort(A2)
// the merge step, where we actually doing something
merge the sorted halves A1 and A2 back into A
}
MergeSort vs QuickSort recursion
MergeSort exhibits end order
or bottom up
recursion in that all the
significant actions take place after the recursive calls.
This means that the array is split all the way down to singletons
(size 1) and that all the actions take place on the way out.
QuickSort is exactly the opposite, it exhibits top down recursion
in that the main action of partitioning is done before the recursive
calls.
Analysis
Let T(n) be the worst-case number of comparisons and
data movements for an array (of size n), we see that for
n > 1, T(n) is made up of these counts:
recursive call on the two halves of A: 2 * T(n/2)
merge the halves of
A (in a stable manner): O(n)
and conclude that:
T(n) = 2 * T(n/2) + O(n)
We've seen this recurrence many times before and know the solution is
O(n*log n).
Thus the significance of mergeSort is that:
its worst case time is
O(n*log n).
it is stable
The one detraction is that it requires a spare array in order
to do the merge operations; for a huge array,
this may be considered too costly.
Our first inefficient approach is think of B as always
being the merge target.
If the size of A is 1,
copy A into B.
If the size is greater than 1,
call mergeSort
recursively to sort left half of A, A1,
into the left half of B, B1. Same with the right
half A2 into B2.
Copy B into A. A now consists
of two sorted halves.
Merge the sorted halves (A1,A2) into B.
At the completion of the algorithm, we would have to copy
B back to A.
Examples
A practical implementation of this algorithm does fewer data
movements than indicated above
by having
A and
B switch "roles" where one is the merge source and
the other the merge target.
In this way we completely avoid the copy step (b).
In any case, we want the final
merge target to be A.
As we go "down into the recursion," the merge target alternates:
A B A B ... (base of recursion: A or B)
This requirement means that when we get to the base case
with singleton arrays:
if B is the merge target: copy from A to B.
if A is the merge target: do nothing, the data is already in
A.
A =
17
16
15
14
13
12
11
10
(base of recursion)
(copy elements)
B =
17
16
15
14
13
12
11
10
(merge 1's to 2's)
A =
16
17
14
15
12
13
10
11
(merge 2's to 4's)
B =
14
15
16
17
10
11
12
13
(merge 4's to 8's)
A =
10
11
12
13
14
15
16
17
THE GOAL
A =
25
24
23
22
21
20
19
18
17
16
15
14
13
12
11
10
(base of recursion do nothing)
(merge 1's to 2's)
B =
24
25
22
23
20
21
18
19
16
17
14
15
12
13
10
11
(merge 2's to 4's)
A =
22
23
24
25
18
19
20
21
14
15
16
17
10
11
12
13
(merge 4's to 8's)
B =
18
19
20
21
22
23
24
25
10
11
12
13
14
15
16
17
(merge 8's to 16's)
A =
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
THE GOAL
The MergeSort Code
In order to manage the merge target array,
a
boolean variable target_is_a is passed into
the recursive calls. This boolean indicates what to do
at the merge step:
target_is_a = true: A is the merge target
= false; B is the merge target
The initial call assigns target_is_a to true,
indicating that we want the final merge to go into to the
A array.
For each recursive step the boolean is negated,
toggling between true and false
all the way down to the base case of a single element.
When we get to the base case, a singleton array,
if target_is_a is true, i.e., the element must end up in
A, then there is nothing to do because it is already there;
otherwise the element must be copied from B into B.
Here is the code:
( click to show )
Stability
In order to achieve stability,
care must be taken in the merge operation;
while (i < middle || j < toIndex) {
if (i == middle) { // done with left side
to[k++] = from[j++];
}
else if (j == toIndex) { // done with right side
to[k++] = from[i++];
}
else {
if (c.compare(from[i],from[j]) <= 0) { // "<=" gives stability
to[k++] = from[i++];
} else {
to[k++] = from[j++];
}
}
}
The left side of the merge is indexed by i and
the right side by j. Any equal elements
in the sorted halves will be grouped together in correct order
(this is an inductive assumption). If
i and j are pointing to equal elements,
we ensure that the left-side element, i.e., the one with index i,
goes into the target array first:
MergeSort will always have the same number of data moves for
arrays of equal sizes because the data is simply being
moved between source and target arrays for each of the "steps" of merge.
In terms of comparisons, the best case
for a single merge is when the largest in one half is less
than the smallest in the other half, like this
array of size 16:
18
19
20
21
22
23
24
25
10
11
12
13
14
15
16
17
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
With 8 comparisons, the "smaller" half (of size 8) goes
first and then "larger" half fills out the array
with no extra comparisons.
We would count roughly n/2 comparisons for merging the
halves of an array of size n
The worst case is when the array elements are shuffled like this:
10
12
14
16
18
20
22
24
11
13
15
17
19
21
23
25
The shuffled structure will force comparisons between halves
for each of the 15 pairs (n,n+1).
Roughly count n-1 comparisons for an array of size n,
approximately twice as many as the best case.
Working our way down with this shuffling strategy,
a worst case array of size 16 would be: