Priority Queues + HeapSort
— print (last updated: Sep 16, 2009) print

Select font size:

Priority Queues

A priority queue is like a queue in that it supports add and remove from "opposite sides" with operations: The usual FIFO queue is like a list in that add operations go on one end while element and remove are from the opposite end. In contrast, a priority queue is one with no particular structure; the requirement is that element and remove always use a smallest element with respect to the given order.

By using a suitable comparator, we can make easily change the notion of "smallest" by "largest". Ultimately the notion of the priority queue is quite general in that a FIFO queue can be seen a priority queue with comparison of elements "by time stamp on entering the queue."

Java Priority Queue

The Java priority queue is the appropriately named generic class PriorityQueue implementing the Queue interface. The key constructors are these: The initialCapacity parameter suggests that the underlying implementation is array-based. We will see such an implementation below, the standard implementation called the Binary Heap.

Heaps

A heap, or more strictly speaking, a min heap, is an implementation of a priority queue in the form of a heirarchical tree-like structure where, at each node, the element is smaller than or equal to the elements at the child nodes. In particular, the root must be holding the least element. A max heap is one in which the notion of ordering of elements is reversed. In practice, a max heap is simply a min heap in which we use the "reverse comparator."

Test Programs/NetBeans

The project we're using is PrioQueueDemo. The following test program illustrates several standard usages of Java's PriorityQueue class.
package prioqueuedemo;

import java.util.*;
//import heap.*;

public class Main {

  public static void main(String[] args) {
    mainBasic(args);
  }

  public static void mainBasic(String[] args) {

    Comparator<Integer> rev_cmp = new Comparator<Integer>() {
      @Override
      public int compare(Integer lhs, Integer rhs) {
        return rhs.compareTo(lhs);
      }
    };

    Queue<Integer> queue = new PriorityQueue<Integer>();
    Queue<Integer> queue_rev = new PriorityQueue<Integer>(20, rev_cmp);

    //Queue<Integer> queue = new BinaryHeap<Integer>();
    //Queue<Integer> queue_rev = new BinaryHeap<Integer>(20, rev_cmp);

    Random r = new Random();

    for (int i = 0; i < 10; ++i) {
      int x = r.nextInt(100);
      System.out.println("---- add(" + x + ")");
      queue.add(x);
      queue_rev.add(x);
    }

    System.out.println("queue:     " + queue);
    System.out.println("queue_rev: " + queue_rev);

    System.out.println("removing from queue:");
    while (!queue.isEmpty()) {
      System.out.println("---- remove = " + queue.remove());
    }

    System.out.println("removing from rev_queue:");
    while (!queue_rev.isEmpty()) {
      System.out.println("---- remove = " + queue_rev.remove());
    }
  }
}

Note the output of these statements:
System.out.println("queue:     " + queue);
System.out.println("queue_rev: " + queue_rev);
It may not make sense what this output reflects, but it actually reflects the underlying structure which we'll soon see.

BinaryHeap

Like our other user-defined data structures, extensions of the QueueAdapter class below give us the ability to use our user-defined BinaryHeap class as we would a PriorityQueue.

Create the class:
Class Name: QueueAdapter
package:    queue
Then insert the following content ( click to show )

Description

A Binary Heap is a heap which is maintained as a complete binary tree, meaning all levels are full except possibly the last level, which is filled out from the left. A binary heap is most easily implemented by an array. Suppose we use the array data in positions 1 through size, then A node n != 1 can always determine its parent node by the operation:
parent = n/2
Saying that a complete binary tree satisfies the heap property means that at each node n,
data[n/2] ≤ data[n]          (n != 1)
or, looking at it the other way:
data[n] ≤ data[2*n]          (when data[2*n] exists)
data[n] ≤ data[2*n+1]        (when data[2*n+1] exists)
Here is an example:
The coding issue involved in using a complete binary tree as a heap is in maintaining the heap property with insertions and removals.

Insertion: percolate up

Inserting an element, elt, means creating a new position in the array. This position is called a "hole" whose initial value is:
hole = ++size;
The operation we do is called "percolate up" in which the hole is moved to the correct position in the tree so that we can do the insertion:
data[hole] = elt
and know that the heap property is maintained:
data[hole/2] ≤ data[hole]
And so, if necessary, the hole moves up the tree from the leaf while the data in the parent is shifted down with an operation like this:
data[hole] = data[hole/2];
hole = hole/2;
For example, the steps to do the insertion of the value 17 in this tree would be this:
  1. Increase the size to 11 and create a hole at position 11 like this:
  2. Comparing 17 with the hole's parent would make the parent value shift down:
  3. Again, comparing 17 with the hole's parent would make the next parent value shift down:
  4. Finally, 17 is larger than the parent value at 1, so it gets inserted into the hole:

Removal: percolate down

Removal means deleting the node at the root, i.e., at position 1. This creates a hole into which we want to eventually insert the last element in the array which must give up its position:
elt = data[size--];
hole = 1;
The hole must move down the tree (percolate down) with an operation like:
hole = 2 * hole;       or        hole = 2 * hole + 1;
until we can set
data[hole] = elt
satisfied that the heap property is maintained:
elt ≤ data[2*hole]
elt ≤ data[2*hole+1]
When a hole goes down from parent to child, always choose a child with a smaller value.

Programmatically, percolate down is a more complicated operation due to having to choose between two children and determining how many children do deal with.

If we do a removal on the tree above (after the insert of 17 is complete) these are the steps;
  1. Return the value and create a hole at position 1. Pull out the value at node 11 (elt=31) and decrease the size to 10:
  2. Seeing that 31 is bigger than the smallest child, 16, move the hole down to position 3 and move 16 up:
  3. Again, seeing that 31 is bigger than the smallest child, 19, move the hole down to its position 6 and move 19 up:
  4. Finally, since the hole has no children we move 31 into position 6:

The code

Create the class:
Class Name: BinaryHeap
package:    heap
Then insert the following content ( click to show )

Test Programs

To test the effectiveness of this new class, in main1, you can uncomment the import line:
//import heap.*;
and replace the commented and uncommented sections:
Queue<Integer> queue = new PriorityQueue<Integer>();
Queue<Integer> queue_rev = new PriorityQueue(20, rev_cmp);

//Queue<Integer> queue = new BinaryHeap<Integer>();
//Queue<Integer> queue_rev = new BinaryHeap<Integer>(20, rev_cmp);
The class code contains non-standard features which illustrate the underlying tree structure. A second test program is better for showing this structure:
  public static void mainTestHeap(String[] args) {

    Queue<Integer> queue = new BinaryHeap<Integer>();

    Random r = new Random();

    for (int i = 0; i < 10; ++i) {
      int x = r.nextInt(100);
      System.out.println("------------ add(" + x + ")");
      queue.add(x);
      ((BinaryHeap)queue).reverseInorderOutput();
    }

    while (!queue.isEmpty()) {
      System.out.println("------------ remove = " + queue.remove());
      ((BinaryHeap)queue).reverseInorderOutput();
    }
  }

Analysis

We see that, since the tree maintained is always a complete binary tree, the height (when non-empty) is flr(log n). This becomes, up to a constant factor, the maximum number of comparisons + data movements for either an element insertion or deletion.

buildHeap operation

We will see that heapSort and other operations give us the raw data in the array and expect us to "heapify" this as the initial step. Although we can create the initial heap by a sequence of inserts, it turns out to be more efficient to use the separate buildHeap operation which is not part of the standard PriorityQueue.

The buildHeap process effectively goes row-by-row, starting from the second-to-last row and does a percolateDown operation, thus making the subtree a heap. Here is a depiction of the process (a "worst-case" situation):
  1. The raw array dara is of size 13 is
    [13, 11, 12, 7, 8, 9, 10, 1, 2, 3, 4, 5, 6]
    Putting these directly in order into the tree gives:
  2. The process starts at node 6 (size/2) and works backward. After nodes 6, 5, 4 are done we have:
  3. After nodes 3, 2 are done we have:
  4. Finally, after node 1 is done:
Here is the actual code we execute:
public void buildHeap(E[] a) {

  // this part sets the size correctly
  int old_capacity = capacity;
  size = capacity = a.length;
  if (capacity > old_capacity) {
    data = (E[]) new Object[capacity + 1];
  }
  for (int i = 0; i < a.length; ++i) {
    data[i+1] = a[i];
  }

  // here is the actual code:
  for (int i = size/2; i > 0; --i) {
    int hole = i;
    E elt = data[hole];
    hole = percolateDown(hole, elt);
    data[hole] = elt;
  }
}

Test program

Here is a test program you can use to see the buildHeap operation isolated:
  public static void mainTestBuildHeap(String[] args) {
    int NUM = 10;
    Integer[] A = new Integer[NUM];
    Random r = new Random();
    for (int i = 0; i < A.length; ++i) {
      A[i] = r.nextInt(NUM * 3);
    }
    System.out.println( Arrays.toString(A) );

    BinaryHeap<Integer> heap = new BinaryHeap<Integer>();

    heap.buildHeap(A);
    heap.reverseInorderOutput();
  }

BuildHeap Time

As discussed in the book, this operation runs in linear time on the number of nodes. This can be proved from two facts about a perfect binary tree (all levels full): The total amount of work can be seen, in the worst case, as a constant times the sum of the heights of all nodes. If the tree were perfect, we express:
n = 2h+1– 1
then log(n+1) = h + 1 and and the sum of the heights would be
n – log(n+1) < n
If the tree is complete, but not necessarily perfect, we can see, from the first point above that
the number of nodes in a
perfect tree of height h+1
  <   2   *   the number of nodes in a
perfect tree of height h
And so, making a complete tree perfect will not quite double the number of nodes. Therefore,
the sum of the heights in a
complete tree with n nodes
  ≤   the sum of the heights of the next
perfect tree (with fewer than 2*n nodes)
  ≤     2 * n
We conclude that the time to do a buildHeap operation is linear. In contrast, if we were to add each element to the heap, one by one, the time would be O(n*log n).

HeapSort

The heapsort algorithm, as can be imagined, uses a binary heap to do its work. The heap is built as a max heap, using a reverse comparator. The following program gives the right idea of how heapsort works in two phases:
  1. buildHeap: take the array and run the buildHeap operation
  2. one at a time, remove from the (top of the) heap, re-inserting the elements into the array from right to left
  public static void mainHeapSort(String[] args) {
    int NUM = 20;

    Comparator<Integer> rev_cmp = new Comparator<Integer>() {
      @Override
      public int compare(Integer lhs, Integer rhs) {
        return rhs.compareTo(lhs);
      }
    };
    
    BinaryHeap<Integer> maxheap = new BinaryHeap<Integer>(NUM, rev_cmp);

    Integer[] A = new Integer[NUM];
    Random r = new Random();
    for (int i = 0; i < A.length; ++i) {
      A[i] = r.nextInt(NUM * 3);
    }
    System.out.println( Arrays.toString(A) );

    maxheap.buildHeap(A);
    
    System.out.println( "---------------------------" );
    maxheap.reverseInorderOutput();
    System.out.println( "---------------------------" );

    int i = A.length;
    while (! maxheap.isEmpty()) {
      A[--i] = maxheap.remove();
    }
    System.out.println( Arrays.toString(A) );
  }

The actual heapsort algorithm uses the array itself as the Binary Heap and thus does not need the heap separate from the array as is the case in our version. The difficulty in writing the actual heapsort algorithm is translating the array index range "fromIndex to toIndex" into the range "1 to size+1" which is natural for a complete binary tree.

HeapMap

A HeapMap is a class which can function as both a Heap in the sense of storing values (with possible dupicates) and also as a Map in which the values are associated to keys. There is no correlate of such a class in the java.util package.

A HeapMap is implemented as a Binary Heap whose elements are (key,value) pairs in which the elements are compared by the value component. It is a map in the sense that the keys are never duplicated. We use this public interface (among other standard simpler functions)
V put(K key, V value);
V get(K key);
Map.Entry<K,V> remove();
It combines the get/put of Map operations, with a revised remove function as might occur in a PriorityQueue in which the "least" element is removed. The put function will either add a new (key,value) pair if the key does not exist, or will modify an existing key's value.

Implementation

From an implementation perspective, the significant new structure to add and maintain is this:
private Map<K,Integer> indexOf = new HashMap<K,Integer>();
This data member maps keys to array positions, i.e., indexOf.get(key) is always the index of the array element containing the pair with that key. This must be continually reset whenever a pair move up or down in the heap.

The important added private member function is:
private V setValue(K key, V value);
which changes the value of an existing key, thereby causing the pair to either: The put operation determines if the key exists (indexOf(key) != null), and if it does, calls setValue. If the key doesn't exist the pair is added to the heap.

Some authors (like the textbook) express the changing a key's value as one of two alternatives decreaseKey or increaseKey, but it seems easier to use the single public put operation with its underlying helper function setValue to detect whether the value is increased or decreased and respond appropriately.

Code and test program

Create the class:
Class Name: HeapMap
package:    heap
Then insert the following content ( click to show )

Here is a sample program which uses type Double as the value, a common usage:
  public static void mainTestHeapMap(String[] args) {
    HeapMap<String, Double> heap = new HeapMap<String, Double>();

    String[] keys = {"A", "B", "C", "D", "E", "F", "G", "H", "I", "J", "K"};

    int x = 10;
    for (String key : keys) {
      heap.put(key, (double) x);
      x += 10;
    }

    heap.reverseInorderOutput();

    System.out.println( "\n---------------- heap.put(\"I\",2.0)" );
    heap.put("I", 2.0);
    heap.reverseInorderOutput();

    System.out.println( "\n---------------- heap.put(\"C\",5.0)" );
    heap.put("C", 5.0);
    heap.reverseInorderOutput();

    System.out.println( "\n---------------- heap.put(\"I\",300.0)" );
    heap.put("I", 300D);
    heap.reverseInorderOutput();

    System.out.println( "\n---------------- heap.put(\"D\",-Infinity)" );
    heap.put("D", Double.NEGATIVE_INFINITY);
    heap.reverseInorderOutput();

    System.out.println( "\n---------------- heap.remove()" );
    heap.remove();
    heap.reverseInorderOutput();
  }
}

As suggested by the last example in the test program, if all heap elements have finite values we can effect removal of a key from a HeapMap object (implemented as a min heap), using this code:
V remove(HeapMap heapmap, K key) {
  V val = heapmap.get(key);
  heapmap.put(key, Double.NEGATIVE_INFINITY);
  heapmap.remove();
  return val;
}


© Robert M. Kline