AVL Trees
— print (last updated: Nov 2, 2009) print

Select font size:
The class we wish to create, AVLTreeSet is an "improvement" of the SearchTreeSet class. It can be installed directly into the tset package of the TSetDemo project described in the TreeSets + SearchTreeSet document.

Discussion

Recall that the height of a tree is the maximum depth (from the root) of any node. A tree with one node has height 0. We will say that the empty (null) tree has height -1. An AVL (Adelson-Velski/Landis) tree is a binary search tree which maintains the following height-balanced "AVL property" at each node in the tree:
abs( (height of left subtree) – (height of right subtree) ) ≤ 1
Namely, the left and right subtrees are of equal height, or their heights differ by 1.

The key idea behind the AVL tree is how a subtree is rebalanced when an node insertion or removal causes the AVL property to fail. We will consider only insertions.

Rebalancing Strategy

Assuming Node n satisfies the AVL property and that an insertion goes into the left subtree. If inserted node is left child of n, then the AVL property will still hold (think about it). Otherwise, either If either of these cause the left subtree to increase in height such that the increased height causes the AVL property to fail, we perform one of the following two respective rotation operations: In either case, the new tree will be balanced (i.e., the AVL property satisfied) and will have height equal to the height of the tree prior to insertion. This means that an insertion from the root down will only ever cause one rebalancing rotation to occur to reestablish the AVL property for the entire tree.

Rebalancing Operations

Recursive insertion goes down the tree to the insertion point. Any AVL-out-of-balance problem will be "discovered" and dealt with as you come out of the recursive call. So we assume we're talking about the first such discovery. The implication is that when there is a "AVL problem" at a node, there cannot be a problem at any nodes in the subtree rooted at that node.

We want to argue two things:

In particular, only one rebalance ever needs to be done per insertion.

Aside from symmetric situations, there are two distinct cases to consider. Suppose the insertion "goes left." It's trivial to see that if the inserted node is the left child, then the new tree cannot be unbalanced. Therefore the cases are based on whether the insertion goes to the left or right of the left child.
1. single rotate (from the) left, on a left-left insertion
This diagram depicts the rebalance operation after an unbalancing left-left insertion.

Let x, y, z be the heights of the trees X, Y, Z, respectively after insertion. Assume that the insertion into X caused an unbalancing at k2 (but not at k1 or at any node in X). We gather this information:
  1. since the insertion into X caused an unbalancing height change for the full tree, the full tree's height after insertion must be x + 2, and thus x + 1 before insertion.
  2. x must be ≥ than y + 1, otherwise the height of the left subtree at k2 would not have increased
  3. x cannot be greater than y + 1, otherwise the "AVL problem" would have been observed at k1; conclude that x = y + 1
  4. x + 1 ≥ z + 2: otherwise there would be no problem at k2
  5. x + 1 cannot be greater than z + 2, otherwise the left side height prior to insertion (x) would be greater than z + 1, meaning that k2 was unbalanced prior to insertion; conclude that x + 1 = z + 2, i.e., x = z + 1
Putting these facts together:
x = y + 1, x = z + 1, and so y = z.
For the rebalanced tree:
  1. at k2, left side height = y = z = right side height
  2. at k1, left side height = x = y + 1 = right-side height
  3. height of the tree = x + 1 = height before insertion
2. double rotate (from the) left, on a left-right insertion
This diagram depicts the rebalance operation after an unbalancing left-right insertion.

Let a, b, c, d be the heights of the trees A, B, C, D, respectively after insertion. There are 3 cases to consider:
  1. the inserted element is k2
  2. the inserted element goes into B
  3. the inserted element goes into C
In the first case, with a bit of pondering, you can convince yourself that the tree can consist only of the three nodes k1, k2 and k3 with all other subtrees empty. We will show the proof for the second case where the insertion goes into the B subtree; the proof of insertion into the C subtree is symmetric.
  1. since the insertion into B caused an unbalancing height change of the full tree, the full tree's height after insertion must be b + 3, and thus b + 2 before insertion.
  2. b must be > c, otherwise the left subtree at k3 would not have changed height.
  3. b cannot be > c + 1, otherwise there would be a problem at k2; conclude that b = c + 1
  4. b + 1 must be > a, otherwise the left subtree at k3 would not have changed height.
  5. b + 1 cannot be > a + 1, otherwise there would be a problem at k1; conclude that b + 1 = a + 1, i.e., b = a
  6. in order for there to be a "problem" at k3, we must have left-side height ≥ right-side height + 2, i.e., b + 2 ≥ d + 2
  7. we cannot have b + 2 > d + 2, otherwise
    b + 1 = left side height before insertion ≥ d + 2 = right side height before insertion + 2,
    meaning that the tree would have been unbalanced before insertion; conclude that b + 2 = d + 2, i.e., b = d
Putting these facts together:
d = a = b = c + 1
For the rebalanced tree:
  1. at k1, left side height = a = b = right side height
  2. at k3, left side height = c, right-side height = d = c + 1
  3. at k2, left side height = b + 1 = right-side height = d + 1
  4. height of the tree = b + 2 = height before insertion

AVL Tree Code

For demonstration purpose, we will assume you have set up the TSetDemo project used in the TreeSets + SearchTreeSet document. This defines a package tset containing the class SearchTreeSet definining a standard binary search tree.

Within the tset package, create a Java Class:
Class Name: AVLTreeSet
package:    tset
Then insert the following content ( click to show )

In order to keep track of height efficiently, the Node class is expanded to add a height field:
private class Node {
  E data;
  Node left, right;
  int height;

  Node(E data, Node left, Node right, int height) {
    this.data = data;
    this.left = left;
    this.right = right;
    this.height = height;
  }

  Node(E data, Node left, Node right) {
    this(data, left, right, 0);
  }
}
A private member function computes the height of node so that null's height can also be computed:
private int height(Node p) { return (p == null) ? -1 : p.height; }
Let's consider how the usage of the single rotate left will take place. As in the regular search tree, we start with code like this:
int comp = myCompare(elt, n.data);
if (comp < 0) {                     // insertion goes left
   n.left = add(elt, n.left, found);
Next, add the test for being out of balance:
   if (height(n.left) - height(n.right) == 2) {
Assuming this is the case, we decide between single-rotate-left and double-rotate-left operations:
     int comp1 = myCompare(elt, n.left.data);
     if (comp1 < 0) {
       return s_rotate_left(n); // left-left insertion
     } else {
       return d_rotate_left(n); // left-right insertion
     }
   }
We are, of course, assuming that n.left != null, but this is implied by the fact that height(n.left) - height(n.right) == 2. Suppose the left-left insertion took place so that we call:
       return s_rotate_left(n); // left-left insertion
The function code attempts to mimick the effects indicated in the diagram above.
  private Node s_rotate_left(Node n) {
    Node k2 = n;
    Node k1 = k2.left;

    int x = k1.left.height;  // k1.left cannot be null

    k2.left = k1.right;
    k1.right = k2;

    k2.height = x;
    k1.height = x + 1;

    return k1;
  }
The original call was:
n = add(elt, n, found);
The overall effect on the node alone would be:
n = s_rotate_left(n);
The element elt would, of course, not be found, i.e., found.set(false), and the new Node holding its value would end up in the left subtree of n.

Demo programs

After installing the AVLTreeSet class, make it available to the treedemo.Main class by changing the import line:
import tset.*;

Illustration of Rotations

  public static void mainAVL1(String[] args) {
    Set<Integer> tree = new AVLTreeSet<Integer>();

    ((AVLTreeSet) tree).setShowRotations(true);

    int[] adds;

    System.out.println("============ EXAMPLE\n");
    adds = new int[] {80, 40, 20, 30, 25, 35};
    for (int i : adds) {
      System.out.println("---------- add " + i);
      tree.add(i);
      ((AVLTreeSet) tree).reverseInorderOutput();
      System.out.println();
    }

    System.out.println("============ EXAMPLE: sequential\n");
    tree.clear();

    adds = new int[] {10, 20, 30, 40, 50, 60, 70};
    for (int i : adds) {
      System.out.println("---------- add " + i);
      tree.add(i);
      ((AVLTreeSet) tree).reverseInorderOutput();
      System.out.println();
    }

    System.out.println("============ EXAMPLE: random\n");
    tree.clear();

    Random rnd = new Random();
    adds = new int[7];
    for (int i = 0; i < adds.length; ++i) { adds[i] = 1 + rnd.nextInt(50); }
    for (int i : adds) {
      System.out.println("---------- add " + i);
      tree.add(i);
      ((AVLTreeSet) tree).reverseInorderOutput();
      System.out.println();
    }
  }

Timing Comparison

These next programs give a comparison of how well an AVL tree competes against a regular Binary Search tree for random insertion patterns.
  public static void mainAVL2(String[] args) {
    Random rand = new Random();
    int SIZE = 40000;
    int RANDMAX;
    RANDMAX = 10 * SIZE;  // "few" duplicates in the list
    RANDMAX = SIZE / 10;    // "many" duplicates in the list
    int[] L = new int[SIZE];
    for (int i = 0; i < L.length; ++i) {
      L[i] = rand.nextInt(RANDMAX);
    }
    long begin = 0, search_tree_done = 0, avl_tree_done = 0;
    Set<Integer> st_set = new SearchTreeSet<Integer>();
    Set<Integer> avl_set = new AVLTreeSet<Integer>();

    for (int i = 0; i < 3; ++i) {
      st_set.clear();
      avl_set.clear();

      begin = System.currentTimeMillis();

      for (int n : L) {
        st_set.add(n);
      }

      search_tree_done = System.currentTimeMillis();

      for (int n : L) {
        avl_set.add(n);
      }

      avl_tree_done = System.currentTimeMillis();
    }

    System.out.println(
      "list size: " + L.length + "\n" +
      "RANDMAX:   " + RANDMAX + "\n" +
      "set size:  " + st_set.size() + "\n" +
      "time for search tree: " + (search_tree_done - begin) + "ms\n" +
      "time for avl tree:    " + (avl_tree_done - search_tree_done) + "ms\n" +
      "");

  }

You may be disappointed to see that the AVL tree is not fantastically much better than the binary tree. Of course there are circumstances in which the AVL tree will do much better.


© Robert M. Kline