Average and Worstcase path lengths
In section 4.3.5, the textbook sets up a recurrence equation for computing the internal path length
D(n) = sum of the depths of all nodes
for a tree created by n random insertions.
By solving this equation (later in the textbook), it can be shown that
the
average internal path length over all binary trees with
n nodes is
D(n) = O(n*log(n))
When we say "all binary trees", we really mean
to generate the trees in this manner:
 Take a sequence of n ordered elements, like the integers 1,...,n.
 Take all permutations of this sequence.
 For each permutation, insert the elements into a binary tree in the order of the sequence (the first goes to the root, etc).
(average total path length for n nodes)/n = O(n*log(n))/n = O(log(n))
This figure from the textbook is an illustration of such
a random tree:
Figure 4.26 from
Weiss textbook
Weiss textbook
0 + 1 + 2 + … + (n1) = (n1)*n/2, i.e., O(n^{2}).
and the average path length is O(n)
The goal of advanced tree data structures is to ameliorate this worstcase scenario
by having its operations "rebalance" the tree under certain circumstances so that
the O(n) depths are avoided or not maintained.
AVL Search Trees
An AVL (AdelsonVelski/Landis) tree is a binary search tree which maintains the following heightbalanced "AVL property" at each node in the tree:abs( (height of left subtree) – (height of right subtree) ) ≤ 1Namely, the left and right subtrees are of equal height, or their heights differ by 1. Recall that the height of a tree is the maximum depth (from the root) of any node. A tree with one node has height 0. We will say that the empty (null) tree has height 1. It can be proved that an AVL tree with n nodes has height O(log(n)), and so any n search/insert/delete operations ensuring worstcase search cost of O(log(n)). The key idea behind the AVL tree is how a subtree is rebalanced when a node insertion or removal causes the AVL property to fail. Like the textbook, we will consider only insertions.
Rebalancing Strategies
Suppose that a node satisfies the AVL property and that an add goes into the left subtree. There will be two separate cases to consider: the add goes into the leftleft subtree
 the add goes into the leftright subtree
 single rotate (from the) left
 double rotate (from the) left
AVL Rebalancing Operations
The recursive add operation goes down the tree to the insertion point. Any AVLoutofbalance problem will be "discovered" and dealt with as you come out of the recursive call. Consider the first such "outofbalance" discovery and subsequent rebalancing operation. We want to argue two things: the rebalanced tree satisfies the AVL property at each node, and
 the height of the rebalanced tree is the height of tree prior to insertion.
single rotate (from the) left, on a leftleft insertion
This diagram depicts the rebalance operation after an unbalancing leftleft insertion.Proof: Let x, y, z be the heights of the trees X, Y, Z, respectively after insertion. Assume that the insertion into X caused an unbalancing at k_{2} (but not at k_{1} or at any node in X). We gather this information:
 since the insertion into X caused an unbalancing height change for the full tree, the full tree's height after insertion must be x + 2, and thus x + 1 before insertion.
 x must be ≥ than y + 1, otherwise the height of the left subtree at k_{2} would not have increased
 x cannot be greater than y + 1, otherwise the "AVL problem" would have been observed at k_{1}; conclude that x = y + 1
 x + 1 ≥ z + 2: otherwise there would be no problem at k_{2}
 x + 1 cannot be greater than z + 2, otherwise the left side height prior to insertion (x) would be greater than z + 1, meaning that k_{2} was unbalanced prior to insertion; conclude that x + 1 = z + 2, i.e., x = z + 1
x = y + 1, x = z + 1, and so y = z.For the rebalanced tree:
 at k_{2}, left side height = y = z = right side height
 at k_{1}, left side height = x = y + 1 = rightside height
 height of the tree = x + 1 = height before insertion
double rotate (from the) left, on a leftright insertion
This diagram depicts the rebalance operation after an unbalancing leftright insertion.Proof: Let a, b, c, d be the heights of the trees A, B, C, D, respectively after insertion. There are 3 cases to consider:
 the inserted element is k_{2}
 the inserted element goes into B
 the inserted element goes into C
 since the insertion into B caused an unbalancing height change of the full tree, the full tree's height after insertion must be b + 3, and thus b + 2 before insertion.
 b must be > c, otherwise the left subtree at k_{3} would not have changed height.
 b cannot be > c + 1, otherwise there would be a problem at k_{2}; conclude that b = c + 1
 b + 1 must be > a, otherwise the left subtree at k_{3} would not have changed height.
 b + 1 cannot be > a + 1, otherwise there would be a problem at k_{1}; conclude that b + 1 = a + 1, i.e., b = a
 in order for there to be a "problem" at k_{3}, we must have leftside height ≥ rightside height + 2, i.e., b + 2 ≥ d + 2

we cannot have b + 2 > d + 2, otherwise
b + 1 = left side height before insertion ≥ d + 2 = right side height before insertion + 2,meaning that the tree would have been unbalanced before insertion; conclude that b + 2 = d + 2, i.e., b = d
d = a = b = c + 1For the rebalanced tree:
 at k_{1}, left side height = a = b = right side height
 at k_{3}, left side height = c, rightside height = d = c + 1
 at k_{2}, left side height = b + 1 = rightside height = d + 1
 height of the tree = b + 2 = height before insertion
Examples
An important example of AVL trees is the behavior on a worstcase add sequence for regular binary trees:1, 2, 3, 4, 5, 6, 7All insertions are rightright and so rotations are all single rotate from the right. All but two insertions require rebalancing:
at 1 ⇒  at 3 ⇒ 
at 2 ⇒  at 5 ⇒ 
single rot. left at 50 ⇒ 
double rot. left at 10 ⇒ 
single rot. left at 25 ⇒ 
add(30), add(20), add(8) need no rebalancing 
double rot. right at 7 ⇒ 
Demo program
The class we wish to create, AVLTreeSet is an improvement of the SearchTreeSet class in the sense of always maintaining an O(log(n)) search time. For demonstration purpose, we will assume you have set up the TreeDemo project used in the TreeSets document. Within the tree package, create a Java Class:Class Name: AVLTreeSet package: treeThen insert the following content
AVL Tree Code
In order to keep track of height efficiently, the Node class is expanded to add a height field:private class Node<E> { E data; Node<E> left, right; int height; Node(E data, Node<E> left, Node<E> right, int height) { this.data = data; this.left = left; this.right = right; this.height = height; } Node(E data, Node<E> left, Node<E> right) { this(data, left, right, 0); } }
private int height(Node<E> p) { return (p == null) ? 1 : p.height; }
private static class MutableBoolean { boolean value; }
We start with code like this:
int comp = myCompare(elt, n.data); if (comp < 0) { // insertion goes left n.left = add(elt, n.left, found); ...
if (height(n.left)  height(n.right) == 2) {
int comp1 = myCompare(elt, n.left.data); if (comp1 < 0) { return s_rotate_left(n); // leftleft insertion } else { return d_rotate_left(n); // leftright insertion }
return s_rotate_left(n); // leftleft insertion
private Node<E> s_rotate_left(Node<E> n) { Node<E> k2 = n; Node<E> k1 = k2.left; int x = k1.left.height; // k1.left cannot be null k2.left = k1.right; k1.right = k2; k2.height = x; k1.height = x + 1; return k1; }
n = add(elt, n, found);
n = s_rotate_left(n);