The class we wish to create,
AVLTreeSet is an "improvement" of the
SearchTreeSet class.
It can be installed directly into the tset package
of the TSetDemo project described in the
TreeSets + SearchTreeSet document.
Discussion
Recall that the height of a tree is the maximum depth (from the root)
of any node. A tree with one node has height 0. We will say that the
empty (null) tree has height -1.
An AVL (Adelson-Velski/Landis) tree
is a binary search tree which maintains the following height-balanced
"AVL property" at each node in the tree:
abs( (height of left subtree) (height of right subtree) ) ≤ 1
Namely, the left and right subtrees are of equal height, or their heights
differ by 1.
The key idea behind the AVL tree is how a subtree is rebalanced when an node
insertion or removal causes the AVL property to fail. We will consider only
insertions.
Rebalancing Strategy
Assuming Node n satisfies the AVL property and that
an insertion goes into the left subtree. If inserted node
is left child of n, then the AVL property will still hold
(think about it).
Otherwise, either
the insertion goes into the left-left subtree
the insertion goes into the left-right subtree
If either of these cause the left subtree to increase in height
such that the increased height causes the AVL property to fail,
we perform one of the following two respective
rotation operations:
single rotate (from the) left
double rotate (from the) left
In either case, the new tree will be balanced (i.e., the AVL property satisfied)
and will have height equal to the height of the tree prior to insertion. This means
that an insertion from the root down will only ever cause one rebalancing rotation
to occur to reestablish the AVL property for the entire tree.
Rebalancing Operations
Recursive insertion goes down the tree to the insertion point.
Any AVL-out-of-balance problem will be "discovered" and dealt with
as you come out of the recursive call.
So we assume we're talking about the first such discovery.
The implication is that when there is a "AVL problem" at a node, there cannot
be a problem at any nodes in the subtree rooted at that node.
We want to argue two things:
the rebalanced tree satisfies the AVL property at each node, and
the height of the rebalanced tree
is the height of tree prior to insertion.
In particular, only one rebalance ever needs to be done per insertion.
Aside from symmetric situations, there are two distinct cases to
consider. Suppose the insertion "goes left."
It's trivial to
see that if the inserted node is the left child,
then the new tree cannot be unbalanced. Therefore the cases are
based on whether the insertion goes to the left or right
of the left child.
1. single rotate (from the) left, on a left-left insertion
This diagram depicts the rebalance
operation
after an unbalancing left-left insertion.
Let x, y, z
be the heights of the trees
X, Y, Z, respectively after insertion. Assume that the insertion
into X caused an unbalancing at k2 (but not
at k1 or at any node in X).
We gather this information:
since the insertion into X caused an unbalancing height
change for the full tree, the full tree's height
after insertion must be x + 2,
and thus x + 1 before insertion.
x must be ≥ than y + 1,
otherwise the height of the left subtree at
k2 would not have increased
x cannot be greater than y + 1, otherwise the
"AVL problem" would have been observed at k1;
conclude that x = y + 1
x + 1 ≥ z + 2: otherwise there would be
no problem at k2
x + 1 cannot be greater than z + 2,
otherwise the left side height prior to insertion (x)
would be greater than z + 1, meaning that k2 was
unbalanced prior to insertion;
conclude that x + 1 = z + 2, i.e.,
x = z + 1
Putting these facts together:
x = y + 1, x = z + 1, and so y = z.
For the rebalanced tree:
at k2, left side height = y = z = right side height
at k1, left side height = x
= y + 1 =
right-side height
height of the tree = x + 1 = height before insertion
2. double rotate (from the) left, on a left-right insertion
This diagram depicts the rebalance
operation
after an unbalancing left-right insertion.
Let a, b, c, d
be the heights of the trees A, B, C, D,
respectively after insertion. There are 3 cases to consider:
the inserted element is k2
the inserted element goes into B
the inserted element goes into C
In the first case, with a bit of pondering, you can convince
yourself that the tree can consist only of the three nodes
k1,
k2
and
k3 with all
other subtrees empty.
We will show the proof for the second case
where the insertion goes into the B subtree;
the proof of insertion into the C subtree is symmetric.
since the insertion into B caused an unbalancing height
change of the full tree, the full tree's height
after insertion must be b + 3,
and thus b + 2 before insertion.
b must be > c, otherwise the left subtree at
k3 would not have changed height.
b cannot be > c + 1, otherwise there
would be a problem at k2;
conclude that b = c + 1
b + 1 must be > a, otherwise
the left
subtree at k3 would not have changed height.
b + 1 cannot be > a + 1, otherwise there
would be a problem at k1;
conclude that b + 1 = a + 1, i.e., b = a
in order for there to be a "problem" at k3, we
must have left-side height ≥ right-side height + 2, i.e.,
b + 2 ≥ d + 2
we cannot have b + 2 > d + 2, otherwise
b + 1 = left side height before insertion ≥
d + 2 = right side height before insertion + 2,
meaning that
the tree would have been unbalanced before insertion;
conclude that b + 2 = d + 2, i.e., b = d
Putting these facts together:
d = a = b = c + 1
For the rebalanced tree:
at k1, left side height = a = b = right side height
at k3, left side height = c,
right-side height = d = c + 1
at k2, left side height = b + 1
= right-side height = d + 1
height of the tree = b + 2 = height before insertion
AVL Tree Code
For demonstration purpose, we will assume you have set up the TSetDemo project
used in the TreeSets + SearchTreeSet document.
This defines a package tset containing the class SearchTreeSet definining a
standard binary search tree.
Within the tset package, create a Java Class:
Class Name: AVLTreeSet
package: tset
Then insert the following content
( click to show )
In order to keep track of height efficiently, the Node class
is expanded to add a height field:
We are, of course, assuming that n.left != null, but this is
implied by the fact that
height(n.left) - height(n.right) == 2.
Suppose the left-left insertion took place so that we call:
return s_rotate_left(n); // left-left insertion
The function code attempts to mimick the effects indicated
in the diagram above.
private Node s_rotate_left(Node n) {
Node k2 = n;
Node k1 = k2.left;
int x = k1.left.height; // k1.left cannot be null
k2.left = k1.right;
k1.right = k2;
k2.height = x;
k1.height = x + 1;
return k1;
}
The original call was:
n = add(elt, n, found);
The overall effect on the node alone would be:
n = s_rotate_left(n);
The element elt would, of course, not be found, i.e.,
found.set(false), and the new Node holding its value would
end up in the left subtree of n.
Demo programs
After installing the AVLTreeSet class, make it available to
the treedemo.Main class by changing the import line:
import tset.*;
Illustration of Rotations
public static void mainAVL1(String[] args) {
Set<Integer> tree = new AVLTreeSet<Integer>();
((AVLTreeSet) tree).setShowRotations(true);
int[] adds;
System.out.println("============ EXAMPLE\n");
adds = new int[] {80, 40, 20, 30, 25, 35};
for (int i : adds) {
System.out.println("---------- add " + i);
tree.add(i);
((AVLTreeSet) tree).reverseInorderOutput();
System.out.println();
}
System.out.println("============ EXAMPLE: sequential\n");
tree.clear();
adds = new int[] {10, 20, 30, 40, 50, 60, 70};
for (int i : adds) {
System.out.println("---------- add " + i);
tree.add(i);
((AVLTreeSet) tree).reverseInorderOutput();
System.out.println();
}
System.out.println("============ EXAMPLE: random\n");
tree.clear();
Random rnd = new Random();
adds = new int[7];
for (int i = 0; i < adds.length; ++i) { adds[i] = 1 + rnd.nextInt(50); }
for (int i : adds) {
System.out.println("---------- add " + i);
tree.add(i);
((AVLTreeSet) tree).reverseInorderOutput();
System.out.println();
}
}
Timing Comparison
These next programs
give a comparison of how well an AVL tree competes against a regular
Binary Search tree for random insertion patterns.
public static void mainAVL2(String[] args) {
Random rand = new Random();
int SIZE = 40000;
int RANDMAX;
RANDMAX = 10 * SIZE; // "few" duplicates in the list
RANDMAX = SIZE / 10; // "many" duplicates in the list
int[] L = new int[SIZE];
for (int i = 0; i < L.length; ++i) {
L[i] = rand.nextInt(RANDMAX);
}
long begin = 0, search_tree_done = 0, avl_tree_done = 0;
Set<Integer> st_set = new SearchTreeSet<Integer>();
Set<Integer> avl_set = new AVLTreeSet<Integer>();
for (int i = 0; i < 3; ++i) {
st_set.clear();
avl_set.clear();
begin = System.currentTimeMillis();
for (int n : L) {
st_set.add(n);
}
search_tree_done = System.currentTimeMillis();
for (int n : L) {
avl_set.add(n);
}
avl_tree_done = System.currentTimeMillis();
}
System.out.println(
"list size: " + L.length + "\n" +
"RANDMAX: " + RANDMAX + "\n" +
"set size: " + st_set.size() + "\n" +
"time for search tree: " + (search_tree_done - begin) + "ms\n" +
"time for avl tree: " + (avl_tree_done - search_tree_done) + "ms\n" +
"");
}
You may be disappointed to see that the AVL tree is not
fantastically much better
than the binary tree.
Of course there are circumstances in which the AVL tree
will do much better.