7.11 Balanced Trees

A binary search tree provides an elegant framework for maintaining ordered data.

7.11 Balanced Trees

A binary search tree provides an elegant framework for maintaining ordered data. Search, insertion, deletion, predecessor queries, successor queries, and range scans all follow naturally from the ordering relationship between nodes.

There is one problem.

Nothing prevents a binary search tree from becoming badly shaped.

Consider inserting values in sorted order:

1
2
3
4
5
6
7

A naïve binary search tree becomes:

1
 \\n  2
   \\n    3
     \\n      4
       \\n        5
         \\n          6
           \\n            7

The structure now behaves like a linked list.

Search complexity degrades from:

O(log n)

to:

O(n)

Balanced trees solve this problem by actively maintaining a shape whose height remains logarithmic as elements are inserted and deleted.

Nearly every ordered map, database index, scheduler, filesystem directory structure, and storage engine relies on some form of balancing.

Problem

Suppose we build a binary search tree from sorted input.

10
20
30
40
50
60
70

Insertion sequence:

10
 \\n 20
   \\n   30
     \\n     40
       \\n       50
         \\n         60
           \\n           70

Searching for:

70

requires visiting every node.

The binary search tree has lost the logarithmic behavior that made it attractive.

We need a mechanism that preserves ordering while preventing extreme shapes.

Solution

Maintain balance after updates.

A balanced tree guarantees that the height grows proportionally to:

log n

rather than:

n

Many balancing strategies exist:

  • AVL trees
  • Red-black trees
  • Treaps
  • B-trees
  • AA trees
  • Weight-balanced trees

Although the details differ, the goal is always the same:

Prevent the tree from becoming too tall.

When height remains logarithmic, search operations remain logarithmic as well.

Why Height Matters

Consider this tree:

        40
       /  \\n     20    60
    / \    / \\n  10 30 50 70

Height:

3

Searching for:

70

requires:

40 → 60 → 70

Only three comparisons.

Now compare:

10
 \\n 20
   \\n   30
     \\n      40
       \\n        50
         \\n          60
           \\n            70

Height:

7

Searching for:

70

requires seven comparisons.

The ordering property remains intact, but performance collapses.

Height determines efficiency.

Perfect Balance

A perfectly balanced tree looks like:

         8
       /   \\n      4     12
     / \    / \\n    2   6 10 14

Every level is nearly full.

For:

15 nodes

the height is:

4

In general:

height ≈ log₂(n)

This logarithmic height is what balanced trees attempt to preserve.

Most balancing schemes do not require perfect balance. They only enforce enough structure to keep height within a constant factor of optimal.

Rotations

The fundamental balancing operation is a rotation.

Suppose:

    10
      \\n      20
        \\n        30

The tree is becoming skewed.

Perform a left rotation:

      20
     /  \\n   10    30

The ordering relationship remains unchanged.

An in-order traversal still produces:

10 20 30

The shape improves dramatically.

Rotations allow a tree to reorganize itself without violating ordering constraints.

Left Rotation

Before:

    x
     \\n      y
     / \\n    B   C

After rotation:

      y
     / \\n    x   C
   / \\n  A   B

The ordering property remains valid.

Every value in:

A < x < B < y < C

continues to satisfy binary search tree requirements.

This makes rotations safe and efficient.

Right Rotation

The mirror operation is a right rotation.

Before:

      y
     /
    x
   / \\n  A   B

After:

      x
     / \\n    A   y
       / \\n      B   C

Again, ordering remains unchanged.

Rotations are local operations.

Only a few nodes change relationships.

This locality makes balancing practical.

Balance Versus Perfect Balance

Balanced trees do not attempt to maintain perfect symmetry.

For example:

        50
       /  \\n     20    80
    /     /
   10    70

The tree is not perfectly balanced.

However, its height remains logarithmic.

Most balancing algorithms care about height bounds rather than visual symmetry.

This distinction is important.

Perfect balance would require excessive restructuring after every update.

Practical balance requires only enough adjustment to preserve performance guarantees.

Searching a Balanced Tree

Consider:

         40
       /    \\n      20     60
     / \    / \\n   10 30 50 70

Search for:

50

Path:

40
60
50

Three comparisons.

If the tree contains:

1,000,000

elements and remains balanced, height remains roughly:

20

Search operations therefore require approximately twenty comparisons.

This is the same asymptotic behavior as binary search on a sorted array.

Insertion with Rebalancing

Suppose:

    10
      \\n      20
        \\n        30

Insertions have produced imbalance.

A balancing algorithm detects the excessive height difference.

Rotation:

      20
     /  \\n   10    30

Restores acceptable structure.

The key insight is that balancing occurs only near the insertion path.

Most of the tree remains untouched.

This keeps updates efficient.

Deletion with Rebalancing

Deletion can also create imbalance.

Consider:

        40
       /  \\n     20    60
    / \\n   10 30

Removing nodes may cause one side of the tree to become significantly taller than the other.

Balancing algorithms detect the violation and apply rotations as necessary.

Deletion is usually more complex than insertion because multiple structural cases must be handled.

Nevertheless, the overall complexity remains logarithmic.

Arrays Versus Balanced Trees

Both sorted arrays and balanced trees provide logarithmic search.

Their update behavior differs significantly.

Operation Sorted Array Balanced Tree
Search O(log n) O(log n)
Insert O(n) O(log n)
Delete O(n) O(log n)
Range query O(log n + k) O(log n + k)
Cache locality Excellent Moderate

Arrays often win for read-heavy workloads.

Balanced trees win when updates occur frequently.

Choosing between them depends on workload characteristics.

Real-World Examples

Balanced trees appear throughout systems software.

Examples include:

  • Database indexes
  • Memory allocators
  • Operating system schedulers
  • Filesystem metadata indexes
  • Routing tables
  • Event queues
  • Language runtime libraries
  • Ordered collections

Although implementations differ, the underlying idea remains the same: maintain order while preserving logarithmic height.

Why Not Rebuild the Tree?

One possible strategy is:

  1. Allow imbalance.
  2. Periodically rebuild the entire tree.

This works in some applications.

However, rebuilding costs:

O(n)

and can introduce latency spikes.

Balanced trees distribute the maintenance cost across updates.

Each insertion or deletion performs a small amount of restructuring.

The result is predictable performance.

Height Bounds

The precise height bound depends on the balancing scheme.

Typical guarantees include:

Structure Height Bound
AVL Tree Very strict
Red-Black Tree Less strict
Treap Probabilistic
B-Tree Disk optimized

All maintain:

O(log n)

height.

The difference lies primarily in constants and implementation complexity.

Complexity Analysis

For a balanced tree containing:

n elements

height remains:

O(log n)

Therefore:

Operation Complexity
Search O(log n)
Insert O(log n)
Delete O(log n)
Lower bound O(log n)
Upper bound O(log n)
Predecessor O(log n)
Successor O(log n)

These guarantees make balanced trees one of the most versatile ordered data structures.

Common Mistakes

A common misconception is that binary search trees are automatically efficient.

Only balanced trees provide logarithmic guarantees.

Another mistake is focusing exclusively on node ordering while ignoring height. Correct ordering alone does not imply good performance.

A third mistake is assuming balance means symmetry. Many balanced trees appear visually uneven while still maintaining excellent height bounds.

Finally, some developers overestimate the cost of balancing. Rotations modify only a small number of nodes and are surprisingly inexpensive compared to the performance benefits they provide.

Takeaway

Balanced trees preserve the ordering advantages of binary search trees while preventing pathological shapes that degrade performance. Through local restructuring operations such as rotations, they maintain logarithmic height despite arbitrary insertion and deletion patterns. This combination of ordered access, efficient updates, and predictable performance makes balanced trees one of the foundational data structures of modern software systems. The next sections explore specific balancing strategies, beginning with the simplest and most rigorous approach: AVL trees.