11.9 Union by Rank

Problem

Path compression makes existing trees flatter, but it does not prevent bad trees from being created.

Consider the following sequence:

union(A,B)
union(B,C)
union(C,D)
union(D,E)
union(E,F)

A naïve implementation might repeatedly attach one root beneath another:

A
|
B
|
C
|
D
|
E
|
F

Path compression eventually fixes this structure, but the first few operations may still be expensive.

Can we avoid creating deep trees in the first place?

The answer is union by rank.

Solution

Union by rank modifies the union operation.

Instead of arbitrarily attaching one root beneath another, we always attach the smaller tree beneath the larger tree.

This prevents excessive depth growth.

The term rank is a measure of tree height.

For each root we maintain:

rank[root]

Initially:

rank = 0

for every vertex.

When merging two trees:

Smaller rank becomes child of larger rank.
Equal ranks require special handling.

This simple rule dramatically improves performance.

Initialization

Suppose we have:

A B C D

Parent table:

Vertex	Parent
A	A
B	B
C	C
D	D

Rank table:

Vertex	Rank
A	0
B	0
C	0
D	0

Every vertex starts as a single-node tree.

Merging Unequal Ranks

Suppose:

A
|
B

Rank:

A = 1

and:

Rank:

C = 0

Execute:

union(A,C)

Attach smaller rank beneath larger rank:

    A
   / \\n  B   C

Ranks remain:

rank(A) = 1

No height increase occurs.

Merging Equal Ranks

Suppose:

and:

Both have:

rank = 0

Execute:

union(A,B)

Choose one root arbitrarily:

A
|
B

Now increase root rank:

rank(A) = 1

because tree height has grown.

This is the only situation where rank increases.

Example Sequence

Start:

A B C D

union(A,B)

A
|
B

Ranks:

A = 1

union(C,D)

C
|
D

Ranks:

C = 1

union(A,C)

Equal ranks:

1 and 1

Attach:

      A
     / \\n    B   C
         |
         D

Increase:

rank(A) = 2

Final height:

Without union by rank, height could have become much larger.

Why It Works

Tree height only increases when two trees of equal rank merge.

This event is relatively rare.

To reach:

rank = 1

a tree needs at least:

vertices.

To reach:

rank = 2

it needs at least:

vertices.

To reach:

rank = 3

it needs at least:

vertices.

To reach:

rank = k

it needs at least:

$$ 2^k $$

vertices.

Therefore:

$$ k \le \log_2 V $$

The depth remains logarithmic.

Size-Based Variant

Many implementations use union by size instead of rank.

Maintain:

size[root]

representing the number of vertices.

When merging:

smaller tree

becomes child of:

larger tree

Example:

size(A)=10
size(B)=3

After:

union(A,B)

attach:

B beneath A

The effect is very similar to union by rank.

Both approaches achieve excellent performance.

Pseudocode

Initialization

for each vertex:

    parent[v] = v
    rank[v] = 0

Union

union(a,b):

    rootA = find(a)
    rootB = find(b)

    if rootA == rootB:
        return

    if rank[rootA] < rank[rootB]:
        parent[rootA] = rootB

    else if rank[rootA] > rank[rootB]:
        parent[rootB] = rootA

    else:
        parent[rootB] = rootA
        rank[rootA]++

This is the standard textbook implementation.

Example from Kruskal

Consider:

A-B
B-C
C-D
D-E

Without union by rank:

A
|
B
|
C
|
D
|
E

Depth:

With union by rank:

      A
    / | \\n   B  C  D
          |
          E

Depth:

Future find operations become much faster.

Interaction with Path Compression

Union by rank prevents deep trees.

Path compression flattens existing trees.

Together:

union by rank
+
path compression

create an extraordinarily efficient structure.

Union by rank is proactive.

Path compression is reactive.

The combination is significantly stronger than either optimization alone.

Visual Comparison

Naïve Union

1
|
2
|
3
|
4
|
5
|
6
|
7

Depth:

Union by Rank

        1
      / | \\n     2  3  4
       / \\n      5   6
           \\n            7

Depth:

After Path Compression

          1
     / / /|\ \\n    2 3 4 5 6 7

Depth:

This demonstrates the power of combining both optimizations.

Theoretical Analysis

Union by rank alone:

$$ O(\log V) $$

per operation.

Path compression alone:

Very strong amortized performance.

Combined:

$$ O(\alpha(V)) $$

amortized.

The inverse Ackermann function grows so slowly that:

Vertices	α(V)
10	3
1,000	4
1,000,000	4
(10^{80})	4
Astronomically large values	< 5

For practical purposes:

Almost constant time

is a reasonable description.

Why Rank Does Not Equal Height

After path compression:

rank

no longer accurately reflects actual height.

Example:

Before compression:

A
|
B
|
C
|
D

Height:

After compression:

    A
  / | \\n B  C  D

Height:

Rank remains unchanged.

Therefore rank is only a heuristic used for balancing.

It is not necessarily the current tree height.

Common Mistakes

Increasing Rank Incorrectly

Ranks increase only when equal-rank trees merge.

Many incorrect implementations increment rank every union.

Ignoring Roots

Always compare:

find(a)
find(b)

not the original vertices.

Mixing Rank and Size

Use one strategy consistently.

Do not partially combine them.

Forgetting Path Compression

Union by rank alone is good.

Union by rank with path compression is substantially better.

Applications

Every high-performance DSU implementation uses:

Path compression
Union by rank or size

Applications include:

Kruskal's algorithm
Dynamic connectivity
Offline graph queries
Image segmentation
Clustering
Network analysis
Percolation simulation

Complexity Summary

Operation	Naïve DSU	Rank + Compression
Find	O(V)	O(α(V))
Union	O(V)	O(α(V))
Space	O(V)	O(V)

This is one of the strongest asymptotic improvements in practical algorithm design.

Recipe

When implementing production-quality union-find:

Store a parent array.
Store a rank or size array.
Use path compression inside find().
Attach smaller-rank trees beneath larger-rank trees.
Increase rank only when equal-rank trees merge.
Always operate on roots.
Expect near-constant amortized performance.

With path compression and union by rank, union-find becomes the foundational connectivity structure used throughout graph algorithms. The next section applies this optimized DSU to a powerful application: answering connectivity queries efficiently in offline graph-processing problems.