8.17 Heavy-Light Decomposition

Many tree problems ask questions about paths.

Examples include:

Sum of values along a path
Maximum edge weight between two nodes
Minimum value between two vertices
Path updates
Network latency queries
Organizational hierarchy analysis

A subtree query is relatively easy after an Euler Tour because every subtree becomes a contiguous interval.

Path queries are more difficult.

Consider:

0
├── 1
│   ├── 3
│   └── 4
└── 2
    ├── 5
    └── 6

The path:

4 → 1 → 0 → 2 → 5

is not a contiguous interval in DFS order.

A simple flattening is no longer sufficient.

Heavy-Light Decomposition (HLD) solves this problem by decomposing the tree into a small number of chains. Once the decomposition is built, path queries become a sequence of interval queries that can be handled efficiently by Segment Trees or Fenwick Trees.

Heavy-Light Decomposition is one of the most important advanced tree techniques because it transforms difficult path problems into familiar range-query problems.

Problem

You need efficient path queries or path updates on a tree.

Examples:

Sum(u,v)
Max(u,v)
Min(u,v)
Update(u,v)

Repeated traversal of the path is too expensive.

Solution

Divide the tree into heavy and light edges.

For every node, select the child with the largest subtree.

That edge becomes:

Heavy

All remaining child edges become:

Light

Consider:

          0
        /   \\n       1     2
      / \   / \\n     3   4 5   6

Subtree sizes:

One possible decomposition:

0 === 1 === 3

2 === 5

4

6

Heavy edges are shown with:

===

These edges form chains.

The remarkable property is that any root-to-node path crosses only a logarithmic number of light edges.

This makes efficient path processing possible.

Discussion

The entire technique rests on one observation.

Suppose:

child size ≤ parent size / 2

for every light edge.

Crossing a light edge therefore reduces the remaining subtree size by at least half.

Example:

1000
→ 500
→ 250
→ 125
→ 62
...

After only:

O(log n)

light-edge crossings, the subtree size reaches one.

This implies:

Any path contains at most O(log n) light edges.

That property drives the complexity analysis of HLD.

First DFS: Subtree Sizes

The decomposition begins by computing subtree sizes.

func dfsSize(
    tree [][]int,
    node int,
    parent int,
    size []int,
    heavy []int,
) {
    size[node] = 1
    heavy[node] = -1

    bestSize := 0

    for _, child := range tree[node] {
        if child == parent {
            continue
        }

        dfsSize(
            tree,
            child,
            node,
            size,
            heavy,
        )

        size[node] += size[child]

        if size[child] > bestSize {
            bestSize = size[child]
            heavy[node] = child
        }
    }
}

After this DFS:

heavy[node]

stores the heavy child.

Second DFS: Build Chains

Assign positions in the decomposed structure.

func dfsDecompose(
    tree [][]int,
    node int,
    parent int,
    head int,
    heavy []int,
    chainHead []int,
    position []int,
) {
    chainHead[node] = head

    position[node] = currentPos
    currentPos++

    if heavy[node] != -1 {
        dfsDecompose(
            tree,
            heavy[node],
            node,
            head,
            heavy,
            chainHead,
            position,
        )
    }

    for _, child := range tree[node] {
        if child == parent ||
           child == heavy[node] {
            continue
        }

        dfsDecompose(
            tree,
            child,
            node,
            child,
            heavy,
            chainHead,
            position,
        )
    }
}

Heavy children remain in the same chain.

Light children start new chains.

Flattened Representation

After decomposition:

Node	Position
0	0
1	1
3	2
4	3
2	4
5	5
6	6

Each heavy chain becomes a contiguous interval.

This allows a Segment Tree to be built over the positions.

The tree is now represented as an array.

Path Queries

Suppose we need:

Sum(4,5)

Path:

4 → 1 → 0 → 2 → 5

The nodes belong to different chains.

Instead of traversing the path directly:

Move the deeper chain upward.
Query that chain segment.
Repeat until both nodes share a chain.

Implementation outline:

func queryPath(
    u int,
    v int,
) int

The algorithm repeatedly climbs chain heads.

Each step removes an entire chain segment.

Climbing Chains

Suppose:

u chain head = 4
v chain head = 0

If:

depth(head(u))
>
depth(head(v))

query:

head(u) → u

and move:

u = parent(head(u))

upward.

This eliminates an entire chain in one step.

The process repeats until both nodes belong to the same chain.

Final Segment

Eventually:

chain(u)
=
chain(v)

At this point:

position[u]
...
position[v]

forms a contiguous interval.

The final answer comes from a single Segment Tree query.

The path has been reduced to a small number of interval queries.

Complete Query Structure

The overall pattern:

result := identity

for chainHead[u] != chainHead[v] {

    if depth[chainHead[u]] <
       depth[chainHead[v]] {
        u, v = v, u
    }

    result =
        combine(
            result,
            segmentTree.Query(
                position[chainHead[u]],
                position[u],
            ),
        )

    u = parent[chainHead[u]]
}

Then handle the final chain.

result =
    combine(
        result,
        segmentTree.Query(
            min(position[u], position[v]),
            max(position[u], position[v]),
        ),
    )

The exact combine operation depends on the problem.

Path Maximum Queries

Suppose every node stores a weight.

Query:

Maximum value on path(u,v)

Replace:

combine = max

The Segment Tree stores maximum values.

The HLD logic remains unchanged.

Only the aggregation function changes.

Path Minimum Queries

Similarly:

combine = min

The Segment Tree stores minimum values.

Again, the decomposition itself remains identical.

This flexibility is one reason HLD is so useful.

Path Sum Queries

For sums:

combine = +

The Segment Tree stores interval sums.

This is the most common introductory example.

Many organizational and financial hierarchy problems use this formulation.

Path Updates

Suppose:

Add 5 to every node
on path(u,v)

The path is decomposed into chain intervals.

Each interval becomes:

Range Update

on a Lazy Segment Tree.

The tree traversal disappears.

Only interval updates remain.

This combination is extremely powerful.

Why HLD Works

A useful mental model is:

Tree
→
Few chains
→
Intervals

The decomposition converts arbitrary paths into a small number of contiguous segments.

Segment Trees already know how to process intervals efficiently.

HLD simply provides the mapping.

This separation of concerns makes the technique elegant.

Comparing Euler Tour and HLD

Euler Tour excels at subtree queries.

Heavy-Light Decomposition excels at path queries.

Query Type	Euler Tour	HLD
Subtree Sum	Excellent	Possible
Subtree Max	Excellent	Possible
Path Sum	Difficult	Excellent
Path Max	Difficult	Excellent
Path Update	Difficult	Excellent

Choosing the correct technique depends on the problem.

HLD and LCA

Many HLD implementations use Lowest Common Ancestor preprocessing.

Path:

u → LCA → v

can be handled naturally using decomposition.

Some implementations compute the LCA explicitly.

Others obtain the same effect while climbing chains.

Both approaches are common.

Complexity Analysis

Let:

n = number of nodes

Preprocessing:

Operation	Complexity
Subtree sizes	O(n)
Heavy selection	O(n)
Chain construction	O(n)
Total	O(n)

Queries:

Each chain jump crosses a light edge.

Maximum number of light edges:

O(log n)

Each Segment Tree query:

O(log n)

Total:

Operation	Complexity
Path query	O(log² n)
Path update	O(log² n)

Memory:

Structure	Complexity
HLD arrays	O(n)
Segment Tree	O(n)

Many optimized implementations reduce query complexity further, but O(log² n) is the standard form.

Common Mistakes

A common mistake is selecting the wrong heavy child.

The heavy child must be the child with the largest subtree.

Any other choice destroys the logarithmic guarantee.

Another mistake is assigning new chain heads incorrectly. Only light edges begin new chains.

A third mistake is forgetting to process the final chain after the climbing loop ends. Many implementations correctly process intermediate chains but omit the final interval.

Finally, it is easy to confuse:

position[node]

with:

node id

All Segment Tree operations must use decomposed positions rather than original node identifiers.

Takeaway

Heavy-Light Decomposition transforms path queries into interval queries. By dividing the tree into heavy chains and guaranteeing that any path crosses only a logarithmic number of light edges, HLD reduces complex tree operations to a small collection of Segment Tree or Fenwick Tree operations. It is one of the most important advanced tree techniques and serves as a foundation for efficient path aggregation, path updates, and many large-scale hierarchical query systems.