8.17 Heavy-Light Decomposition
Many tree problems ask questions about paths.
8.17 Heavy-Light Decomposition
Many tree problems ask questions about paths.
Examples include:
- Sum of values along a path
- Maximum edge weight between two nodes
- Minimum value between two vertices
- Path updates
- Network latency queries
- Organizational hierarchy analysis
A subtree query is relatively easy after an Euler Tour because every subtree becomes a contiguous interval.
Path queries are more difficult.
Consider:
0
├── 1
│ ├── 3
│ └── 4
└── 2
├── 5
└── 6
The path:
4 → 1 → 0 → 2 → 5
is not a contiguous interval in DFS order.
A simple flattening is no longer sufficient.
Heavy-Light Decomposition (HLD) solves this problem by decomposing the tree into a small number of chains. Once the decomposition is built, path queries become a sequence of interval queries that can be handled efficiently by Segment Trees or Fenwick Trees.
Heavy-Light Decomposition is one of the most important advanced tree techniques because it transforms difficult path problems into familiar range-query problems.
Problem
You need efficient path queries or path updates on a tree.
Examples:
Sum(u,v)
Max(u,v)
Min(u,v)
Update(u,v)
Repeated traversal of the path is too expensive.
Solution
Divide the tree into heavy and light edges.
For every node, select the child with the largest subtree.
That edge becomes:
Heavy
All remaining child edges become:
Light
Consider:
0
/ \\n 1 2
/ \ / \\n 3 4 5 6
Subtree sizes:
0 = 7
1 = 3
2 = 3
3 = 1
4 = 1
5 = 1
6 = 1
One possible decomposition:
0 === 1 === 3
2 === 5
4
6
Heavy edges are shown with:
===
These edges form chains.
The remarkable property is that any root-to-node path crosses only a logarithmic number of light edges.
This makes efficient path processing possible.
Discussion
The entire technique rests on one observation.
Suppose:
child size ≤ parent size / 2
for every light edge.
Crossing a light edge therefore reduces the remaining subtree size by at least half.
Example:
1000
→ 500
→ 250
→ 125
→ 62
...
After only:
O(log n)
light-edge crossings, the subtree size reaches one.
This implies:
Any path contains at most O(log n) light edges.
That property drives the complexity analysis of HLD.
First DFS: Subtree Sizes
The decomposition begins by computing subtree sizes.
func dfsSize(
tree [][]int,
node int,
parent int,
size []int,
heavy []int,
) {
size[node] = 1
heavy[node] = -1
bestSize := 0
for _, child := range tree[node] {
if child == parent {
continue
}
dfsSize(
tree,
child,
node,
size,
heavy,
)
size[node] += size[child]
if size[child] > bestSize {
bestSize = size[child]
heavy[node] = child
}
}
}
After this DFS:
heavy[node]
stores the heavy child.
Second DFS: Build Chains
Assign positions in the decomposed structure.
func dfsDecompose(
tree [][]int,
node int,
parent int,
head int,
heavy []int,
chainHead []int,
position []int,
) {
chainHead[node] = head
position[node] = currentPos
currentPos++
if heavy[node] != -1 {
dfsDecompose(
tree,
heavy[node],
node,
head,
heavy,
chainHead,
position,
)
}
for _, child := range tree[node] {
if child == parent ||
child == heavy[node] {
continue
}
dfsDecompose(
tree,
child,
node,
child,
heavy,
chainHead,
position,
)
}
}
Heavy children remain in the same chain.
Light children start new chains.
Flattened Representation
After decomposition:
| Node | Position |
|---|---|
| 0 | 0 |
| 1 | 1 |
| 3 | 2 |
| 4 | 3 |
| 2 | 4 |
| 5 | 5 |
| 6 | 6 |
Each heavy chain becomes a contiguous interval.
This allows a Segment Tree to be built over the positions.
The tree is now represented as an array.
Path Queries
Suppose we need:
Sum(4,5)
Path:
4 → 1 → 0 → 2 → 5
The nodes belong to different chains.
Instead of traversing the path directly:
- Move the deeper chain upward.
- Query that chain segment.
- Repeat until both nodes share a chain.
Implementation outline:
func queryPath(
u int,
v int,
) int
The algorithm repeatedly climbs chain heads.
Each step removes an entire chain segment.
Climbing Chains
Suppose:
u chain head = 4
v chain head = 0
If:
depth(head(u))
>
depth(head(v))
query:
head(u) → u
and move:
u = parent(head(u))
upward.
This eliminates an entire chain in one step.
The process repeats until both nodes belong to the same chain.
Final Segment
Eventually:
chain(u)
=
chain(v)
At this point:
position[u]
...
position[v]
forms a contiguous interval.
The final answer comes from a single Segment Tree query.
The path has been reduced to a small number of interval queries.
Complete Query Structure
The overall pattern:
result := identity
for chainHead[u] != chainHead[v] {
if depth[chainHead[u]] <
depth[chainHead[v]] {
u, v = v, u
}
result =
combine(
result,
segmentTree.Query(
position[chainHead[u]],
position[u],
),
)
u = parent[chainHead[u]]
}
Then handle the final chain.
result =
combine(
result,
segmentTree.Query(
min(position[u], position[v]),
max(position[u], position[v]),
),
)
The exact combine operation depends on the problem.
Path Maximum Queries
Suppose every node stores a weight.
Query:
Maximum value on path(u,v)
Replace:
combine = max
The Segment Tree stores maximum values.
The HLD logic remains unchanged.
Only the aggregation function changes.
Path Minimum Queries
Similarly:
combine = min
The Segment Tree stores minimum values.
Again, the decomposition itself remains identical.
This flexibility is one reason HLD is so useful.
Path Sum Queries
For sums:
combine = +
The Segment Tree stores interval sums.
This is the most common introductory example.
Many organizational and financial hierarchy problems use this formulation.
Path Updates
Suppose:
Add 5 to every node
on path(u,v)
The path is decomposed into chain intervals.
Each interval becomes:
Range Update
on a Lazy Segment Tree.
The tree traversal disappears.
Only interval updates remain.
This combination is extremely powerful.
Why HLD Works
A useful mental model is:
Tree
→
Few chains
→
Intervals
The decomposition converts arbitrary paths into a small number of contiguous segments.
Segment Trees already know how to process intervals efficiently.
HLD simply provides the mapping.
This separation of concerns makes the technique elegant.
Comparing Euler Tour and HLD
Euler Tour excels at subtree queries.
Heavy-Light Decomposition excels at path queries.
| Query Type | Euler Tour | HLD |
|---|---|---|
| Subtree Sum | Excellent | Possible |
| Subtree Max | Excellent | Possible |
| Path Sum | Difficult | Excellent |
| Path Max | Difficult | Excellent |
| Path Update | Difficult | Excellent |
Choosing the correct technique depends on the problem.
HLD and LCA
Many HLD implementations use Lowest Common Ancestor preprocessing.
Path:
u → LCA → v
can be handled naturally using decomposition.
Some implementations compute the LCA explicitly.
Others obtain the same effect while climbing chains.
Both approaches are common.
Complexity Analysis
Let:
n = number of nodes
Preprocessing:
| Operation | Complexity |
|---|---|
| Subtree sizes | O(n) |
| Heavy selection | O(n) |
| Chain construction | O(n) |
| Total | O(n) |
Queries:
Each chain jump crosses a light edge.
Maximum number of light edges:
O(log n)
Each Segment Tree query:
O(log n)
Total:
| Operation | Complexity |
|---|---|
| Path query | O(log² n) |
| Path update | O(log² n) |
Memory:
| Structure | Complexity |
|---|---|
| HLD arrays | O(n) |
| Segment Tree | O(n) |
Many optimized implementations reduce query complexity further, but O(log² n) is the standard form.
Common Mistakes
A common mistake is selecting the wrong heavy child.
The heavy child must be the child with the largest subtree.
Any other choice destroys the logarithmic guarantee.
Another mistake is assigning new chain heads incorrectly. Only light edges begin new chains.
A third mistake is forgetting to process the final chain after the climbing loop ends. Many implementations correctly process intermediate chains but omit the final interval.
Finally, it is easy to confuse:
position[node]
with:
node id
All Segment Tree operations must use decomposed positions rather than original node identifiers.
Takeaway
Heavy-Light Decomposition transforms path queries into interval queries. By dividing the tree into heavy chains and guaranteeing that any path crosses only a logarithmic number of light edges, HLD reduces complex tree operations to a small collection of Segment Tree or Fenwick Tree operations. It is one of the most important advanced tree techniques and serves as a foundation for efficient path aggregation, path updates, and many large-scale hierarchical query systems.