12.10 Hopcroft-Karp Algorithm
You need to find a maximum matching in a bipartite graph.
12.10 Hopcroft-Karp Algorithm
Problem
You need to find a maximum matching in a bipartite graph.
A generic max-flow algorithm works, but bipartite matching has extra structure. Every augmenting path alternates between unmatched and matched edges. Instead of finding one augmenting path at a time, we can find many shortest augmenting paths in phases.
This is the same performance idea used by Dinic: build layers, then augment through many disjoint paths before recomputing the layers.
Hopcroft-Karp applies that idea directly to bipartite matching.
Solution
Use BFS to build distance layers from all unmatched left-side vertices. Then use DFS to find a maximal set of shortest augmenting paths.
Each phase does two things:
- BFS finds the length of the shortest augmenting paths.
- DFS augments along as many vertex-disjoint shortest augmenting paths as possible.
The algorithm stops when BFS cannot reach any unmatched right-side vertex.
The running time is:
O(E√V)
where V is the total number of vertices and E is the number of bipartite edges.
Matching Model
Let the graph be:
G = (L, R, E)
where:
Lis the left sideRis the right sideEcontains edges fromLtoR
A matching stores two maps:
pairLeft[u] = matched right vertex for u, or NIL
pairRight[v] = matched left vertex for v, or NIL
A vertex is free if its pair is NIL.
Augmenting Paths
An augmenting path starts at a free vertex on the left and ends at a free vertex on the right.
The path alternates:
unmatched edge
matched edge
unmatched edge
matched edge
...
unmatched edge
Example:
u1 -> v1 -> u2 -> v2 -> u3 -> v3
where:
u1is freev3is freeu1 -> v1is unmatchedu2 -> v1is matchedu2 -> v2is unmatchedu3 -> v2is matchedu3 -> v3is unmatched
After flipping the path, the matching size increases by one.
Flipping means:
- unmatched edges become matched
- matched edges become unmatched
Why One Path at a Time Is Slow
A naive augmenting-path algorithm repeatedly finds a single path and flips it.
find path
flip path
find path
flip path
find path
flip path
This can take:
O(VE)
time.
Hopcroft-Karp improves this by processing many shortest augmenting paths in one phase.
BFS layers
DFS many paths
flip many paths
BFS layers
DFS many paths
flip many paths
The number of phases is bounded by O(√V).
BFS Layering
BFS starts from every free left-side vertex.
For each free u ∈ L:
dist[u] = 0
For matched left vertices:
dist[u] = ∞
During BFS, from a left vertex u, inspect each edge:
u -> v
If v is matched to some left vertex u2, continue to u2:
u -> v -> u2
and set:
dist[u2] = dist[u] + 1
If v is free, BFS has found that an augmenting path exists at this layer depth.
DFS Search
After BFS, DFS tries to find augmenting paths that respect the BFS distances.
From left vertex u, for each neighbor v:
- If
vis free, the path can end. - If
vis matched tou2, DFS may continue only if:
dist[u2] = dist[u] + 1
This rule ensures DFS only follows shortest augmenting paths.
When DFS succeeds, update:
pairLeft[u] = v
pairRight[v] = u
Recursive updates flip the entire augmenting path.
Example
Left side:
L = {A, B, C}
Right side:
R = {1, 2, 3}
Edges:
| Left | Right |
|---|---|
| A | 1 |
| A | 2 |
| B | 2 |
| B | 3 |
| C | 1 |
| C | 3 |
Initially all vertices are free.
BFS starts with:
dist[A] = 0
dist[B] = 0
dist[C] = 0
DFS may find these augmenting paths:
A -> 2
B -> 3
C -> 1
After flipping them, the matching is:
| Left | Right |
|---|---|
| A | 2 |
| B | 3 |
| C | 1 |
The matching has size 3, which is perfect.
Less Trivial Example
Suppose the current matching is:
| Left | Right |
|---|---|
| A | 1 |
| B | 2 |
and C is free.
Edges:
| Left | Right |
|---|---|
| A | 1 |
| A | 3 |
| B | 1 |
| B | 2 |
| C | 2 |
Free left vertex:
C
Free right vertex:
3
An augmenting path is:
C -> 2 -> B -> 1 -> A -> 3
Flipping gives:
| Left | Right |
|---|---|
| C | 2 |
| B | 1 |
| A | 3 |
The matching size increases from 2 to 3.
This example shows why augmenting paths must be allowed to rearrange earlier choices.
Pseudocode
HopcroftKarp(G):
matching = empty
while BFS builds layers:
for each free u in L:
if DFS(u):
matching += 1
return matching
BFS computes the shortest augmenting path structure.
DFS extracts as many compatible augmenting paths as possible from that structure.
BFS Pseudocode
BFS():
queue = empty
for each u in L:
if pairLeft[u] == NIL:
dist[u] = 0
push u
else:
dist[u] = INF
found = false
while queue not empty:
u = pop queue
for each v in adj[u]:
u2 = pairRight[v]
if u2 == NIL:
found = true
else if dist[u2] == INF:
dist[u2] = dist[u] + 1
push u2
return found
DFS Pseudocode
DFS(u):
for each v in adj[u]:
u2 = pairRight[v]
if u2 == NIL
or dist[u2] == dist[u] + 1 and DFS(u2):
pairLeft[u] = v
pairRight[v] = u
return true
dist[u] = INF
return false
The line:
dist[u] = INF
is a pruning step. It prevents repeated failed searches from revisiting the same dead vertex during the phase.
Implementation Notes
Use integer IDs for both sides. Store left vertices from 0 to nLeft - 1 and right vertices from 0 to nRight - 1.
Use adjacency lists only from left to right:
adj[u] = list of right vertices reachable from u
You do not need to store reverse adjacency. The matching maps provide the reverse traversal:
right vertex -> matched left vertex
This keeps the implementation compact.
Complexity
Each BFS phase scans all edges:
O(E)
Each DFS phase also scans all edges amortized:
O(E)
The number of phases is:
O(√V)
Therefore:
O(E√V)
This is the standard bound for maximum bipartite matching.
Comparison with Flow
| Method | Complexity | Notes |
|---|---|---|
| Generic Ford-Fulkerson | O(EF) | Simple, depends on matching size |
| Edmonds-Karp flow | O(VE²) | Easy but usually slow |
| Dinic on unit network | O(E√V) | Strong general approach |
| Hopcroft-Karp | O(E√V) | Specialized and compact |
Hopcroft-Karp is usually the preferred direct algorithm when the problem is exactly maximum bipartite matching.
Common Mistakes
Starting BFS from one free vertex
BFS must start from all free left vertices at once. This finds the global shortest augmenting path length.
Allowing DFS to use non-layer edges
DFS must respect the distances computed by BFS. Otherwise the phase may use longer paths and lose the performance guarantee.
Forgetting to flip recursively
When DFS succeeds through a matched vertex, every edge along the recursive path must be updated.
Confusing left and right IDs
Keep separate arrays for left and right sides. Do not store both in one array unless you carefully offset the indices.
Reusing distances across phases
Distances become invalid after augmentation. Recompute BFS layers every phase.
Discussion
Hopcroft-Karp is best understood as Dinic specialized to bipartite matching. The BFS phase identifies shortest augmenting paths. The DFS phase finds a maximal collection of them. After one phase, all augmenting paths of that shortest length are eliminated.
This batching is the key improvement. It avoids paying for a full graph search after every single augmentation and gives the algorithm its O(E√V) bound.
Takeaway
Hopcroft-Karp computes maximum bipartite matching by repeatedly finding many shortest augmenting paths at once. It uses BFS to build layers and DFS to flip a maximal set of compatible paths.
For pure bipartite matching, it is usually simpler than building a full flow network and faster than generic max-flow algorithms.