12.10 Hopcroft-Karp Algorithm

Problem

You need to find a maximum matching in a bipartite graph.

A generic max-flow algorithm works, but bipartite matching has extra structure. Every augmenting path alternates between unmatched and matched edges. Instead of finding one augmenting path at a time, we can find many shortest augmenting paths in phases.

This is the same performance idea used by Dinic: build layers, then augment through many disjoint paths before recomputing the layers.

Hopcroft-Karp applies that idea directly to bipartite matching.

Solution

Use BFS to build distance layers from all unmatched left-side vertices. Then use DFS to find a maximal set of shortest augmenting paths.

Each phase does two things:

BFS finds the length of the shortest augmenting paths.
DFS augments along as many vertex-disjoint shortest augmenting paths as possible.

The algorithm stops when BFS cannot reach any unmatched right-side vertex.

The running time is:

O(E√V)

where V is the total number of vertices and E is the number of bipartite edges.

Matching Model

Let the graph be:

G = (L, R, E)

where:

L is the left side
R is the right side
E contains edges from L to R

A matching stores two maps:

pairLeft[u]  = matched right vertex for u, or NIL
pairRight[v] = matched left vertex for v, or NIL

A vertex is free if its pair is NIL.

Augmenting Paths

An augmenting path starts at a free vertex on the left and ends at a free vertex on the right.

The path alternates:

unmatched edge
matched edge
unmatched edge
matched edge
...
unmatched edge

Example:

u1 -> v1 -> u2 -> v2 -> u3 -> v3

where:

u1 is free
v3 is free
u1 -> v1 is unmatched
u2 -> v1 is matched
u2 -> v2 is unmatched
u3 -> v2 is matched
u3 -> v3 is unmatched

After flipping the path, the matching size increases by one.

Flipping means:

unmatched edges become matched
matched edges become unmatched

Why One Path at a Time Is Slow

A naive augmenting-path algorithm repeatedly finds a single path and flips it.

find path
flip path

find path
flip path

find path
flip path

This can take:

O(VE)

time.

Hopcroft-Karp improves this by processing many shortest augmenting paths in one phase.

BFS layers
DFS many paths
flip many paths

BFS layers
DFS many paths
flip many paths

The number of phases is bounded by O(√V).

BFS Layering

BFS starts from every free left-side vertex.

For each free u ∈ L:

dist[u] = 0

For matched left vertices:

dist[u] = ∞

During BFS, from a left vertex u, inspect each edge:

u -> v

If v is matched to some left vertex u2, continue to u2:

u -> v -> u2

and set:

dist[u2] = dist[u] + 1

If v is free, BFS has found that an augmenting path exists at this layer depth.

DFS Search

After BFS, DFS tries to find augmenting paths that respect the BFS distances.

From left vertex u, for each neighbor v:

If v is free, the path can end.
If v is matched to u2, DFS may continue only if:

dist[u2] = dist[u] + 1

This rule ensures DFS only follows shortest augmenting paths.

When DFS succeeds, update:

pairLeft[u] = v
pairRight[v] = u

Recursive updates flip the entire augmenting path.

Example

Left side:

L = {A, B, C}

Right side:

R = {1, 2, 3}

Edges:

Left	Right
A	1
A	2
B	2
B	3
C	1
C	3

Initially all vertices are free.

BFS starts with:

dist[A] = 0
dist[B] = 0
dist[C] = 0

DFS may find these augmenting paths:

A -> 2
B -> 3
C -> 1

After flipping them, the matching is:

Left	Right
A	2
B	3
C	1

The matching has size 3, which is perfect.

Less Trivial Example

Suppose the current matching is:

Left	Right
A	1
B	2

and C is free.

Edges:

Left	Right
A	1
A	3
B	1
B	2
C	2

Free left vertex:

Free right vertex:

An augmenting path is:

C -> 2 -> B -> 1 -> A -> 3

Flipping gives:

Left	Right
C	2
B	1
A	3

The matching size increases from 2 to 3.

This example shows why augmenting paths must be allowed to rearrange earlier choices.

Pseudocode

HopcroftKarp(G):

    matching = empty

    while BFS builds layers:

        for each free u in L:

            if DFS(u):
                matching += 1

    return matching

BFS computes the shortest augmenting path structure.

DFS extracts as many compatible augmenting paths as possible from that structure.

BFS Pseudocode

BFS():

    queue = empty

    for each u in L:
        if pairLeft[u] == NIL:
            dist[u] = 0
            push u
        else:
            dist[u] = INF

    found = false

    while queue not empty:

        u = pop queue

        for each v in adj[u]:

            u2 = pairRight[v]

            if u2 == NIL:
                found = true

            else if dist[u2] == INF:
                dist[u2] = dist[u] + 1
                push u2

    return found

DFS Pseudocode

DFS(u):

    for each v in adj[u]:

        u2 = pairRight[v]

        if u2 == NIL
           or dist[u2] == dist[u] + 1 and DFS(u2):

            pairLeft[u] = v
            pairRight[v] = u
            return true

    dist[u] = INF
    return false

The line:

dist[u] = INF

is a pruning step. It prevents repeated failed searches from revisiting the same dead vertex during the phase.

Implementation Notes

Use integer IDs for both sides. Store left vertices from 0 to nLeft - 1 and right vertices from 0 to nRight - 1.

Use adjacency lists only from left to right:

adj[u] = list of right vertices reachable from u

You do not need to store reverse adjacency. The matching maps provide the reverse traversal:

right vertex -> matched left vertex

This keeps the implementation compact.

Complexity

Each BFS phase scans all edges:

O(E)

Each DFS phase also scans all edges amortized:

O(E)

The number of phases is:

O(√V)

Therefore:

O(E√V)

This is the standard bound for maximum bipartite matching.

Comparison with Flow

Method	Complexity	Notes
Generic Ford-Fulkerson	O(EF)	Simple, depends on matching size
Edmonds-Karp flow	O(VE²)	Easy but usually slow
Dinic on unit network	O(E√V)	Strong general approach
Hopcroft-Karp	O(E√V)	Specialized and compact

Hopcroft-Karp is usually the preferred direct algorithm when the problem is exactly maximum bipartite matching.

Common Mistakes

Starting BFS from one free vertex

BFS must start from all free left vertices at once. This finds the global shortest augmenting path length.

Allowing DFS to use non-layer edges

DFS must respect the distances computed by BFS. Otherwise the phase may use longer paths and lose the performance guarantee.

Forgetting to flip recursively

When DFS succeeds through a matched vertex, every edge along the recursive path must be updated.

Confusing left and right IDs

Keep separate arrays for left and right sides. Do not store both in one array unless you carefully offset the indices.

Reusing distances across phases

Distances become invalid after augmentation. Recompute BFS layers every phase.

Discussion

Hopcroft-Karp is best understood as Dinic specialized to bipartite matching. The BFS phase identifies shortest augmenting paths. The DFS phase finds a maximal collection of them. After one phase, all augmenting paths of that shortest length are eliminated.

This batching is the key improvement. It avoids paying for a full graph search after every single augmentation and gives the algorithm its O(E√V) bound.

Takeaway

Hopcroft-Karp computes maximum bipartite matching by repeatedly finding many shortest augmenting paths at once. It uses BFS to build layers and DFS to flip a maximal set of compatible paths.

For pure bipartite matching, it is usually simpler than building a full flow network and faster than generic max-flow algorithms.