11.5 Prim's Algorithm
Kruskal's algorithm builds a minimum spanning tree by selecting edges globally.
11.5 Prim's Algorithm
Problem
Kruskal's algorithm builds a minimum spanning tree by selecting edges globally. It sorts every edge in the graph and repeatedly connects separate components.
There is another approach.
Instead of constructing many small trees and merging them, we can start with a single vertex and grow one tree until it spans the entire graph.
This idea leads to Prim's algorithm.
Like Kruskal, Prim produces a minimum spanning tree. Unlike Kruskal, it expands outward from an existing tree and always chooses the cheapest edge leaving that tree.
Solution
Prim's algorithm maintains:
- A set of vertices already in the tree.
- A set of vertices not yet in the tree.
- The cheapest edge crossing between the two sets.
At each step:
- Select the minimum-weight edge connecting the current tree to an outside vertex.
- Add that edge to the tree.
- Add the new vertex.
- Repeat until all vertices belong to the tree.
The cut property guarantees that each selected edge is safe.
Example
Consider:
2
A -------- B
| |
| |
4 3
| |
| |
C -------- D
1
Start from:
A
Step 1
Current tree:
A
Crossing edges:
(A,B)=2
(A,C)=4
Choose:
(A,B)=2
Tree:
A ----- B
Weight:
$$ 2 $$
Step 2
Visited:
A,B
Crossing edges:
(A,C)=4
(B,D)=3
Choose:
(B,D)=3
Tree:
A ----- B
|
|
D
Weight:
$$ 2+3=5 $$
Step 3
Visited:
A,B,D
Crossing edges:
(A,C)=4
(D,C)=1
Choose:
(D,C)=1
Final tree:
A ----- B
|
|
D
|
|
C
Total weight:
$$ 2+3+1=6 $$
All vertices are connected.
The algorithm terminates.
Growing a Single Component
A useful way to view Prim is:
One tree
that continuously expands.
Initially:
A
After several iterations:
A ---- B ---- D
|
|
C
Unlike Kruskal, multiple components never exist.
There is always exactly one connected component.
Why It Works
Suppose the current tree contains:
Visited vertices
and the remaining graph contains:
Unvisited vertices
These two groups define a cut.
Every candidate edge crosses that cut.
Prim selects the smallest crossing edge.
By the cut property:
The minimum crossing edge belongs to some minimum spanning tree.
Therefore every selected edge is safe.
Repeating this process constructs a complete minimum spanning tree.
Naïve Implementation
A direct implementation searches all crossing edges during every iteration.
Pseudocode
pick arbitrary start
while tree incomplete:
find minimum edge
leaving the tree
add edge
add vertex
Complexity
There are:
$$ V-1 $$
iterations.
Each iteration may inspect:
$$ E $$
edges.
Total:
$$ O(VE) $$
This is usually too slow.
Priority Queue Optimization
A better implementation uses a min-priority queue.
Store:
(weight, vertex)
for candidate edges.
Whenever a new vertex enters the tree:
- Examine its neighbors.
- Update best known connection costs.
- Push candidates into the priority queue.
The queue automatically provides the next cheapest edge.
Efficient Pseudocode
Prim(start):
push start with cost 0
while heap not empty:
vertex = extract_min()
if already visited:
continue
mark visited
for each neighbor:
if better edge found:
update heap
This version is used in most practical implementations.
Detailed Example
Graph:
A-B = 4
A-C = 2
B-C = 1
B-D = 5
C-D = 8
C-E = 10
D-E = 2
Start:
A
Visit A
Candidates:
B = 4
C = 2
Choose:
C = 2
Visit C
Candidates:
B = 1
D = 8
E = 10
Choose:
B = 1
Visit B
Candidates:
D = 5
Choose:
D = 5
Visit D
Candidates:
E = 2
Choose:
E = 2
Result:
A-C
C-B
B-D
D-E
Weight:
$$ 2+1+5+2=10 $$
Adjacency Matrix Version
For dense graphs, a matrix implementation can be effective.
Store:
best[v]
representing the cheapest edge connecting each vertex to the tree.
At each step:
- Scan all vertices.
- Select smallest unvisited value.
- Update neighbors.
Complexity:
$$ O(V^2) $$
No heap required.
Binary Heap Version
Using adjacency lists and a binary heap:
Complexity
Each edge may generate a heap operation.
Heap operations cost:
$$ O(\log V) $$
Total:
$$ O(E \log V) $$
This is the standard complexity quoted for Prim's algorithm.
Fibonacci Heap Version
A theoretical improvement uses Fibonacci heaps.
Complexity:
$$ O(E + V \log V) $$
Although asymptotically better, Fibonacci heaps have large constants and are rarely used in practice.
Comparison with Kruskal
| Property | Prim | Kruskal |
|---|---|---|
| Strategy | Grow one tree | Merge components |
| Main structure | Priority queue | Union-find |
| Edge ordering | Local | Global |
| Sparse graphs | Excellent | Excellent |
| Dense graphs | Often better | Usually slower |
| Natural for adjacency lists | Yes | Less so |
Both algorithms produce identical optimal costs.
The difference lies in how they discover the solution.
Sparse Graph Example
Suppose:
$$ E = O(V) $$
Then:
$$ O(E \log V) = O(V \log V) $$
Very efficient.
Dense Graph Example
Suppose:
$$ E = O(V^2) $$
Heap version:
$$ O(V^2 \log V) $$
Matrix version:
$$ O(V^2) $$
For dense graphs, the matrix implementation can outperform heap-based approaches.
Common Mistakes
Forgetting the Visited Check
The same vertex may appear multiple times in the heap.
Always verify whether it has already been added to the tree.
Updating Incorrect Keys
When a cheaper connection is found:
best[v]
must be updated.
Failure produces incorrect results.
Using Directed Graphs
Prim assumes an undirected graph.
Confusing Edge Weight with Path Cost
Prim chooses the cheapest connecting edge.
It does not compute shortest paths.
This distinction is critical.
Prim vs Dijkstra
The algorithms appear similar:
Priority queue
Visited set
Repeated extraction
However their objectives differ.
| Prim | Dijkstra |
|---|---|
| Builds MST | Builds shortest-path tree |
| Optimizes total tree cost | Optimizes source distances |
| Uses edge weight | Uses accumulated distance |
Many implementation errors arise from mixing these ideas.
Real-World Applications
Communication Networks
Building minimum-cost infrastructure.
Power Grids
Designing electrical distribution systems.
Water Distribution
Connecting facilities at minimal construction cost.
Clustering
MST-based clustering frequently uses Prim internally.
Geographic Information Systems
Constructing efficient regional networks.
Correctness Summary
Prim succeeds because:
- The current tree and remaining vertices form a cut.
- The selected edge is the cheapest crossing edge.
- The cut property guarantees safety.
- Every step preserves optimality.
- After visiting all vertices, the result is a spanning tree.
- The resulting spanning tree has minimum total weight.
Recipe
When using Prim's algorithm:
- Pick any start vertex.
- Maintain a set of visited vertices.
- Track the cheapest edge leaving the current tree.
- Use a priority queue for efficiency.
- Add one vertex at a time.
- Repeat until all vertices are included.
- Sum selected edge weights to obtain the minimum spanning tree cost.
Prim and Kruskal are the two fundamental minimum spanning tree algorithms. Together they solve most practical MST problems. The next section introduces Borůvka's algorithm, a historically important approach that repeatedly grows all components in parallel and forms the basis of several modern parallel MST algorithms.