11.1 Spanning Trees
You are given a connected undirected graph.
11.1 Spanning Trees
Problem
You are given a connected undirected graph. You need a subset of edges that keeps every vertex reachable from every other vertex while using the smallest possible number of edges.
This problem appears in communication networks, road systems, electrical distribution, dependency management, clustering, and many optimization problems. Before searching for the minimum-cost solution, we must understand the structure called a spanning tree.
Solution
A spanning tree of a connected undirected graph is a subgraph that:
- Contains every vertex of the original graph.
- Is connected.
- Contains no cycles.
If a graph contains (n) vertices, every spanning tree contains exactly (n - 1) edges.
Consider the graph:
A ----- B
| \ |
| \ |
| \ |
C ----- D
The graph contains five edges:
(A,B)
(A,C)
(A,D)
(B,D)
(C,D)
One possible spanning tree is:
A ----- B
|
|
C ----- D
Using edges:
(A,B)
(A,C)
(C,D)
All vertices remain connected.
No cycle exists.
The graph contains four vertices and three edges.
Since:
$$ 4 - 1 = 3 $$
this satisfies the spanning tree property.
Understanding the Structure
A spanning tree removes redundancy while preserving reachability.
The original graph may contain many alternative paths:
A -> B -> D
A -> D
A -> C -> D
A spanning tree keeps only enough edges to maintain connectivity.
Every remaining edge becomes essential.
Removing any edge disconnects the graph.
Adding any new edge creates a cycle.
These observations lead to two important characterizations.
A graph is a tree if and only if:
- It is connected and acyclic.
A graph is a tree if and only if:
- It is connected and contains exactly (n - 1) edges.
Both definitions are used constantly in graph proofs.
Why Exactly n - 1 Edges?
This property is one of the most important facts in graph theory.
Start with one vertex:
A
Edges required:
0
Add another vertex:
A ----- B
Edges required:
1
Add a third vertex:
A ----- B
|
|
C
Edges required:
2
Every newly added vertex requires exactly one edge to connect it to the existing structure.
After adding all vertices:
$$ \text{edges} = n - 1 $$
Any fewer edges disconnect the graph.
Any more edges create a cycle.
This observation allows immediate validation of many graph algorithms.
Detecting Whether a Graph Is a Tree
A connected graph is a tree when:
$$ |E| = |V| - 1 $$
and the graph is connected.
A common algorithm is:
- Run DFS or BFS.
- Count visited vertices.
- Count edges.
If:
visited_vertices = n
and:
edges = n - 1
the graph is a tree.
Example
Vertices:
0 1 2 3 4
Edges:
0-1
0-2
2-3
2-4
Count:
Vertices = 5
Edges = 4
Since:
$$ 5 - 1 = 4 $$
and all vertices are reachable, the graph is a tree.
Building a Spanning Tree with DFS
One simple method uses depth-first search.
Whenever DFS discovers a new vertex, add the traversed edge to the tree.
Pseudocode
DFS(v):
mark v visited
for each neighbor u:
if u not visited:
add edge (v,u) to tree
DFS(u)
Every discovered edge becomes a tree edge.
Back edges are ignored.
The resulting set of edges forms a spanning tree.
Example
Graph:
0 -- 1
| / |
| / |
2 -- 3
DFS from 0:
0 -> 1
1 -> 3
3 -> 2
Tree edges:
(0,1)
(1,3)
(3,2)
The cycle edges are discarded automatically.
Building a Spanning Tree with BFS
Breadth-first search can also generate a spanning tree.
Whenever BFS first discovers a vertex, store the edge that discovered it.
Pseudocode
BFS(start):
push start
while queue not empty:
v = pop
for each neighbor u:
if u not visited:
add edge (v,u)
push u
The resulting structure is called a BFS spanning tree.
Unlike DFS trees, BFS trees preserve shortest-path levels from the source.
Applications
Network Design
A spanning tree removes redundant links while maintaining connectivity.
Broadcast Systems
Many communication protocols build spanning trees to avoid packet loops.
Clustering
Hierarchical clustering algorithms often start from minimum spanning trees.
Graph Compression
A spanning tree provides a compact backbone representation of a graph.
Routing
Many routing protocols maintain spanning-tree-like structures for efficient forwarding.
Complexity
Using DFS:
| Operation | Complexity |
|---|---|
| Build spanning tree | O(V + E) |
| Memory | O(V) |
Using BFS:
| Operation | Complexity |
|---|---|
| Build spanning tree | O(V + E) |
| Memory | O(V) |
Both algorithms visit each vertex and edge at most once.
Common Mistakes
Assuming Every Connected Graph Has One Spanning Tree
A connected graph can have many spanning trees.
Different DFS orders may produce different trees.
Forgetting Connectivity
A disconnected graph has no spanning tree.
Instead, it has a spanning forest.
Confusing Trees with Spanning Trees
A tree is a graph.
A spanning tree is a tree derived from another graph.
Counting Edges Incorrectly
For undirected graphs:
adjacency list size / 2
must be used when counting edges.
Recipe
When facing a connectivity problem:
- Check whether the graph is connected.
- Remember that every spanning tree contains exactly (n - 1) edges.
- Use DFS or BFS to construct a spanning tree.
- Remove cycle edges automatically.
- Preserve only the edges required for connectivity.
This simple structure becomes the foundation for minimum spanning trees, Kruskal's algorithm, Prim's algorithm, network design, clustering, and many advanced graph optimization techniques explored throughout the rest of this chapter.