11.1 Spanning Trees

Problem

You are given a connected undirected graph. You need a subset of edges that keeps every vertex reachable from every other vertex while using the smallest possible number of edges.

This problem appears in communication networks, road systems, electrical distribution, dependency management, clustering, and many optimization problems. Before searching for the minimum-cost solution, we must understand the structure called a spanning tree.

Solution

A spanning tree of a connected undirected graph is a subgraph that:

Contains every vertex of the original graph.
Is connected.
Contains no cycles.

If a graph contains (n) vertices, every spanning tree contains exactly (n - 1) edges.

Consider the graph:

A ----- B
| \     |
|  \    |
|   \   |
C ----- D

The graph contains five edges:

(A,B)
(A,C)
(A,D)
(B,D)
(C,D)

One possible spanning tree is:

A ----- B

|
|
C ----- D

Using edges:

(A,B)
(A,C)
(C,D)

All vertices remain connected.

No cycle exists.

The graph contains four vertices and three edges.

Since:

$$ 4 - 1 = 3 $$

this satisfies the spanning tree property.

Understanding the Structure

A spanning tree removes redundancy while preserving reachability.

The original graph may contain many alternative paths:

A -> B -> D
A -> D
A -> C -> D

A spanning tree keeps only enough edges to maintain connectivity.

Every remaining edge becomes essential.

Removing any edge disconnects the graph.

Adding any new edge creates a cycle.

These observations lead to two important characterizations.

A graph is a tree if and only if:

It is connected and acyclic.

A graph is a tree if and only if:

It is connected and contains exactly (n - 1) edges.

Both definitions are used constantly in graph proofs.

Why Exactly n - 1 Edges?

This property is one of the most important facts in graph theory.

Start with one vertex:

Edges required:

Add another vertex:

A ----- B

Edges required:

Add a third vertex:

A ----- B
|
|
C

Edges required:

Every newly added vertex requires exactly one edge to connect it to the existing structure.

After adding all vertices:

$$ \text{edges} = n - 1 $$

Any fewer edges disconnect the graph.

Any more edges create a cycle.

This observation allows immediate validation of many graph algorithms.

Detecting Whether a Graph Is a Tree

A connected graph is a tree when:

$$ |E| = |V| - 1 $$

and the graph is connected.

A common algorithm is:

Run DFS or BFS.
Count visited vertices.
Count edges.

If:

visited_vertices = n

and:

edges = n - 1

the graph is a tree.

Example

Vertices:

0 1 2 3 4

Edges:

0-1
0-2
2-3
2-4

Count:

Vertices = 5
Edges = 4

Since:

$$ 5 - 1 = 4 $$

and all vertices are reachable, the graph is a tree.

Building a Spanning Tree with DFS

One simple method uses depth-first search.

Whenever DFS discovers a new vertex, add the traversed edge to the tree.

Pseudocode

DFS(v):
    mark v visited

    for each neighbor u:
        if u not visited:
            add edge (v,u) to tree
            DFS(u)

Every discovered edge becomes a tree edge.

Back edges are ignored.

The resulting set of edges forms a spanning tree.

Example

Graph:

0 -- 1
|  / |
| /  |
2 -- 3

DFS from 0:

0 -> 1
1 -> 3
3 -> 2

Tree edges:

(0,1)
(1,3)
(3,2)

The cycle edges are discarded automatically.

Building a Spanning Tree with BFS

Breadth-first search can also generate a spanning tree.

Whenever BFS first discovers a vertex, store the edge that discovered it.

Pseudocode

BFS(start):
    push start

    while queue not empty:
        v = pop

        for each neighbor u:
            if u not visited:
                add edge (v,u)
                push u

The resulting structure is called a BFS spanning tree.

Unlike DFS trees, BFS trees preserve shortest-path levels from the source.

Applications

Network Design

A spanning tree removes redundant links while maintaining connectivity.

Broadcast Systems

Many communication protocols build spanning trees to avoid packet loops.

Clustering

Hierarchical clustering algorithms often start from minimum spanning trees.

Graph Compression

A spanning tree provides a compact backbone representation of a graph.

Routing

Many routing protocols maintain spanning-tree-like structures for efficient forwarding.

Complexity

Using DFS:

Operation	Complexity
Build spanning tree	O(V + E)
Memory	O(V)

Using BFS:

Operation	Complexity
Build spanning tree	O(V + E)
Memory	O(V)

Both algorithms visit each vertex and edge at most once.

Common Mistakes

Assuming Every Connected Graph Has One Spanning Tree

A connected graph can have many spanning trees.

Different DFS orders may produce different trees.

Forgetting Connectivity

A disconnected graph has no spanning tree.

Instead, it has a spanning forest.

Confusing Trees with Spanning Trees

A tree is a graph.

A spanning tree is a tree derived from another graph.

Counting Edges Incorrectly

For undirected graphs:

adjacency list size / 2

must be used when counting edges.

Recipe

When facing a connectivity problem:

Check whether the graph is connected.
Remember that every spanning tree contains exactly (n - 1) edges.
Use DFS or BFS to construct a spanning tree.
Remove cycle edges automatically.
Preserve only the edges required for connectivity.

This simple structure becomes the foundation for minimum spanning trees, Kruskal's algorithm, Prim's algorithm, network design, clustering, and many advanced graph optimization techniques explored throughout the rest of this chapter.