Chapter 26. Trees

A tree is a connected graph with no cycles.

This definition is short, but it describes one of the most important graph classes. Trees appear in pure graph theory, algorithms, data structures, networks, logic, chemistry, parsing, file systems, and hierarchical models. They are useful because they combine two strong properties: every vertex is reachable from every other vertex, and there is no circular redundancy.

Equivalently, a tree is an undirected graph in which every two distinct vertices are joined by exactly one simple path. This unique-path property is one of the main reasons trees are easy to reason about. General graph paths may branch, repeat, and form cycles. In a tree, there is only one route between two vertices.

26.1 Definition

Let

G = (V,E)

be an undirected graph, where $V$ is the set of vertices and $E$ is the set of edges.

The graph $G$ is called a tree if it satisfies both conditions:

$G$ is connected.
$G$ contains no cycle.

A graph with no cycles is called acyclic. Thus a tree is a connected acyclic graph.

A graph that is acyclic but not necessarily connected is called a forest. Each connected component of a forest is a tree.

26.2 Examples

The following graph is a tree:

V = \{1,2,3,4,5\}

E = \{\{1,2\}, \{1,3\}, \{3,4\}, \{3,5\}\}.

It is connected, since every vertex can be reached from every other vertex. It has no cycle, since no path returns to its starting vertex without repeating an edge or vertex.

The following graph is not a tree:

V = \{1,2,3\}

E = \{\{1,2\}, \{2,3\}, \{3,1\}\}.

It is connected, but it contains a cycle:

1,2,3,1.

The following graph is also not a tree:

V = \{1,2,3,4\}

E = \{\{1,2\}, \{3,4\}\}.

It has no cycle, but it is disconnected. It is a forest with two components.

26.3 Basic Terminology

A vertex of degree $1$ in a tree is called a leaf. A vertex of degree greater than $1$ is often called an internal vertex, especially when the tree is viewed as a hierarchical structure.

A path graph is a tree in which the vertices can be arranged in a single line.

A star graph is a tree with one central vertex adjacent to every other vertex. If it has $n$ vertices, then the central vertex has degree $n-1$ , and all other vertices are leaves.

A subtree is a subgraph that is itself a tree.

26.4 The Unique Path Property

One of the fundamental characterizations of a tree is the following theorem.

Theorem 26.1

Let $G$ be an undirected graph. Then $G$ is a tree if and only if every two distinct vertices of $G$ are joined by exactly one simple path.

Proof

Suppose first that $G$ is a tree. Since $G$ is connected, every two vertices are joined by at least one path.

It remains to show that there cannot be two different simple paths between the same two vertices. Suppose there were two different simple paths from $u$ to $v$ . Follow the first path from $u$ until it first differs from the second path. Since both paths eventually reach $v$ , the two paths must later meet again. The two different portions between the first separation point and the first meeting point form a cycle. This contradicts the assumption that $G$ is acyclic.

Therefore the path between any two vertices is unique.

Conversely, suppose every two distinct vertices of $G$ are joined by exactly one simple path. Then $G$ is connected by definition. If $G$ contained a cycle, then any two vertices on that cycle would be joined by two different simple paths along the two directions around the cycle. This contradicts uniqueness.

Hence $G$ is connected and acyclic. Therefore $G$ is a tree.

26.5 Edges in a Finite Tree

A finite tree has one fewer edge than vertex.

Theorem 26.2

If $T$ is a tree with $n$ vertices, then

|E(T)| = n - 1.

Proof

We prove the result by induction on $n$ .

For $n = 1$ , the tree has one vertex and no edges. Thus

|E(T)| = 0 = 1 - 1.

Assume the statement holds for every tree with $n-1$ vertices. Let $T$ be a tree with $n$ vertices, where $n \geq 2$ .

Every finite tree with at least two vertices has a leaf. Let $v$ be a leaf, and let $e$ be the unique edge incident with $v$ . Remove $v$ and $e$ . The remaining graph is still connected and acyclic, so it is a tree with $n-1$ vertices.

By the induction hypothesis, the remaining tree has

(n-1)-1 = n-2

edges. Adding back the removed edge gives

n-2+1 = n-1

edges.

Thus every tree with $n$ vertices has $n-1$ edges.

26.6 Leaves

Leaves are unavoidable in finite trees.

Theorem 26.3

Every finite tree with at least two vertices has at least two leaves.

Proof

Let $T$ be a finite tree with at least two vertices. Choose a longest simple path in $T$ :

v_0, v_1, \ldots, v_k.

The endpoint $v_0$ must have degree $1$ . If it had another neighbor not equal to $v_1$ , then the path could be extended, unless that neighbor already appeared on the path. If the neighbor already appeared on the path, a cycle would occur. Both cases are impossible.

Thus $v_0$ is a leaf. The same argument applies to $v_k$ . Therefore $T$ has at least two leaves.

This theorem is often used in induction proofs about trees. One removes a leaf, proves a statement about the smaller tree, and then adds the leaf back.

26.7 Minimal Connectivity

Trees are minimally connected graphs.

This means a tree is connected, but removing any edge disconnects it.

Theorem 26.4

If $T$ is a tree and $e$ is an edge of $T$ , then $T-e$ is disconnected.

Proof

Let

e = \{u,v\}.

Since $T$ is a tree, there is a unique path from $u$ to $v$ . That path is the single edge $e$ . If $e$ is removed, there is no remaining path from $u$ to $v$ . Hence the graph becomes disconnected.

This property says that every edge of a tree is essential for connectivity.

26.8 Maximal Acyclicity

Trees are also maximally acyclic.

This means a tree has no cycle, but adding any new edge creates a cycle.

Theorem 26.5

Let $T$ be a tree. If $u$ and $v$ are nonadjacent vertices of $T$ , then adding the edge $\{u,v\}$ creates exactly one cycle.

Proof

Since $T$ is connected, there is a unique simple path from $u$ to $v$ . Add the new edge $\{u,v\}$ . The old path from $u$ to $v$ , together with the new edge, forms a cycle.

There cannot be another cycle using the new edge, since that would require another distinct path from $u$ to $v$ in $T$ . There cannot be a cycle avoiding the new edge, since $T$ itself has no cycles.

Therefore exactly one cycle is created.

26.9 Equivalent Characterizations

For a finite undirected graph $G$ with $n$ vertices, the following conditions are equivalent:

Condition	Meaning
$G$ is connected and acyclic	Definition of a tree
Every two vertices are joined by a unique simple path	Unique route property
$G$ is connected and has $n-1$ edges	Minimal connected graph
$G$ is acyclic and has $n-1$ edges	Maximal acyclic graph
Removing any edge disconnects $G$	Every edge is a bridge
Adding any missing edge creates a cycle	No more edges can be added safely

These equivalent forms let us recognize trees from different kinds of information. Sometimes connectivity and absence of cycles are easiest to check. Sometimes the edge count is easiest. Sometimes the unique-path property is the most useful.

26.10 Rooted and Unrooted Trees

A tree is unrooted when no vertex has been singled out as special.

A rooted tree is a tree with one distinguished vertex called the root. Once a root is chosen, the tree obtains a hierarchical structure.

If $r$ is the root and $v$ is another vertex, the unique path from $r$ to $v$ defines the position of $v$ in the hierarchy.

In a rooted tree:

Term	Meaning
Root	Distinguished starting vertex
Parent	Previous vertex on the path to the root
Child	Adjacent vertex farther from the root
Ancestor	Vertex lying on the path from the root
Descendant	Vertex below another vertex in the rooted structure
Leaf	Vertex with no children
Depth	Distance from the root
Height	Maximum depth of a vertex

Rooted trees are central in computer science. They model syntax trees, decision trees, search trees, directory trees, and hierarchical indexes.

26.11 Trees and Recursion

Trees naturally support recursive definitions.

A rooted tree consists of a root together with zero or more subtrees attached to the root. Each child of the root is the root of a smaller rooted tree.

This recursive structure explains why many tree algorithms are recursive. To process a tree, one processes the root and then processes each subtree.

For example, the size of a rooted tree can be defined recursively:

\operatorname{size}(T) = 1 + \sum_{i=1}^{k} \operatorname{size}(T_i),

where $T_1,\ldots,T_k$ are the subtrees rooted at the children of the root.

The height can also be defined recursively:

\operatorname{height}(T) = \begin{cases} 0, & \text{if the root is a leaf},\\ 1 + \max_i \operatorname{height}(T_i), & \text{otherwise}. \end{cases}

26.12 Spanning Trees

Let $G$ be a connected graph. A spanning tree of $G$ is a subgraph $T$ such that:

$T$ contains every vertex of $G$ .
$T$ is a tree.
Every edge of $T$ is an edge of $G$ .

A spanning tree keeps all vertices but removes enough edges to destroy all cycles.

Every finite connected graph has a spanning tree. One way to obtain one is to repeatedly remove an edge from a cycle. Removing such an edge does not disconnect the graph. Since the graph is finite, this process eventually stops. The final graph is connected and acyclic, so it is a spanning tree.

Spanning trees are important because they give cycle-free skeletons of connected graphs. They appear in network design, routing, clustering, and optimization.

26.13 Trees in Algorithms

Trees are among the most important structures in graph algorithms.

Breadth-first search builds a BFS tree. This tree records shortest-path distances from a starting vertex in an unweighted graph.

Depth-first search builds a DFS tree. This tree records the recursive structure of graph exploration.

Minimum spanning tree algorithms, such as Kruskal’s algorithm and Prim’s algorithm, find a spanning tree of minimum total edge weight in a connected weighted graph.

Many graph problems become easier on trees. Since trees have no cycles, dynamic programming can often solve problems on trees more efficiently than on general graphs.

Examples include:

Problem	Tree advantage
Connectivity	Unique path between vertices
Shortest path	Only one simple path exists
Dynamic programming	Subproblems separate at vertices
Matching	Can be solved recursively
Independent set	Can be solved by parent-child cases
Network design	Minimal connected backbone

26.14 Trees as Minimal Networks

A tree is the smallest connected structure on a given set of vertices.

If a connected graph has $n$ vertices, it must have at least $n-1$ edges. A tree reaches this lower bound exactly.

This makes trees natural models for minimal networks. A communication network that connects all sites with no redundant link has the structure of a tree. If any link fails, the network disconnects. Adding one extra link creates a cycle and introduces redundancy.

Thus trees represent the boundary between disconnected and cyclic structure.

They are connected enough to hold the graph together, but sparse enough to contain no alternate route.

26.15 Summary

A tree is a connected acyclic graph. This simple definition leads to several equivalent descriptions.

A finite graph is a tree precisely when it has unique simple paths between vertices, or when it is connected with $n-1$ edges, or when it is acyclic with $n-1$ edges.

Trees are minimally connected and maximally acyclic. Removing any edge disconnects them. Adding any missing edge creates a cycle.

Their importance comes from this balance. They are simple enough to support induction, recursion, and efficient algorithms, yet rich enough to model hierarchy, dependency, search, and minimal connectivity.