Skip to content

Chapter 26. Trees

A tree is a connected graph with no cycles.

This definition is short, but it describes one of the most important graph classes. Trees appear in pure graph theory, algorithms, data structures, networks, logic, chemistry, parsing, file systems, and hierarchical models. They are useful because they combine two strong properties: every vertex is reachable from every other vertex, and there is no circular redundancy.

Equivalently, a tree is an undirected graph in which every two distinct vertices are joined by exactly one simple path. This unique-path property is one of the main reasons trees are easy to reason about. General graph paths may branch, repeat, and form cycles. In a tree, there is only one route between two vertices.

26.1 Definition

Let

G=(V,E) G = (V,E)

be an undirected graph, where VV is the set of vertices and EE is the set of edges.

The graph GG is called a tree if it satisfies both conditions:

  1. GG is connected.
  2. GG contains no cycle.

A graph with no cycles is called acyclic. Thus a tree is a connected acyclic graph.

A graph that is acyclic but not necessarily connected is called a forest. Each connected component of a forest is a tree.

26.2 Examples

The following graph is a tree:

V={1,2,3,4,5} V = \{1,2,3,4,5\} E={{1,2},{1,3},{3,4},{3,5}}. E = \{\{1,2\}, \{1,3\}, \{3,4\}, \{3,5\}\}.

It is connected, since every vertex can be reached from every other vertex. It has no cycle, since no path returns to its starting vertex without repeating an edge or vertex.

The following graph is not a tree:

V={1,2,3} V = \{1,2,3\} E={{1,2},{2,3},{3,1}}. E = \{\{1,2\}, \{2,3\}, \{3,1\}\}.

It is connected, but it contains a cycle:

1,2,3,1. 1,2,3,1.

The following graph is also not a tree:

V={1,2,3,4} V = \{1,2,3,4\} E={{1,2},{3,4}}. E = \{\{1,2\}, \{3,4\}\}.

It has no cycle, but it is disconnected. It is a forest with two components.

26.3 Basic Terminology

A vertex of degree 11 in a tree is called a leaf. A vertex of degree greater than 11 is often called an internal vertex, especially when the tree is viewed as a hierarchical structure.

A path graph is a tree in which the vertices can be arranged in a single line.

A star graph is a tree with one central vertex adjacent to every other vertex. If it has nn vertices, then the central vertex has degree n1n-1, and all other vertices are leaves.

A subtree is a subgraph that is itself a tree.

26.4 The Unique Path Property

One of the fundamental characterizations of a tree is the following theorem.

Theorem 26.1

Let GG be an undirected graph. Then GG is a tree if and only if every two distinct vertices of GG are joined by exactly one simple path.

Proof

Suppose first that GG is a tree. Since GG is connected, every two vertices are joined by at least one path.

It remains to show that there cannot be two different simple paths between the same two vertices. Suppose there were two different simple paths from uu to vv. Follow the first path from uu until it first differs from the second path. Since both paths eventually reach vv, the two paths must later meet again. The two different portions between the first separation point and the first meeting point form a cycle. This contradicts the assumption that GG is acyclic.

Therefore the path between any two vertices is unique.

Conversely, suppose every two distinct vertices of GG are joined by exactly one simple path. Then GG is connected by definition. If GG contained a cycle, then any two vertices on that cycle would be joined by two different simple paths along the two directions around the cycle. This contradicts uniqueness.

Hence GG is connected and acyclic. Therefore GG is a tree.

26.5 Edges in a Finite Tree

A finite tree has one fewer edge than vertex.

Theorem 26.2

If TT is a tree with nn vertices, then

E(T)=n1. |E(T)| = n - 1.

Proof

We prove the result by induction on nn.

For n=1n = 1, the tree has one vertex and no edges. Thus

E(T)=0=11. |E(T)| = 0 = 1 - 1.

Assume the statement holds for every tree with n1n-1 vertices. Let TT be a tree with nn vertices, where n2n \geq 2.

Every finite tree with at least two vertices has a leaf. Let vv be a leaf, and let ee be the unique edge incident with vv. Remove vv and ee. The remaining graph is still connected and acyclic, so it is a tree with n1n-1 vertices.

By the induction hypothesis, the remaining tree has

(n1)1=n2 (n-1)-1 = n-2

edges. Adding back the removed edge gives

n2+1=n1 n-2+1 = n-1

edges.

Thus every tree with nn vertices has n1n-1 edges.

26.6 Leaves

Leaves are unavoidable in finite trees.

Theorem 26.3

Every finite tree with at least two vertices has at least two leaves.

Proof

Let TT be a finite tree with at least two vertices. Choose a longest simple path in TT:

v0,v1,,vk. v_0, v_1, \ldots, v_k.

The endpoint v0v_0 must have degree 11. If it had another neighbor not equal to v1v_1, then the path could be extended, unless that neighbor already appeared on the path. If the neighbor already appeared on the path, a cycle would occur. Both cases are impossible.

Thus v0v_0 is a leaf. The same argument applies to vkv_k. Therefore TT has at least two leaves.

This theorem is often used in induction proofs about trees. One removes a leaf, proves a statement about the smaller tree, and then adds the leaf back.

26.7 Minimal Connectivity

Trees are minimally connected graphs.

This means a tree is connected, but removing any edge disconnects it.

Theorem 26.4

If TT is a tree and ee is an edge of TT, then TeT-e is disconnected.

Proof

Let

e={u,v}. e = \{u,v\}.

Since TT is a tree, there is a unique path from uu to vv. That path is the single edge ee. If ee is removed, there is no remaining path from uu to vv. Hence the graph becomes disconnected.

This property says that every edge of a tree is essential for connectivity.

26.8 Maximal Acyclicity

Trees are also maximally acyclic.

This means a tree has no cycle, but adding any new edge creates a cycle.

Theorem 26.5

Let TT be a tree. If uu and vv are nonadjacent vertices of TT, then adding the edge {u,v}\{u,v\} creates exactly one cycle.

Proof

Since TT is connected, there is a unique simple path from uu to vv. Add the new edge {u,v}\{u,v\}. The old path from uu to vv, together with the new edge, forms a cycle.

There cannot be another cycle using the new edge, since that would require another distinct path from uu to vv in TT. There cannot be a cycle avoiding the new edge, since TT itself has no cycles.

Therefore exactly one cycle is created.

26.9 Equivalent Characterizations

For a finite undirected graph GG with nn vertices, the following conditions are equivalent:

ConditionMeaning
GG is connected and acyclicDefinition of a tree
Every two vertices are joined by a unique simple pathUnique route property
GG is connected and has n1n-1 edgesMinimal connected graph
GG is acyclic and has n1n-1 edgesMaximal acyclic graph
Removing any edge disconnects GGEvery edge is a bridge
Adding any missing edge creates a cycleNo more edges can be added safely

These equivalent forms let us recognize trees from different kinds of information. Sometimes connectivity and absence of cycles are easiest to check. Sometimes the edge count is easiest. Sometimes the unique-path property is the most useful.

26.10 Rooted and Unrooted Trees

A tree is unrooted when no vertex has been singled out as special.

A rooted tree is a tree with one distinguished vertex called the root. Once a root is chosen, the tree obtains a hierarchical structure.

If rr is the root and vv is another vertex, the unique path from rr to vv defines the position of vv in the hierarchy.

In a rooted tree:

TermMeaning
RootDistinguished starting vertex
ParentPrevious vertex on the path to the root
ChildAdjacent vertex farther from the root
AncestorVertex lying on the path from the root
DescendantVertex below another vertex in the rooted structure
LeafVertex with no children
DepthDistance from the root
HeightMaximum depth of a vertex

Rooted trees are central in computer science. They model syntax trees, decision trees, search trees, directory trees, and hierarchical indexes.

26.11 Trees and Recursion

Trees naturally support recursive definitions.

A rooted tree consists of a root together with zero or more subtrees attached to the root. Each child of the root is the root of a smaller rooted tree.

This recursive structure explains why many tree algorithms are recursive. To process a tree, one processes the root and then processes each subtree.

For example, the size of a rooted tree can be defined recursively:

size(T)=1+i=1ksize(Ti), \operatorname{size}(T) = 1 + \sum_{i=1}^{k} \operatorname{size}(T_i),

where T1,,TkT_1,\ldots,T_k are the subtrees rooted at the children of the root.

The height can also be defined recursively:

height(T)={0,if the root is a leaf,1+maxiheight(Ti),otherwise. \operatorname{height}(T) = \begin{cases} 0, & \text{if the root is a leaf},\\ 1 + \max_i \operatorname{height}(T_i), & \text{otherwise}. \end{cases}

26.12 Spanning Trees

Let GG be a connected graph. A spanning tree of GG is a subgraph TT such that:

  1. TT contains every vertex of GG.
  2. TT is a tree.
  3. Every edge of TT is an edge of GG.

A spanning tree keeps all vertices but removes enough edges to destroy all cycles.

Every finite connected graph has a spanning tree. One way to obtain one is to repeatedly remove an edge from a cycle. Removing such an edge does not disconnect the graph. Since the graph is finite, this process eventually stops. The final graph is connected and acyclic, so it is a spanning tree.

Spanning trees are important because they give cycle-free skeletons of connected graphs. They appear in network design, routing, clustering, and optimization.

26.13 Trees in Algorithms

Trees are among the most important structures in graph algorithms.

Breadth-first search builds a BFS tree. This tree records shortest-path distances from a starting vertex in an unweighted graph.

Depth-first search builds a DFS tree. This tree records the recursive structure of graph exploration.

Minimum spanning tree algorithms, such as Kruskal’s algorithm and Prim’s algorithm, find a spanning tree of minimum total edge weight in a connected weighted graph.

Many graph problems become easier on trees. Since trees have no cycles, dynamic programming can often solve problems on trees more efficiently than on general graphs.

Examples include:

ProblemTree advantage
ConnectivityUnique path between vertices
Shortest pathOnly one simple path exists
Dynamic programmingSubproblems separate at vertices
MatchingCan be solved recursively
Independent setCan be solved by parent-child cases
Network designMinimal connected backbone

26.14 Trees as Minimal Networks

A tree is the smallest connected structure on a given set of vertices.

If a connected graph has nn vertices, it must have at least n1n-1 edges. A tree reaches this lower bound exactly.

This makes trees natural models for minimal networks. A communication network that connects all sites with no redundant link has the structure of a tree. If any link fails, the network disconnects. Adding one extra link creates a cycle and introduces redundancy.

Thus trees represent the boundary between disconnected and cyclic structure.

They are connected enough to hold the graph together, but sparse enough to contain no alternate route.

26.15 Summary

A tree is a connected acyclic graph. This simple definition leads to several equivalent descriptions.

A finite graph is a tree precisely when it has unique simple paths between vertices, or when it is connected with n1n-1 edges, or when it is acyclic with n1n-1 edges.

Trees are minimally connected and maximally acyclic. Removing any edge disconnects them. Adding any missing edge creates a cycle.

Their importance comes from this balance. They are simple enough to support induction, recursion, and efficient algorithms, yet rich enough to model hierarchy, dependency, search, and minimal connectivity.