9.9 Depth-First Search (DFS)
You need to explore a graph systematically.
9.9 Depth-First Search (DFS)
Problem
You need to explore a graph systematically.
Many graph problems can be reduced to a simple question:
Starting from a vertex, what can I reach?
Examples include:
- Finding connected components
- Detecting cycles
- Validating dependencies
- Exploring game states
- Traversing file systems
- Finding paths
- Computing graph properties
You need a traversal strategy that follows paths deeply before backtracking.
Solution
Use Depth-First Search (DFS).
DFS explores one path as far as possible before returning to earlier vertices.
Consider:
A
|
B
|\\nC D
|
E
Starting at A, one possible DFS traversal is:
A
B
C
E
D
The algorithm continually moves to an unvisited neighbor until it reaches a dead end.
Only then does it backtrack.
A recursive implementation looks like this:
def dfs(graph, vertex, visited):
visited.add(vertex)
print(vertex)
for neighbor in graph[vertex]:
if neighbor not in visited:
dfs(graph, neighbor, visited)
Usage:
visited = set()
dfs(graph, "A", visited)
The traversal visits every reachable vertex exactly once.
Discussion
DFS is one of the fundamental graph algorithms.
Many more sophisticated algorithms are simply DFS with additional bookkeeping.
Examples include:
- Connected components
- Topological sorting
- Cycle detection
- Strongly connected components
- Articulation points
- Bridges
- Euler tours
- Heavy-light decomposition
Learning DFS thoroughly pays dividends throughout graph theory.
The Core Idea
DFS maintains a simple invariant:
Every visited vertex has either been completely explored or is currently being explored.
When the algorithm reaches a vertex, it recursively explores its neighbors.
Only after every descendant has been processed does it return.
This creates a natural depth-first behavior.
Consider:
A
|
B
|
C
|
D
Traversal:
A
B
C
D
DFS reaches D before returning to C.
Then:
D complete
C complete
B complete
A complete
This completion order is extremely important in later algorithms.
Recursive DFS
The recursive version is usually the easiest to understand.
def dfs(graph, vertex, visited):
visited.add(vertex)
for neighbor in graph[vertex]:
if neighbor not in visited:
dfs(graph, neighbor, visited)
This implementation naturally uses the call stack.
Suppose:
graph = {
"A": ["B", "C"],
"B": ["D"],
"C": [],
"D": [],
}
Execution proceeds:
dfs(A)
dfs(B)
dfs(D)
dfs(C)
The call stack stores unfinished work automatically.
Recursive Traversal Order
To understand DFS, it helps to visualize the call stack.
Given:
A
|\\nB C
|
D
Execution:
visit A
visit B
visit D
return D
return B
visit C
return C
return A
Notice that DFS does not necessarily visit vertices in numerical order, alphabetical order, or shortest-path order.
The traversal order depends entirely on adjacency-list ordering.
For this reason, many algorithms rely on DFS invariants rather than exact visitation order.
Iterative DFS
Recursion is elegant, but explicit stacks provide more control.
def dfs_iterative(graph, start):
visited = set()
stack = [start]
while stack:
vertex = stack.pop()
if vertex in visited:
continue
visited.add(vertex)
for neighbor in reversed(graph[vertex]):
stack.append(neighbor)
This implementation produces behavior similar to recursive DFS.
The explicit stack replaces the language runtime's call stack.
Many production systems prefer this version because stack usage becomes predictable.
DFS Trees
DFS naturally constructs a traversal tree.
Consider:
A -- B
| |
C -- D
Starting from A, DFS might produce:
A
|
B
|
D
|
C
The edges used to discover new vertices form the DFS tree.
Tree edges represent first discoveries.
Remaining edges become non-tree edges.
Many graph algorithms classify edges using this DFS tree structure.
Recipe: Visit Every Vertex
DFS launched from one starting vertex only explores the reachable region.
Disconnected graphs require multiple launches.
def dfs_all(graph):
visited = set()
def dfs(vertex):
visited.add(vertex)
for neighbor in graph[vertex]:
if neighbor not in visited:
dfs(neighbor)
for vertex in graph:
if vertex not in visited:
dfs(vertex)
This pattern appears repeatedly in graph algorithms.
It guarantees complete coverage of the graph.
Connected Components
DFS makes connected-component detection almost trivial.
def connected_components(graph):
visited = set()
components = []
def dfs(vertex, component):
visited.add(vertex)
component.append(vertex)
for neighbor in graph[vertex]:
if neighbor not in visited:
dfs(neighbor, component)
for vertex in graph:
if vertex in visited:
continue
component = []
dfs(vertex, component)
components.append(component)
return components
Example:
A -- B
C -- D
E
Output:
[
["A", "B"],
["C", "D"],
["E"]
]
This is one of the most common applications of DFS.
Recipe: Find a Path
DFS can also find a path between vertices.
def find_path(graph, start, target):
visited = set()
def dfs(vertex, path):
if vertex == target:
return path
visited.add(vertex)
for neighbor in graph[vertex]:
if neighbor not in visited:
result = dfs(
neighbor,
path + [neighbor]
)
if result:
return result
return None
return dfs(start, [start])
Example:
path = find_path(
graph,
"A",
"D"
)
Possible output:
["A", "B", "D"]
DFS does not guarantee the shortest path.
It guarantees only that a path is found if one exists.
Discovery and Finish Times
Many advanced algorithms attach timestamps to DFS events.
When a vertex is first encountered:
discovery time
When exploration finishes:
finish time
Example:
time = 0
def dfs(vertex):
global time
discovery[vertex] = time
time += 1
for neighbor in graph[vertex]:
if neighbor not in discovery:
dfs(neighbor)
finish[vertex] = time
time += 1
These timestamps form the basis for:
- Topological sorting
- Strongly connected components
- Edge classification
- Dominator analysis
The concept is simple but extremely powerful.
DFS on Large Graphs
Recursion depth can become a problem.
Consider a chain:
1 → 2 → 3 → 4 → ...
with millions of vertices.
A recursive implementation may exhaust the call stack.
Iterative DFS avoids this limitation.
stack = [start]
grows in heap memory rather than call-stack memory.
When graph depth is unknown, iterative DFS is often safer.
Complexity Analysis
Let:
V= number of verticesE= number of edges
DFS visits every vertex once.
Each edge is examined at most once in a directed graph and at most twice in an undirected graph.
Total complexity:
| Operation | Complexity |
|---|---|
| DFS traversal | O(V + E) |
| Memory (visited set) | O(V) |
| Recursive stack | O(V) worst case |
| Iterative stack | O(V) worst case |
This linear complexity is one reason DFS appears so frequently.
Common Pitfalls
Do not forget the visited set.
Without it, cycles cause infinite recursion.
A → B → C → A
would never terminate.
Do not assume DFS finds shortest paths.
It does not.
Do not rely on traversal order unless the algorithm explicitly requires it.
Neighbor ordering affects DFS output.
Do not launch DFS from only one vertex when the graph may be disconnected.
Do not ignore recursion limits when processing large graphs.
Do not modify adjacency lists during traversal.
This often creates subtle correctness bugs.
Recipe: Detect Reachability
A simple reachability test is often sufficient.
def reachable(graph, source, target):
visited = set()
def dfs(vertex):
if vertex == target:
return True
visited.add(vertex)
for neighbor in graph[vertex]:
if neighbor not in visited:
if dfs(neighbor):
return True
return False
return dfs(source)
Example:
reachable(graph, "A", "D")
Output:
True
This pattern appears frequently in validation, dependency analysis, and state-space exploration.
Takeaway
Depth-First Search explores graphs by following one path as deeply as possible before backtracking. Its implementation is simple, but its influence is enormous. Connected components, cycle detection, topological sorting, strongly connected components, articulation points, bridges, and many advanced graph algorithms all build directly on DFS. Understanding recursion, traversal trees, discovery times, finish times, and the visited-set invariant provides a foundation for nearly every graph technique that follows.