9.9 Depth-First Search (DFS)

Problem

You need to explore a graph systematically.

Many graph problems can be reduced to a simple question:

Starting from a vertex, what can I reach?

Examples include:

Finding connected components
Detecting cycles
Validating dependencies
Exploring game states
Traversing file systems
Finding paths
Computing graph properties

You need a traversal strategy that follows paths deeply before backtracking.

Solution

Use Depth-First Search (DFS).

DFS explores one path as far as possible before returning to earlier vertices.

Consider:

A
|
B
|\\nC D
|
E

Starting at A, one possible DFS traversal is:

A
B
C
E
D

The algorithm continually moves to an unvisited neighbor until it reaches a dead end.

Only then does it backtrack.

A recursive implementation looks like this:

def dfs(graph, vertex, visited):
    visited.add(vertex)

    print(vertex)

    for neighbor in graph[vertex]:
        if neighbor not in visited:
            dfs(graph, neighbor, visited)

Usage:

visited = set()
dfs(graph, "A", visited)

The traversal visits every reachable vertex exactly once.

Discussion

DFS is one of the fundamental graph algorithms.

Many more sophisticated algorithms are simply DFS with additional bookkeeping.

Examples include:

Connected components
Topological sorting
Cycle detection
Strongly connected components
Articulation points
Bridges
Euler tours
Heavy-light decomposition

Learning DFS thoroughly pays dividends throughout graph theory.

The Core Idea

DFS maintains a simple invariant:

Every visited vertex has either been completely explored or is currently being explored.

When the algorithm reaches a vertex, it recursively explores its neighbors.

Only after every descendant has been processed does it return.

This creates a natural depth-first behavior.

Consider:

A
|
B
|
C
|
D

Traversal:

A
  B
    C
      D

DFS reaches D before returning to C.

Then:

D complete
C complete
B complete
A complete

This completion order is extremely important in later algorithms.

Recursive DFS

The recursive version is usually the easiest to understand.

def dfs(graph, vertex, visited):
    visited.add(vertex)

    for neighbor in graph[vertex]:
        if neighbor not in visited:
            dfs(graph, neighbor, visited)

This implementation naturally uses the call stack.

Suppose:

graph = {
    "A": ["B", "C"],
    "B": ["D"],
    "C": [],
    "D": [],
}

Execution proceeds:

dfs(A)
  dfs(B)
    dfs(D)
  dfs(C)

The call stack stores unfinished work automatically.

Recursive Traversal Order

To understand DFS, it helps to visualize the call stack.

Given:

A
|\\nB C
|
D

Execution:

visit A

visit B

visit D

return D

return B

visit C

return C

return A

Notice that DFS does not necessarily visit vertices in numerical order, alphabetical order, or shortest-path order.

The traversal order depends entirely on adjacency-list ordering.

For this reason, many algorithms rely on DFS invariants rather than exact visitation order.

Iterative DFS

Recursion is elegant, but explicit stacks provide more control.

def dfs_iterative(graph, start):
    visited = set()
    stack = [start]

    while stack:
        vertex = stack.pop()

        if vertex in visited:
            continue

        visited.add(vertex)

        for neighbor in reversed(graph[vertex]):
            stack.append(neighbor)

This implementation produces behavior similar to recursive DFS.

The explicit stack replaces the language runtime's call stack.

Many production systems prefer this version because stack usage becomes predictable.

DFS Trees

DFS naturally constructs a traversal tree.

Consider:

A -- B
|    |
C -- D

Starting from A, DFS might produce:

A
|
B
|
D
|
C

The edges used to discover new vertices form the DFS tree.

Tree edges represent first discoveries.

Remaining edges become non-tree edges.

Many graph algorithms classify edges using this DFS tree structure.

Recipe: Visit Every Vertex

DFS launched from one starting vertex only explores the reachable region.

Disconnected graphs require multiple launches.

def dfs_all(graph):
    visited = set()

    def dfs(vertex):
        visited.add(vertex)

        for neighbor in graph[vertex]:
            if neighbor not in visited:
                dfs(neighbor)

    for vertex in graph:
        if vertex not in visited:
            dfs(vertex)

This pattern appears repeatedly in graph algorithms.

It guarantees complete coverage of the graph.

Connected Components

DFS makes connected-component detection almost trivial.

def connected_components(graph):
    visited = set()
    components = []

    def dfs(vertex, component):
        visited.add(vertex)
        component.append(vertex)

        for neighbor in graph[vertex]:
            if neighbor not in visited:
                dfs(neighbor, component)

    for vertex in graph:
        if vertex in visited:
            continue

        component = []
        dfs(vertex, component)
        components.append(component)

    return components

Example:

A -- B

C -- D

E

Output:

[
    ["A", "B"],
    ["C", "D"],
    ["E"]
]

This is one of the most common applications of DFS.

Recipe: Find a Path

DFS can also find a path between vertices.

def find_path(graph, start, target):
    visited = set()

    def dfs(vertex, path):
        if vertex == target:
            return path

        visited.add(vertex)

        for neighbor in graph[vertex]:
            if neighbor not in visited:
                result = dfs(
                    neighbor,
                    path + [neighbor]
                )

                if result:
                    return result

        return None

    return dfs(start, [start])

Example:

path = find_path(
    graph,
    "A",
    "D"
)

Possible output:

["A", "B", "D"]

DFS does not guarantee the shortest path.

It guarantees only that a path is found if one exists.

Discovery and Finish Times

Many advanced algorithms attach timestamps to DFS events.

When a vertex is first encountered:

discovery time

When exploration finishes:

finish time

Example:

time = 0

def dfs(vertex):
    global time

    discovery[vertex] = time
    time += 1

    for neighbor in graph[vertex]:
        if neighbor not in discovery:
            dfs(neighbor)

    finish[vertex] = time
    time += 1

These timestamps form the basis for:

Topological sorting
Strongly connected components
Edge classification
Dominator analysis

The concept is simple but extremely powerful.

DFS on Large Graphs

Recursion depth can become a problem.

Consider a chain:

1 → 2 → 3 → 4 → ...

with millions of vertices.

A recursive implementation may exhaust the call stack.

Iterative DFS avoids this limitation.

stack = [start]

grows in heap memory rather than call-stack memory.

When graph depth is unknown, iterative DFS is often safer.

Complexity Analysis

Let:

V = number of vertices
E = number of edges

DFS visits every vertex once.

Each edge is examined at most once in a directed graph and at most twice in an undirected graph.

Total complexity:

Operation	Complexity
DFS traversal	O(V + E)
Memory (visited set)	O(V)
Recursive stack	O(V) worst case
Iterative stack	O(V) worst case

This linear complexity is one reason DFS appears so frequently.

Common Pitfalls

Do not forget the visited set.

Without it, cycles cause infinite recursion.

A → B → C → A

would never terminate.

Do not assume DFS finds shortest paths.

It does not.

Do not rely on traversal order unless the algorithm explicitly requires it.

Neighbor ordering affects DFS output.

Do not launch DFS from only one vertex when the graph may be disconnected.

Do not ignore recursion limits when processing large graphs.

Do not modify adjacency lists during traversal.

This often creates subtle correctness bugs.

Recipe: Detect Reachability

A simple reachability test is often sufficient.

def reachable(graph, source, target):
    visited = set()

    def dfs(vertex):
        if vertex == target:
            return True

        visited.add(vertex)

        for neighbor in graph[vertex]:
            if neighbor not in visited:
                if dfs(neighbor):
                    return True

        return False

    return dfs(source)

Example:

reachable(graph, "A", "D")

Output:

True

This pattern appears frequently in validation, dependency analysis, and state-space exploration.

Takeaway

Depth-First Search explores graphs by following one path as deeply as possible before backtracking. Its implementation is simple, but its influence is enormous. Connected components, cycle detection, topological sorting, strongly connected components, articulation points, bridges, and many advanced graph algorithms all build directly on DFS. Understanding recursion, traversal trees, discovery times, finish times, and the visited-set invariant provides a foundation for nearly every graph technique that follows.