9.5 Directed Graphs
You need to model relationships that have direction.
9.5 Directed Graphs
Problem
You need to model relationships that have direction.
Many real-world connections are not symmetric. If one task depends on another, the dependency works in only one direction. If a web page links to another page, the reverse link does not automatically exist. If a package imports another package, the imported package does not necessarily import the first one.
Treating these relationships as undirected often produces incorrect results.
You need a graph model that preserves direction explicitly.
Solution
Represent each connection as an ordered pair of vertices.
If an edge exists from vertex u to vertex v, write:
u → v
The reverse edge:
v → u
is a different edge.
For example:
Parse → TypeCheck
TypeCheck → Compile
Compile → Test
can be represented as:
graph = {
"Parse": ["TypeCheck"],
"TypeCheck": ["Compile"],
"Compile": ["Test"],
"Test": []
}
Traversal follows the direction of the edges.
Starting at Parse, the graph allows movement toward Test.
Starting at Test, there are no outgoing edges.
Direction matters.
Discussion
Directed graphs appear throughout computer science because many relationships naturally describe flow.
Examples include:
| Domain | Vertex | Edge |
|---|---|---|
| Build systems | Module | Imports |
| Package managers | Package | Dependency |
| Transportation | Airport | Flight |
| Web search | Page | Hyperlink |
| Workflows | Task | Precedence constraint |
| State machines | State | Transition |
| Social media | User | Follows |
In each case, reversing the edge changes the meaning.
A dependency graph containing:
Compiler → Parser
means something very different from:
Parser → Compiler
The distinction seems obvious when reading the graph, but many implementation bugs come from accidentally treating directed edges as undirected.
Thinking in Reachability
The most important question in a directed graph is often reachability.
Given two vertices:
A
B
can you travel from A to B by following edge directions?
Consider:
A → B → C → D
Starting from A, every vertex is reachable.
Starting from D, only D is reachable.
A DFS or BFS naturally respects edge direction.
def dfs(graph, vertex, visited):
if vertex in visited:
return
visited.add(vertex)
for neighbor in graph[vertex]:
dfs(graph, neighbor, visited)
Notice that traversal only follows outgoing edges.
No special handling is required.
The graph representation already encodes direction.
Sources and Sinks
Directed graphs often contain vertices with special structural roles.
A source is a vertex with no incoming edges.
A → B
A → C
Here:
A
is a source.
Nothing points to it.
A sink is a vertex with no outgoing edges.
A → B → C
Here:
C
is a sink.
Nothing leaves it.
Sources often represent starting points:
- Root tasks
- Initial states
- Entry points
- Origin servers
Sinks often represent final states:
- Completed tasks
- Terminal states
- Exit nodes
- Delivery destinations
Many algorithms begin by identifying sources or sinks.
Computing In-Degree and Out-Degree
For directed graphs, two degree measurements exist.
The out-degree counts outgoing edges.
The in-degree counts incoming edges.
Given:
A → B
A → C
B → C
The degrees are:
| Vertex | In-Degree | Out-Degree |
|---|---|---|
| A | 0 | 2 |
| B | 1 | 1 |
| C | 2 | 0 |
Computing out-degree is simple:
out_degree = len(graph["A"])
Computing in-degree requires scanning edges.
def indegrees(graph):
result = {
vertex: 0
for vertex in graph
}
for source in graph:
for target in graph[source]:
result[target] += 1
return result
Result:
{
"A": 0,
"B": 1,
"C": 2
}
In-degree information becomes important for topological sorting and dependency analysis.
Recipe: Build a Directed Graph
Given edge pairs:
edges = [
("A", "B"),
("A", "C"),
("B", "D"),
]
Build the graph:
def build_graph(edges):
graph = {}
for source, target in edges:
graph.setdefault(source, []).append(target)
graph.setdefault(target, [])
return graph
Result:
{
"A": ["B", "C"],
"B": ["D"],
"C": [],
"D": []
}
Notice that sink vertices are preserved.
This detail prevents many bugs later.
Detecting Cycles
Cycles are especially important in directed graphs.
A cycle exists when a path eventually returns to its starting vertex.
Example:
A → B → C → A
Cycles may indicate:
- Circular dependencies
- Infinite workflows
- Recursive state transitions
- Invalid scheduling constraints
Many graph problems require cycle-free graphs.
Dependency systems, build pipelines, and task schedulers often reject cyclic inputs.
A common DFS technique uses three states:
UNVISITED
VISITING
VISITED
Encountering a VISITING vertex during DFS indicates a cycle.
def has_cycle(graph):
state = {}
def visit(vertex):
if state.get(vertex) == "VISITING":
return True
if state.get(vertex) == "VISITED":
return False
state[vertex] = "VISITING"
for neighbor in graph[vertex]:
if visit(neighbor):
return True
state[vertex] = "VISITED"
return False
for vertex in graph:
if visit(vertex):
return True
return False
This pattern appears repeatedly throughout graph algorithms.
Directed Acyclic Graphs
A directed graph without cycles is called a Directed Acyclic Graph, often abbreviated DAG.
A → B
A → C
B → D
C → D
This graph contains no cycles.
DAGs are important because they model dependency structures.
Examples include:
- Build systems
- Compilation pipelines
- Task scheduling
- Spreadsheet calculations
- Data pipelines
- Workflow engines
Many efficient algorithms exist only because DAGs cannot contain cycles.
Topological sorting, discussed later in this chapter, relies entirely on this property.
Reverse Graphs
Sometimes incoming edges are more important than outgoing edges.
Instead of repeatedly scanning the graph, build a reversed version.
Original:
A → B
A → C
B → D
Reverse:
B → A
C → A
D → B
Construction is straightforward:
def reverse_graph(graph):
reverse = {
vertex: []
for vertex in graph
}
for source in graph:
for target in graph[source]:
reverse[target].append(source)
return reverse
Result:
{
"A": [],
"B": ["A"],
"C": ["A"],
"D": ["B"]
}
Reverse graphs are useful for:
- Strongly connected components
- Dependency analysis
- Backward reachability
- Reverse searches
Many advanced graph algorithms construct reverse graphs as a preprocessing step.
Complexity Characteristics
For a directed graph stored as an adjacency list:
| Operation | Complexity |
|---|---|
| Add edge | O(1) amortized |
| Out-degree lookup | O(1) |
| In-degree computation | O(V + E) |
| DFS | O(V + E) |
| BFS | O(V + E) |
| Reverse graph construction | O(V + E) |
The complexity is identical to undirected graph traversal.
The difference lies in interpretation and algorithmic behavior.
Common Pitfalls
Do not accidentally add reverse edges when the graph is directed.
This mistake silently converts the graph into an undirected graph.
Do not confuse in-degree with out-degree.
Do not assume reachability is symmetric.
If A reaches B, the reverse may not be true.
Do not forget sink vertices when constructing the graph.
Do not use undirected cycle-detection techniques on directed graphs. Directed cycles require different invariants.
Do not ignore self-loops.
A → A
is a cycle.
Many implementations forget this special case.
Recipe: Verify a Dependency Graph
Dependency systems frequently require a DAG.
A useful validation pipeline is:
graph = build_graph(edges)
if has_cycle(graph):
raise ValueError(
"Circular dependency detected"
)
This simple check prevents many classes of runtime failure.
Build systems, package managers, workflow schedulers, and orchestration engines often perform exactly this validation before execution begins.
Takeaway
Directed graphs model asymmetric relationships in which edge direction carries meaning. Reachability, dependencies, state transitions, workflows, and information flow all depend on preserving that direction. Understanding sources, sinks, in-degrees, cycles, DAGs, and reverse graphs provides the conceptual foundation for topological sorting, strongly connected components, shortest-path algorithms, and many of the advanced graph techniques that follow.