Chapter 20. Directed Acyclic Graphs

A directed acyclic graph is a directed graph with no directed cycle. It is usually abbreviated as DAG.

A DAG has direction but no circular directed motion. One may follow arrows from vertex to vertex, but one can never return to the starting vertex by continuing to follow arrows. This single restriction gives DAGs a strong order structure. In fact, a directed graph is a DAG exactly when its vertices can be placed in a topological order, meaning every arc points from an earlier vertex to a later vertex.

20.1 Definition

Let

D = (V, A)

be a directed graph. The graph D is a directed acyclic graph if it contains no directed cycle.

A directed cycle has the form

v_0 \to v_1 \to \cdots \to v_{k-1} \to v_0,

where the vertices

v_0, v_1, \ldots, v_{k-1}

are distinct.

Thus D is a DAG if no such sequence exists.

The word directed means that every edge has an orientation. The word acyclic means that no directed cycle exists. Both words are essential. A graph may contain cycles in its underlying undirected graph and still be a DAG, provided none of those cycles can be followed consistently in the direction of the arcs.

20.2 First Examples

Consider the directed graph

a \to b, \qquad a \to c, \qquad b \to d, \qquad c \to d.

This graph is a DAG. There are two directed paths from a to d:

a \to b \to d

and

a \to c \to d.

There is no directed path returning from d to a, or from any later vertex back to an earlier one.

Now consider

a \to b, \qquad b \to c, \qquad c \to a.

This graph is not a DAG. It contains the directed cycle

a \to b \to c \to a.

The underlying undirected graph alone does not decide whether a directed graph is acyclic. Direction decides the matter.

20.3 Acyclicity and Order

A DAG is a graph with an implicit order.

If an arc

u \to v

is present, then u must come before v in any ordering compatible with the graph.

This does not mean that every pair of vertices must be comparable. In the graph

a \to b, \qquad a \to c,

the vertex a must come before both b and c. But b and c may appear in either order.

For example, both

a, b, c

and

a, c, b

are compatible with the arcs.

This partial ordering behavior is one of the central reasons DAGs appear in scheduling, dependency analysis, and computation.

20.4 Topological Ordering

A topological ordering of a directed graph is a linear ordering of its vertices such that every arc points forward in the ordering.

More precisely, an ordering

v_1, v_2, \ldots, v_n

is topological if, for every arc

(v_i, v_j) \in A,

the vertex v_i appears before v_j in the ordering.

A directed graph has a topological ordering if and only if it is acyclic. Therefore DAGs are exactly the directed graphs that admit topological orderings.

For example, in the graph

a \to b, \qquad a \to c, \qquad b \to d, \qquad c \to d,

both

a, b, c, d

and

a, c, b, d

are topological orderings.

The ordering

b, a, c, d

is not topological, because the arc a \to b requires a to appear before b.
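The definition above is easy to check mechanically. Here is a small sketch in Python; the function name `is_topological` and the pair representation of arcs are illustrative choices, not notation from the text.

```python
def is_topological(order, arcs):
    """Check that every arc points forward in the given vertex order."""
    position = {v: i for i, v in enumerate(order)}
    return all(position[u] < position[v] for u, v in arcs)

arcs = [("a", "b"), ("a", "c"), ("b", "d"), ("c", "d")]
print(is_topological(["a", "b", "c", "d"], arcs))  # True
print(is_topological(["b", "a", "c", "d"], arcs))  # False: arc a -> b points backward
```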

20.5 Sources and Sinks in DAGs

A source is a vertex with in-degree zero. A sink is a vertex with out-degree zero.

Every finite nonempty DAG has at least one source and at least one sink.

To see why a source must exist, suppose every vertex had in-degree at least one. Start at any vertex v_0. Since v_0 has an incoming arc, choose a vertex v_1 such that

v_1 \to v_0.

Since v_1 also has an incoming arc, choose v_2 such that

v_2 \to v_1.

Continuing in this way, we obtain an infinite backward sequence of vertices. In a finite graph, some vertex must repeat. The repeated section gives a directed cycle, contradicting acyclicity.

Therefore a finite DAG must have a source.

The proof for sinks is similar, following outgoing arcs instead of incoming arcs.
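Finding sources and sinks is a direct degree count. A minimal sketch, with the function name `sources_and_sinks` chosen here for illustration:

```python
def sources_and_sinks(vertices, arcs):
    """Sources have in-degree zero; sinks have out-degree zero."""
    has_in = {v for _, v in arcs}    # vertices with at least one incoming arc
    has_out = {u for u, _ in arcs}   # vertices with at least one outgoing arc
    sources = [v for v in vertices if v not in has_in]
    sinks = [v for v in vertices if v not in has_out]
    return sources, sinks

vertices = ["a", "b", "c", "d"]
arcs = [("a", "b"), ("a", "c"), ("b", "d"), ("c", "d")]
print(sources_and_sinks(vertices, arcs))  # (['a'], ['d'])
```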

20.6 Recursive Structure of DAGs

DAGs have a recursive structure.

Because every finite DAG has a source, one can remove a source and all arcs leaving it. The remaining graph is still a DAG. Removing vertices cannot create a directed cycle.

Thus a DAG can be reduced one source at a time.

This observation gives a natural way to construct a topological ordering:

  1. Find a source.
  2. Place it next in the ordering.
  3. Remove it from the graph.
  4. Repeat until no vertices remain.

This is the main idea behind Kahn’s algorithm for topological sorting. The algorithm repeatedly selects vertices with no remaining incoming edges and runs in linear time in the size of the graph.
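The four steps above can be sketched directly. This version follows Kahn's idea of repeatedly removing sources; the names `kahn_topological_sort`, `indegree`, and `successors` are illustrative.

```python
from collections import deque

def kahn_topological_sort(vertices, arcs):
    """Return a topological order, or None if the graph has a directed cycle."""
    indegree = {v: 0 for v in vertices}
    successors = {v: [] for v in vertices}
    for u, v in arcs:
        successors[u].append(v)
        indegree[v] += 1
    queue = deque(v for v in vertices if indegree[v] == 0)  # current sources
    order = []
    while queue:
        u = queue.popleft()
        order.append(u)                   # place the source next in the ordering
        for v in successors[u]:
            indegree[v] -= 1              # "remove" the arc u -> v
            if indegree[v] == 0:
                queue.append(v)           # v has become a source
    return order if len(order) == len(vertices) else None

print(kahn_topological_sort(["a", "b", "c", "d"],
                            [("a", "b"), ("a", "c"), ("b", "d"), ("c", "d")]))
# ['a', 'b', 'c', 'd']
```

If vertices remain when no source is left, the loop stops early and the function returns None, which is exactly the cycle test described in Section 20.15.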

20.7 DAGs and Partial Orders

A DAG defines a reachability relation.

For vertices u and v, define

u \preceq v

if

u = v

or there is a directed path from u to v.

This relation is reflexive, because every vertex reaches itself by the trivial path. It is transitive, because paths can be concatenated. It is antisymmetric because if

u \preceq v

and

v \preceq u,

with u \neq v, then there is a directed path from u to v and another from v to u. Joining them gives a directed cycle.

Thus reachability in a DAG is a partial order.

This connection explains why DAGs are often used to represent precedence. They describe relations where some objects must come before others, but where not all pairs need to be ordered.

20.8 Transitive Closure

The transitive closure of a DAG is the directed graph that has an arc

u \to v

whenever v is reachable from u and u \neq v.

The transitive closure makes all implied reachability relations explicit.

For example, if a DAG contains

a \to b, \qquad b \to c,

then c is reachable from a. The transitive closure contains the additional arc

a \to c.

The closure does not change reachability. It only records every reachable pair directly.

Transitive closure is useful when reachability queries must be answered quickly. It may, however, contain many more arcs than the original graph.
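The closure can be computed by a graph search from each vertex. A minimal sketch, assuming the same pair representation of arcs as before; `transitive_closure` and `reachable_from` are illustrative names:

```python
def transitive_closure(vertices, arcs):
    """Arcs of the closure: (u, v) for every v reachable from u with u != v."""
    successors = {v: [] for v in vertices}
    for u, v in arcs:
        successors[u].append(v)

    def reachable_from(start):
        seen, stack = set(), [start]
        while stack:
            u = stack.pop()
            for v in successors[u]:
                if v not in seen:
                    seen.add(v)
                    stack.append(v)
        return seen

    return {(u, v) for u in vertices for v in reachable_from(u) if v != u}

print(sorted(transitive_closure(["a", "b", "c"], [("a", "b"), ("b", "c")])))
# [('a', 'b'), ('a', 'c'), ('b', 'c')]
```

One search per vertex gives O(|V| (|V| + |A|)) time, which already hints at how much larger the closure can be than the original graph.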

20.9 Transitive Reduction

The transitive reduction moves in the opposite direction. It removes arcs that are implied by longer directed paths.

If a DAG contains

a \to b, \qquad b \to c, \qquad a \to c,

then the arc

a \to c

is redundant for reachability, because a already reaches c through b.

The transitive reduction removes such redundant arcs while preserving the same reachability relation. For finite DAGs, the transitive reduction is unique. It is closely related to Hasse diagrams of partial orders.

The transitive reduction is useful for drawing and understanding dependency structure. It keeps the essential covering relations and omits edges that follow from transitivity.
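For a finite DAG, an arc (u, v) is redundant exactly when v is still reachable from u after that arc is ignored, so the reduction can be sketched as follows. This is a simple quadratic-time illustration, not an optimized algorithm; the names `transitive_reduction` and `reachable_without` are chosen here for clarity.

```python
def transitive_reduction(vertices, arcs):
    """Drop each arc (u, v) that is implied by a longer directed path from u to v."""
    arc_set = set(arcs)

    def reachable_without(u, v, skipped):
        # Is v reachable from u if the arc `skipped` is ignored?
        seen, stack = {u}, [u]
        while stack:
            x = stack.pop()
            for a, b in arc_set:
                if a == x and (a, b) != skipped and b not in seen:
                    if b == v:
                        return True
                    seen.add(b)
                    stack.append(b)
        return False

    return {(u, v) for (u, v) in arc_set
            if not reachable_without(u, v, (u, v))}

print(sorted(transitive_reduction(["a", "b", "c"],
                                  [("a", "b"), ("b", "c"), ("a", "c")])))
# [('a', 'b'), ('b', 'c')]
```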

20.10 DAGs and Dependency Graphs

DAGs are the natural graph model for dependencies without circularity.

Suppose an arc

u \to v

means that u must be completed before v.

Then a directed cycle

v_0 \to v_1 \to \cdots \to v_{k-1} \to v_0

would say that v_0 must be completed before itself through a chain of requirements. Such a system has no consistent ordering.

Therefore valid dependency graphs are usually required to be DAGs.

Examples include:

System            Vertices            Arcs
Build system      Files or targets    prerequisite before dependent
Course plan       Courses             prerequisite before course
Task scheduler    Tasks               required task before later task
Spreadsheet       Cells               referenced cell before formula cell
Compiler          Computations        input before operation
Package manager   Packages            dependency before dependent package

A topological ordering gives a valid execution order.

20.11 DAGs in Version History

Version histories often form DAGs.

A commit may have one parent or several parents. An arc can be drawn from a parent commit to a child commit. Since a commit cannot be its own ancestor, the graph has no directed cycle.

Merges create vertices with multiple incoming arcs. Branches create vertices with multiple outgoing arcs. The resulting structure is generally not a tree, because branches can merge. It is naturally a DAG.

This example shows why DAGs are more general than trees. A tree has a unique parent structure. A DAG may allow many parents and many children while still forbidding directed cycles.

20.12 DAGs and Causality

DAGs are also used to model causal structure.

If an arc

u \to v

means that event u may directly affect event v, then acyclicity encodes the idea that causes precede their effects. A directed cycle would represent circular causation.

Causal diagrams, Bayesian networks, and many probabilistic graphical models use DAGs for this reason. The graph records which variables may directly influence which other variables.

The graph alone does not determine numerical probabilities. It gives the structural language in which such models are built.

20.13 DAGs and Dynamic Programming

Many dynamic programming problems can be understood as computation on a DAG.

Each subproblem is a vertex. An arc

u \to v

means that the solution of v depends on the solution of u.

A topological order gives a safe evaluation order: compute every predecessor before computing a vertex.

For example, in a shortest path problem on a DAG, one may process vertices in topological order and relax all outgoing arcs. Since no directed cycle exists, each dependency is handled once in a forward direction.

This yields efficient algorithms for several problems that are harder on general directed graphs.

20.14 Longest Paths in DAGs

The longest path problem is difficult in general directed graphs, because cycles may allow arbitrarily long walks or create combinatorial hardness.

In a DAG, longest paths can be computed efficiently.

Given a topological ordering, process vertices from left to right. For each arc

u \to v,

update the best known distance to v using the best known distance to u.

Acyclicity ensures that when a vertex is processed, all possible contributions from its predecessors have already been considered.

This is another example of the main computational advantage of DAGs: they turn recursive dependence into ordered computation.
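The left-to-right relaxation described above can be sketched in a few lines. This version counts path length in arcs and assumes a topological ordering is supplied; the name `longest_path_lengths` is illustrative.

```python
def longest_path_lengths(order, arcs):
    """Longest path length (in arcs) ending at each vertex.

    `order` must be a topological ordering of the DAG's vertices.
    """
    successors = {v: [] for v in order}
    for u, v in arcs:
        successors[u].append(v)
    dist = {v: 0 for v in order}
    for u in order:                              # process left to right
        for v in successors[u]:
            dist[v] = max(dist[v], dist[u] + 1)  # relax arc u -> v
    return dist

arcs = [("a", "b"), ("a", "c"), ("b", "d"), ("c", "d")]
print(longest_path_lengths(["a", "b", "c", "d"], arcs))
# {'a': 0, 'b': 1, 'c': 1, 'd': 2}
```

Replacing max with min (and adding arc weights) gives the shortest-path computation on a DAG mentioned in the previous section; the processing order is identical.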

20.15 Detecting Whether a Graph Is a DAG

A directed graph can be tested for acyclicity by attempting to compute a topological ordering.

If the algorithm successfully orders all vertices, the graph is a DAG. If at some step no source remains while vertices are still present, the remaining graph contains a directed cycle.

Depth-first search gives another test. During DFS, if an arc points to a vertex currently on the recursion stack, then a directed cycle exists. If no such arc appears, the graph is acyclic.
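The DFS test can be sketched with three vertex states: unvisited, on the current recursion path, and finished. The names `has_directed_cycle` and the state constants are illustrative.

```python
def has_directed_cycle(vertices, arcs):
    """DFS test: a back arc to a vertex on the current path means a cycle."""
    successors = {v: [] for v in vertices}
    for u, v in arcs:
        successors[u].append(v)
    UNVISITED, ON_PATH, DONE = 0, 1, 2
    state = {v: UNVISITED for v in vertices}

    def dfs(u):
        state[u] = ON_PATH
        for v in successors[u]:
            if state[v] == ON_PATH:      # back arc: directed cycle found
                return True
            if state[v] == UNVISITED and dfs(v):
                return True
        state[u] = DONE
        return False

    return any(state[v] == UNVISITED and dfs(v) for v in vertices)

print(has_directed_cycle(["a", "b", "c"], [("a", "b"), ("b", "c")]))              # False
print(has_directed_cycle(["a", "b", "c"], [("a", "b"), ("b", "c"), ("c", "a")]))  # True
```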

Both approaches can be implemented in linear time:

O(|V| + |A|).

This means the running time is proportional to the number of vertices plus the number of arcs.

20.16 Common Mistakes

A common mistake is to think that a DAG cannot have any cycle in its underlying undirected graph. This is false. A DAG cannot have directed cycles. Its underlying undirected graph may contain ordinary cycles.

Another mistake is to assume that a topological ordering is unique. A DAG may have many valid topological orderings. Uniqueness occurs only under additional conditions, such as when the ordering is forced by enough comparable pairs.

A third mistake is to confuse a DAG with a tree. Every rooted tree with edges directed away from the root is a DAG, but many DAGs are not trees.

A fourth mistake is to reverse dependency direction without changing interpretation. If u \to v means “u must come before v,” then reversing all arcs changes the meaning of in-degree, out-degree, sources, and sinks.

20.17 Summary

A directed acyclic graph is a directed graph with no directed cycle.

DAGs are exactly the directed graphs that admit topological orderings. Every finite nonempty DAG has at least one source and at least one sink. Removing a source from a DAG leaves another DAG, which gives the recursive basis for topological sorting.

The reachability relation in a DAG forms a partial order. Transitive closure makes all reachability relations explicit, while transitive reduction removes redundant arcs and preserves reachability.

DAGs are the standard graph model for dependency, precedence, causality, version history, scheduling, and dynamic programming. They combine direction with order, making them one of the most useful structures in graph theory and computer science.