TAOCP 2.2.3 Exercise 23

Algorithm `T` performs a topological sort on a directed graph whose vertices are the objects being ordered and whose relations are the pairs $j \prec k$ represented in the successor lists.

Section 2.2.3: Linked Allocation

Exercise 23. [27] When the topological sort algorithm cannot proceed because it has detected a loop in the input (see step T8), it is usually of no use to stop and say, "There was a loop." It is helpful to print out one of the loops, thereby showing part of the input that was in error. Extend Algorithm T so that it will do this additional printing of a loop when necessary. [Hint: The text gives a proof for the existence of a loop when N > 0 in step T8; that proof suggests an algorithm.]

Verified: yes
Solve time: 1m07s


Setup

Algorithm T performs a topological sort on a directed graph whose vertices are the objects being ordered and whose relations are the pairs $j \prec k$ represented in the successor lists. Step T8 is reached when the queue of vertices having COUNT = 0 is empty while $N>0$ objects remain unprocessed.

The text proves that under these circumstances a directed loop must exist. The exercise asks for an extension of Algorithm T that prints one such loop.

The proof given in the text proceeds as follows: choose any remaining vertex; since its COUNT is positive, it has a predecessor among the remaining vertices. Repeating this process indefinitely yields a sequence of remaining vertices. Since only finitely many vertices remain, some vertex must repeat; the repeated portion of the sequence forms a directed loop.

The required algorithm is obtained by making this proof constructive.

Solution

When step T8 is entered, let

$$ R={,v : v \text{ has not yet been output},}. $$

Every vertex in $R$ has positive COUNT. If some vertex in $R$ had COUNT = 0$, it would be present in the queue maintained by Algorithm T, contradicting the fact that step T8` has been reached.

Construct an auxiliary table

$$ \text{PRED}[v] $$

for vertices $v\in R$ as follows. Scan all successor lists of the remaining vertices. Whenever a relation

$$ u\prec v $$

is encountered with both $u,v\in R$, store

$$ \text{PRED}[v]\leftarrow u. $$

Since COUNT(v)>0, at least one such predecessor exists for every $v\in R$; hence every entry PRED[v] is defined.

Choose any vertex $x_0\in R$ and generate the sequence

$$ x_{i+1}=\text{PRED}[x_i],\qquad i\ge0. $$

Because $R$ contains only finitely many vertices, there exist integers

$$ 0\le r<s $$

such that

$$ x_r=x_s. $$

Let $s$ be the first index at which a repetition occurs. Then

$$ x_r,x_{r+1},\ldots,x_{s-1} $$

are distinct.

For each $i$ with $r\le i<s-1$,

$$ x_{i+1}=\text{PRED}(x_i), $$

hence

$$ x_{i+1}\prec x_i. $$

Also,

$$ x_s=x_r=\text{PRED}(x_{s-1}), $$

so

$$ x_r\prec x_{s-1}. $$

Therefore

$$ x_r,\ x_{s-1},\ x_{s-2},\ \ldots,\ x_{r+1},\ x_r $$

is a directed loop in the original relation graph.

The extension of step T8 is therefore:

  1. Form the table PRED for all remaining vertices.
  2. Select any remaining vertex $x_0$.
  3. Follow predecessor links

$$ x_{i+1}\leftarrow \text{PRED}(x_i) $$

until a previously visited vertex is reached. 4. If the first repeated vertex is $x_r=x_s$, print

$$ x_r,\ x_{s-1},\ x_{s-2},\ldots,\ x_{r+1},\ x_r. $$ 5. Terminate with the loop report.

This procedure prints an actual directed cycle responsible for the failure of the topological sort.

Verification

The procedure must be shown both to terminate and to output a genuine loop.

Termination follows because the set $R$ is finite. The sequence

$$ x_0,x_1,x_2,\ldots $$

contains more than $|R|$ terms after $|R|+1$ iterations, so two terms must be equal. Hence a repeated vertex is eventually encountered.

To verify correctness, let $x_r=x_s$ be the first repetition. By construction,

$$ x_{i+1}\prec x_i $$

for every $r\le i<s-1$, and

$$ x_r=\text{PRED}(x_{s-1}), $$

which gives

$$ x_r\prec x_{s-1}. $$

Consequently the printed sequence

$$ x_r,\ x_{s-1},\ x_{s-2},\ldots,\ x_{r+1},\ x_r $$

follows directed relations at every step and returns to its starting point. Since $x_r,\ldots,x_{s-1}$ are distinct, this sequence is a directed cycle.

Thus whenever Algorithm T reaches step T8 with $N>0$, the extension above prints a loop and then terminates. This completes the proof.

Notes

The auxiliary table PRED need not contain all predecessors. Any single predecessor remaining in the graph is sufficient for each vertex. The algorithm is therefore a direct implementation of the existence proof given in the text.