TAOCP 2.2.3 Exercise 20

Section 2.2.3: Linked Allocation

Exercise 20. [24] Algorithm T uses F, R, and the QLINK table to obtain the effect of a queue that contains those nodes whose COUNT field has become zero but whose successor relations have not yet been removed. Could a stack be used for this purpose instead of a queue? If so, compare the resulting algorithm with Algorithm T.

Verified: yes
Solve time: 40s

Solution

Yes.

Algorithm T is Kahn's topological sorting procedure. The queue contains precisely those nodes whose COUNT field has become zero and whose outgoing relations have not yet been processed. The essential property required of this data structure is that it support the following operations:

Insert a node when its COUNT becomes zero.
Remove some node whose COUNT is zero.
Process that node exactly once.

The correctness proof of Algorithm T does not depend on the order in which zero-count nodes are selected. At any stage, every node in the queue has indegree zero in the remaining graph. When a node is removed and its successor relations are deleted, the only effect is to reduce the COUNT values of its successors. Therefore any node whose COUNT becomes zero may be processed next. The argument establishing correctness of Algorithm T remains valid if the set of available zero-count nodes is maintained as a stack rather than as a queue.

To obtain a stack version, replace the queue variables F, R, and the table QLINK by a single stack pointer T. Initially, whenever Algorithm T would insert a node into the rear of the queue, it instead pushes the node onto the stack:

$$ \text{LINK}(P)\leftarrow T,\qquad T\leftarrow P. $$

Whenever Algorithm T would remove the front element of the queue, it instead pops the top element of the stack:

$$ P\leftarrow T,\qquad T\leftarrow \text{LINK}(P). $$

The resulting algorithm still outputs a topological ordering. Different topological orderings may be produced, because among several simultaneously available zero-count nodes the stack selects the most recently inserted one, whereas the queue selects the least recently inserted one.

The stack version is slightly simpler. A linked queue requires two pointer variables, F and R, together with the link field used to chain the queue elements. A linked stack requires only one pointer variable, T. Push and pop operations are also simpler than queue insertion and deletion, since no special treatment of the rear of the queue is needed.

The output order differs. The queue tends to process zero-count nodes in the order in which they become available, while the stack processes the most recently available node first. Both orders satisfy the requirements of a topological sort.

Hence a stack can be used in place of the queue, yielding a correct topological-sorting algorithm that is slightly simpler in storage requirements and implementation, but which generally produces a different topological ordering.

$$ \boxed{\text{A stack may replace the queue; correctness is unchanged, but the resulting topological order is generally different.}} $$