TAOCP 2.3.4.4 Exercise 15

Let the oriented tree be rooted, with every edge directed toward the root.

Section 2.3.4.4: Enumeration of Trees

Exercise 15. [**] [21] Discuss the relationships that exist (if any) between the topological sort algorithm of Section 2.2.3 and the canonical representation of an oriented tree.

Verified: yes
Solve time: 2m05s


Let the oriented tree be rooted, with every edge directed toward the root. If $v$ is not the root, let $f(v)$ denote its parent.

In such a tree, every nonroot vertex has outdegree $1$, and a leaf has indegree $0$. Therefore the vertices eligible for removal in the topological sort algorithm of §2.2.3 are exactly the leaves of the tree.

The canonical representation of §2.3.4 is constructed as follows. Repeatedly choose the smallest-numbered leaf $V_j$, record its parent $f(V_j)$, and delete $V_j$. This continues until only the root remains. Hence the sequence

$$ V_1,V_2,\ldots,V_{n-1} $$

is precisely the sequence of vertices deleted by repeatedly removing an indegree-$0$ vertex, with the tie-breaking rule that the smallest-numbered eligible vertex is chosen.

This is exactly the mechanism of topological sorting specialized to an oriented tree and equipped with the same tie-breaking rule. The correspondence is not a reversal of topological sorting. Rather, the canonical-representation procedure is a particular topological-sort process: at each step the vertices available to topological sort are the leaves, and the canonical representation chooses the smallest-numbered one.

Thus, if the topological-sort algorithm is applied to the oriented tree using the rule "choose the smallest-numbered vertex among those of indegree $0$," its output order is

$$ V_1,V_2,\ldots,V_{n-1},R, $$

where $R$ is the root. The first $n-1$ entries are exactly the deletion sequence used in the canonical representation.

The difference is that a topological ordering records only the sequence of removed vertices. The canonical representation records, for each deleted vertex $V_j$, the value $f(V_j)$. Equivalently, it records the pairs

$$ (V_j,f(V_j)), \qquad j=1,\ldots,n-1. $$

This additional parent information is what makes the representation one-to-one with oriented trees, since from these data the tree can be reconstructed uniquely. A topological ordering alone does not determine the tree.

Therefore the relationship is that the canonical representation is obtained by carrying out a topological-sort deletion process on the oriented tree, using a prescribed tie-breaking rule, and recording for each deleted vertex its parent as well as its position in the deletion order.