TAOCP 2.3.3 Exercise 17

Section 2.3.3: Other Representations of Trees

Exercise 17. [25] Algorithm F evaluates a "bottom-up" locally defined function, namely, one that should be evaluated at the children of a node before it is evaluated at the node. A "top-down" locally defined function f is one in which the value of f at a node x depends only on x and the value of f at the parent of x. Using an auxiliary stack, design an algorithm analogous to Algorithm F that evaluates a "top-down" function f at each node of a tree. (Like Algorithm F, your algorithm should work efficiently on trees that have been stored in postorder with degrees, as in (9).)

Verified: no
Solve time: 5m45s

Exercise 2.3.3.17 [25]

Let the tree (or forest) be stored in the postorder-with-degrees representation (9). Thus node records occur in postorder; for node $k$, let $d_k$ denote its degree.

Let $f$ be a top-down locally defined function. For a root $r$, the value $f(r)$ is prescribed. For every nonroot node $x$,

$$ f(x)=\Phi\bigl(x,f(\operatorname{parent}(x))\bigr), $$

where $\Phi$ is the given local rule.

The essential difficulty is that, in postorder, a node is encountered after all nodes in its subtree. Hence the value of a parent is not yet known when its children are scanned. We therefore imitate Algorithm $F$: During a postorder pass we construct enough information to recover the children of every node later, then we perform a top-down propagation using a stack.

The crucial point is that the child lists are constructed explicitly during the first pass, so that the second pass never rescans subtree intervals.

Data Structures

For each node $k$ we maintain:

$L_k$: left endpoint of the subtree rooted at $k$;
$\text{FIRST}_k$: first child of $k$;
$\text{NEXT}_k$: next sibling of $k$;
$F_k$: value of the function $f$ at $k$.

The fields $\text{FIRST}$ and $\text{NEXT}$ form the usual first-child/next-sibling representation.

An auxiliary stack $S$ contains roots of completed subtrees.

Algorithm T

Nodes are numbered

$$ 1,2,\ldots,n $$

according to their positions in the postorder sequence.

T1. [Initialize.]

Set $S\leftarrow\emptyset$.

For every node $k$, initialize

$$ \text{FIRST}_k\leftarrow \Lambda, \qquad \text{NEXT}_k\leftarrow \Lambda . $$

Process nodes $k=1,2,\ldots,n$ in postorder.

T2. [Leaf.]

If $d_k=0$, set

$$ L_k=k. $$

Push $k$ onto $S$.

Proceed to the next node.

T3. [Internal node.]

Suppose $d_k>0$.

Pop

$$ c_1,c_2,\ldots,c_{d_k} $$

from $S$.

Because of postorder storage, $c_1$ is the root of the rightmost child subtree of $k$, $c_2$ is the next child to the left, and so on; $c_{d_k}$ is the leftmost child.

Hence

$$ L_k=L_{c_{d_k}}. $$

Construct the child list:

$$ \text{FIRST}k \leftarrow c{d_k}, $$

and for

$$ i=d_k,d_k-1,\ldots,2, $$

set

$$ \text{NEXT}{c_i}\leftarrow c{i-1}. $$

Finally,

$$ \text{NEXT}_{c_1}\leftarrow \Lambda. $$

Push $k$ onto $S$.

Proceed to the next node.

T4. [End of first pass.]

After all nodes have been processed, the stack contains exactly the roots of the trees of the forest.

T5. [Initialize roots.]

Create a second stack $Q$.

For each root $r$ remaining on $S$:

assign the prescribed root value $F_r=f(r)$;
push $r$ onto $Q$.

T6. [Top-down propagation.]

While $Q\neq\emptyset$:

Pop a node $p$ from $Q$.
Set $c\leftarrow\text{FIRST}_p$.
While $c\neq\Lambda$:

compute

$$ F_c=\Phi(c,F_p); $$

push $c$ onto $Q$;
set

$$ c\leftarrow\text{NEXT}_c. $$

T7. [Terminate.]

When $Q$ becomes empty, all values $F_k$ have been computed.

Correctness

Lemma 1

After processing node $k$ in the first pass, the stack $S$ contains exactly the roots of maximal completed subtrees whose parents have not yet been encountered.

Proof

The proof is identical to the stack invariant of Algorithm $F$.

For a leaf, the singleton subtree rooted at that leaf is completed, so its root is pushed.

Suppose the invariant holds before node $k$ is processed and $d_k>0$. The top $d_k$ stack entries are precisely the roots of the child subtrees of $k$. Removing them and replacing them by $k$ removes the completed child subtrees and inserts the completed subtree rooted at $k$.

Hence the invariant is preserved. ∎

Lemma 2

When node $k$ is processed in step T3, the popped nodes

$$ c_1,c_2,\ldots,c_{d_k} $$

are exactly the roots of the children of $k$, ordered from right to left.

Proof

By Lemma 1, immediately before $k$ is processed the top stack entries are the roots of the maximal completed subtrees immediately preceding $k$ in the postorder sequence.

Those maximal completed subtrees are precisely the child subtrees of $k$.

Since the stack is last-in first-out, the rightmost child subtree is popped first, the leftmost child subtree last. Thus

$$ c_1,c_2,\ldots,c_{d_k} $$

are the child roots from right to left. ∎

Lemma 3

For every node $k$, the fields $\text{FIRST}$ and $\text{NEXT}$ constructed in T3 represent exactly the children of $k$ from left to right.

Proof

By Lemma 2, the children are

$$ c_{d_k},c_{d_k-1},\ldots,c_1 $$

from left to right.

Step T3 sets

$$ \text{FIRST}k=c{d_k}, $$

and links

$$ c_{d_k}\to c_{d_k-1}\to\cdots\to c_1. $$

The last child $c_1$ receives null successor.

Therefore traversing

$$ \text{FIRST}_k,\text{NEXT} $$

visits precisely the children of $k$ in left-to-right order. ∎

Lemma 4

Whenever step T6 assigns a value to a node $c$,

$$ F_c=f(c). $$

Proof

Induction on depth.

Roots receive their prescribed values in T5, so the statement holds at depth $0$.

Assume it holds for node $p$. Let $c$ be any child of $p$. Step T6 assigns

$$ F_c=\Phi(c,F_p). $$

By the induction hypothesis,

$$ F_p=f(p). $$

Hence

$$ F_c = \Phi(c,f(p)) = f(c), $$

by the definition of a top-down locally defined function.

Thus the statement holds for nodes one level deeper. ∎

Theorem

Algorithm T computes $f(k)$ for every node $k$ of the forest.

Proof

By Lemma 1, T5 identifies all roots.

By Lemma 3, every node can enumerate its children correctly.

Each nonroot node is reached exactly once from its unique parent. When reached, its value is assigned correctly by Lemma 4.

Therefore every node receives exactly one value, and that value equals $f(k)$.

Hence Algorithm T computes the required top-down function throughout the forest. ∎

Complexity

Consider the first pass.

Each node is pushed onto $S$ exactly once and popped exactly once. Every child link $\text{NEXT}$ is created exactly once. Therefore the total work of T1-T4 is $O(n)$.

Consider the second pass.

Each node is pushed onto $Q$ exactly once and popped exactly once. Each parent-child edge is traversed exactly once while following the $\text{FIRST}$/$\text{NEXT}$ lists. Therefore the total work of T5-T7 is also $O(n)$.

Hence the overall running time is

$$ O(n), $$

and the auxiliary storage is

$$ O(n). $$

This is directly analogous to Algorithm $F$: a single postorder scan builds the necessary stack structure, and a subsequent stack-driven traversal evaluates the top-down function at every node.