TAOCP 2.3.3 Exercise 16

Let the given forest be represented in postorder with degrees as in representation `(9)`.

Section 2.3.3: Other Representations of Trees

Exercise 16. [M24] Prove the validity of Algorithm F.

Verified: yes
Solve time: 1m36s


Solution

Let the given forest be represented in postorder with degrees as in representation (9). For each node $x$, let $\deg(x)$ denote its degree, and let $f(x)$ be the value to be computed. By hypothesis, $f(x)$ depends only on $x$ and on the values of $f$ at the children of $x$.

We prove that Algorithm $F$ is valid.

The proof is by induction on the number of nodes already processed in postorder.

Let

$$ P_1,P_2,\ldots,P_n $$

be the nodes of the forest in postorder order. For $k\ge1$, let $S_k$ denote the contents of the stack immediately before step $F2$ is executed with $P=P_k$.

We establish the following invariant.

Invariant. Immediately before processing $P_k$, the stack contains exactly the values

$$ f(r_t),\ldots,f(r_1), $$

from top to bottom, where $r_1,\ldots,r_t$ are precisely those roots of maximal processed subtrees whose nodes lie among

$$ P_1,\ldots,P_{k-1}, $$

but whose parents, if any, have not yet been processed. Furthermore, whenever $d=\deg(P_k)$, the top $d$ items of the stack are

$$ f(x_d),\ldots,f(x_1), $$

from top to bottom, where $x_1,\ldots,x_d$ are the children of $P_k$ from left to right.

The second part of the invariant is exactly the condition asserted in step $F2$.

For the basis, consider $k=1$. Since postorder visits every child before its parent, the first node $P_1$ must be terminal. Hence

$$ \deg(P_1)=0. $$

Step $F1$ initializes the stack to be empty. Therefore the top $d=0$ items of the stack are exactly the values corresponding to the children of $P_1$, since there are no such children. The invariant holds.

Assume the invariant holds before processing $P_k$.

Let

$$ d=\deg(P_k), $$

and let $x_1,\ldots,x_d$ be the children of $P_k$ from left to right.

Because the representation is postorder, every node in each subtree rooted at $x_i$ precedes $P_k$. Moreover, after the entire subtree rooted at $x_i$ has been processed, the value $f(x_i)$ remains on the stack until $P_k$ is processed, since no other node can use that value except an ancestor of $x_i$.

The subtrees rooted at

$$ x_1,\ldots,x_d $$

occur consecutively immediately before $P_k$ in postorder. Consequently the values

$$ f(x_1),\ldots,f(x_d) $$

occupy the top portion of the stack immediately before $P_k$ is processed, and their order on the stack is

$$ f(x_d),\ldots,f(x_1) $$

from top to bottom. Hence the information required by step $F2$ is available.

Step $F2$ therefore computes the correct value $f(P_k)$.

Step $F3$ removes the $d$ values

$$ f(x_d),\ldots,f(x_1) $$

and pushes $f(P_k)$.

This operation exactly replaces the completed child subtrees of $P_k$ by the completed subtree rooted at $P_k$. Therefore the invariant is preserved for the next node $P_{k+1}$, if $k<n$.

By induction, the invariant holds at every execution of step $F2$, and the value computed there is always the correct value of $f$ for the current node.

When $k=n$, the last node processed is the root of the final tree of the forest. Step $F4$ terminates. At this moment every tree has been reduced, through repeated applications of step $F3$, to a single stack entry equal to the value of $f$ at its root. If the forest consists of trees

$$ T_1,\ldots,T_m, $$

then the stack contains

$$ f(\operatorname{root}(T_m)),\ldots, f(\operatorname{root}(T_1)) $$

from top to bottom, exactly as stated in step $F4$.

Thus every node receives the correct value of $f$, and the algorithm terminates after processing each node once.

This completes the proof.