TAOCP 2.3.3 Exercise 16
Let the given forest be represented in postorder with degrees as in representation `(9)`.
Section 2.3.3: Other Representations of Trees
Exercise 16. [M24] Prove the validity of Algorithm F.
Verified: yes
Solve time: 1m36s
Solution
Let the given forest be represented in postorder with degrees as in representation (9). For each node $x$, let $\deg(x)$ denote its degree, and let $f(x)$ be the value to be computed. By hypothesis, $f(x)$ depends only on $x$ and on the values of $f$ at the children of $x$.
We prove that Algorithm $F$ is valid.
The proof is by induction on the number of nodes already processed in postorder.
Let
$$ P_1,P_2,\ldots,P_n $$
be the nodes of the forest in postorder order. For $k\ge1$, let $S_k$ denote the contents of the stack immediately before step $F2$ is executed with $P=P_k$.
We establish the following invariant.
Invariant. Immediately before processing $P_k$, the stack contains exactly the values
$$ f(r_t),\ldots,f(r_1), $$
from top to bottom, where $r_1,\ldots,r_t$ are precisely those roots of maximal processed subtrees whose nodes lie among
$$ P_1,\ldots,P_{k-1}, $$
but whose parents, if any, have not yet been processed. Furthermore, whenever $d=\deg(P_k)$, the top $d$ items of the stack are
$$ f(x_d),\ldots,f(x_1), $$
from top to bottom, where $x_1,\ldots,x_d$ are the children of $P_k$ from left to right.
The second part of the invariant is exactly the condition asserted in step $F2$.
For the basis, consider $k=1$. Since postorder visits every child before its parent, the first node $P_1$ must be terminal. Hence
$$ \deg(P_1)=0. $$
Step $F1$ initializes the stack to be empty. Therefore the top $d=0$ items of the stack are exactly the values corresponding to the children of $P_1$, since there are no such children. The invariant holds.
Assume the invariant holds before processing $P_k$.
Let
$$ d=\deg(P_k), $$
and let $x_1,\ldots,x_d$ be the children of $P_k$ from left to right.
Because the representation is postorder, every node in each subtree rooted at $x_i$ precedes $P_k$. Moreover, after the entire subtree rooted at $x_i$ has been processed, the value $f(x_i)$ remains on the stack until $P_k$ is processed, since no other node can use that value except an ancestor of $x_i$.
The subtrees rooted at
$$ x_1,\ldots,x_d $$
occur consecutively immediately before $P_k$ in postorder. Consequently the values
$$ f(x_1),\ldots,f(x_d) $$
occupy the top portion of the stack immediately before $P_k$ is processed, and their order on the stack is
$$ f(x_d),\ldots,f(x_1) $$
from top to bottom. Hence the information required by step $F2$ is available.
Step $F2$ therefore computes the correct value $f(P_k)$.
Step $F3$ removes the $d$ values
$$ f(x_d),\ldots,f(x_1) $$
and pushes $f(P_k)$.
This operation exactly replaces the completed child subtrees of $P_k$ by the completed subtree rooted at $P_k$. Therefore the invariant is preserved for the next node $P_{k+1}$, if $k<n$.
By induction, the invariant holds at every execution of step $F2$, and the value computed there is always the correct value of $f$ for the current node.
When $k=n$, the last node processed is the root of the final tree of the forest. Step $F4$ terminates. At this moment every tree has been reduced, through repeated applications of step $F3$, to a single stack entry equal to the value of $f$ at its root. If the forest consists of trees
$$ T_1,\ldots,T_m, $$
then the stack contains
$$ f(\operatorname{root}(T_m)),\ldots, f(\operatorname{root}(T_1)) $$
from top to bottom, exactly as stated in step $F4$.
Thus every node receives the correct value of $f$, and the algorithm terminates after processing each node once.
This completes the proof.
∎