TAOCP 2.3.3 Exercise 4

Let the original forest contain $n$ nodes, of which $m$ are terminal.

Section 2.3.3: Other Representations of Trees

Exercise 4. [18] The trees (2) contain 10 nodes, five of which are terminal. Representation of these trees in the normal binary-tree fashion involves 10 LLINK fields and 10 RLINK fields (one for each node). Representation of these trees in the form (10), where LLINK and INFO share the same space in a node, requires 5 LLINKs and 15 RLINKs. There are 10 INFO fields in each case.

Given a forest with n nodes, m of which are terminal, compare the total number of LLINKs and RLINKs that must be stored using these two methods of tree representation.

Verified: yes
Solve time: 1m20s


Solution

Let the original forest contain $n$ nodes, of which $m$ are terminal.

In the normal binary-tree representation, every node contains both an LLINK field and an RLINK field. Therefore the total numbers of links stored are

$$ \text{LLINKs}=n,\qquad \text{RLINKs}=n. $$

Hence the total number of link fields is

$$ 2n. $$

Now consider representation $(10)$. Every terminal node of the original forest remains a terminal node and contains INFO; every nonterminal node loses its INFO and acquires a new terminal child containing that information.

Since there are $n-m$ nonterminal nodes, exactly $n-m$ new terminal nodes are added. Therefore the total number of nodes in representation $(10)$ is

$$ n+(n-m)=2n-m. $$

The only nodes that contain LLINK fields are the nonterminal nodes. After the transformation, the original terminal nodes remain terminal, and each of the $n-m$ added information nodes is also terminal. Thus the number of terminal nodes becomes

$$ m+(n-m)=n. $$

Consequently the number of nonterminal nodes is

$$ (2n-m)-n=n-m. $$

Each nonterminal node contains one LLINK, so representation $(10)$ requires

$$ \text{LLINKs}=n-m. $$

Every node, terminal or nonterminal, contains an RLINK field. Since there are $2n-m$ nodes altogether,

$$ \text{RLINKs}=2n-m. $$

Hence the total number of link fields stored is

$$ (n-m)+(2n-m)=3n-2m. $$

Therefore the comparison is

$$ \begin{array}{c|cc} & \text{Normal binary-tree} & (10)\ \hline \text{LLINKs} & n & n-m\ \text{RLINKs} & n & 2n-m\ \text{Total links} & 2n & 3n-2m \end{array} $$

The second representation uses

$$ (3n-2m)-2n=n-2m $$

more link fields than the ordinary representation when $n>2m$, fewer link fields when $n<2m$, and the same number when $n=2m$.

Thus the numbers of stored links are

$$ \boxed{\text{Normal: }(n,n),\qquad (10):,(n-m,,2n-m).} $$