TAOCP 2.3.1 Exercise 37

Let a binary tree have $n$ nodes.

Section 2.3.1: Traversing Binary Trees

Exercise 37. [24] (D. Ferguson.) If two computer words are necessary to contain two link fields and an INFO field, representation (2) requires 2n words of memory for a tree with n nodes. Design a representation scheme for binary trees that uses less space, assuming that one link and an INFO field will fit in a single computer word.

Verified: yes
Solve time: 2m40s


Solution

Let a binary tree have $n$ nodes. Under representation (2), each node contains

$$ (\text{LLINK},\text{RLINK},\text{INFO}), $$

and the hypothesis of the exercise states that two words are needed to store these three fields. Hence the total storage is $2n$ words.

To obtain a more economical representation, we exploit the fact that a binary tree with $n$ nodes contains many null links.

A binary tree has $2n$ link fields altogether. Since every node except the root is reached by exactly one link, there are $n-1$ non-null links. Therefore the number of null links is

$$ 2n-(n-1)=n+1. $$

The idea is to store only the $n-1$ actual links explicitly.

Representation

Store the nodes consecutively in an array

$$ A[1],A[2],\ldots,A[n]. $$

Each array entry contains one word:

$$ (\text{INFO},\text{LINK}), $$

where the LINK field is used as follows.

For every non-root node $P$, let LINK$(P)$ point to its parent. Thus exactly $n-1$ links are stored, one for each non-root node.

The root is identified separately.

To determine whether a node is a left child or a right child of its parent, reserve one bit in the LINK field. Since the problem assumes that one link together with INFO fits in one word, a single tag bit can be incorporated into the link representation.

Thus each node occupies exactly one word.

Reconstruction of the tree

Given the parent links and the root, the entire binary-tree structure is recoverable.

For each node $Q$, examine all nodes $P$ whose parent link points to $Q$. Since the structure is a binary tree, there are at most two such nodes. The stored tag bit indicates whether $P$ is the left child or the right child of $Q$.

Hence the left and right child relationships can be reconstructed uniquely from the collection of parent links.

Storage analysis

Each of the $n$ nodes occupies one word. Therefore the total storage required is

$$ n \text{ words}. $$

This is strictly less than the $2n$ words required by representation (2).

The saving arises because only the $n-1$ non-null links are stored. The $n+1$ null links are represented implicitly.

Therefore a valid representation scheme is:

  • one word per node containing $\text{INFO}$ and a parent link;
  • one additional bit indicating whether the node is a left or right child;
  • a distinguished root node.

This representation requires only $n$ words for a tree with $n$ nodes, which is less than $2n$. ∎