TAOCP 2.3.4.5 Exercise 10
Let $w_1,\ldots,w_m\ge 0$.
Exercise 10. [**] [M26] (D. Huffman.) Show how to construct a $t$-ary tree with minimum weighted path length, given nonnegative weights $w_1,\ldots,w_m$. Construct the optimal ternary tree corresponding to exercise 2.
Verified: yes
Solve time: 1m46s
Let $w_1,\ldots,w_m\ge 0$. A full $t$-ary Huffman tree exists only when the number of leaves can support repeated contraction by groups of $t$. This requires the standard feasibility condition
$$ m \equiv 1 \pmod{t-1}. $$
If this does not hold, adjoin $r$ dummy leaves of weight $0$, where $r\in{0,\ldots,t-2}$ is chosen so that
$$ m+r \equiv 1 \pmod{t-1}. $$
Adding zero-weight leaves does not change the weighted path length of any tree on the original weights, so it suffices to solve the augmented instance.
Greedy construction
Start with the multiset of weights (including added zeros if needed). Repeatedly perform:
- Select $t$ elements of smallest weight $a_1\le \cdots \le a_t$.
- Replace them by a single weight
$$ a = a_1+\cdots+a_t, $$
interpreted as the weight of a new internal node whose children are these $t$ nodes.
Continue until a single weight remains. The sequence of merges defines a full $t$-ary tree.
The weighted path length equals the sum of all internal node weights created during the process, since each merge increases the depth of each descendant by $1$, contributing exactly the merged weight once per unit depth.
Key structural lemma
Let $T$ be an optimal full $t$-ary tree for the augmented weights.
Lemma 1 (Deepest nodes are smallest weights).
If a leaf of weight $x$ is deeper than a leaf of weight $y$, then $x \le y$ in some optimal tree.
Proof. If $x>y$ and $x$ is deeper than $y$, swapping their positions strictly decreases the weighted path length, since the heavier weight is moved upward and the lighter weight downward. Repeating swaps yields an optimal tree where weights are nondecreasing with depth.
Lemma 2 (Sibling grouping at maximum depth).
There exists an optimal tree in which the $t$ smallest weights are leaves of maximum depth and share a common parent.
Proof idea. Among leaves of maximum depth, pick any $t$. By Lemma 1, these can be taken to be the $t$ smallest weights among all leaves at that depth. If they do not share a parent, then their parents can be locally rearranged: in a full $t$-ary tree, subtrees at the same depth can be exchanged without changing feasibility. One can restructure so that these $t$ leaves become siblings while preserving all leaf depths. This does not increase weighted path length and preserves optimality.
Optimal substructure and induction
We now show the greedy step is valid.
Take an optimal tree $T$. By Lemma 2, there is an optimal tree in which the $t$ smallest weights $a_1,\ldots,a_t$ are siblings at maximum depth with common parent $p$. Replace this entire group by a single leaf of weight
$$ a = a_1+\cdots+a_t. $$
Call the resulting reduced instance $I'$, which has $m' = m-(t-1)$ leaves.
The weighted path length of $T$ can be written as
$$ W(T)=W(T')+a_1+\cdots+a_t, $$
where $T'$ is the contracted tree. Any improvement in $T'$ would extend to an improvement in $T$, so $T'$ must be optimal for $I'$. Thus the problem reduces to a smaller instance of the same form.
By induction on $m$, repeatedly applying this reduction yields exactly the greedy algorithm: at each stage the $t$ smallest weights are merged.
Conclusion of optimality
The induction proves that the greedy construction always produces an optimal full $t$-ary tree for the augmented instance, hence also for the original weights (since zero-weight leaves do not affect optimality on original leaves).
Ternary case ($t=3$)
For $t=3$, first ensure $m\equiv 1 \pmod 2$ by adding a single dummy leaf if $m$ is even. Then repeatedly:
- choose the three smallest weights,
- replace them by their sum,
until one weight remains. Expanding the recorded merges yields the optimal ternary Huffman tree corresponding to the given weights (including those from exercise 2).