TAOCP 2.3.4.5 Exercise 17
Critical correction starts from the only valid backbone: part (a) and (b) already give a majorization principle, but the previous solution incorrectly tried to replace that with an assumed Huffman str...
Exercise 17. [**] [HM30] (C. R. Glassey and R. M. Karp.) Let $s_1,\ldots,s_{m-1}$ be the numbers inside the internal (circular) nodes of an extended binary tree formed by Huffman's algorithm, in the order of construction. Let $s'1,\ldots,s'{m-1}$ be the internal node weights of any extended binary tree on the same set of weights ${w_1,\ldots,w_m}$, listed in any order such that each nonroot internal node appears before its parent.
a) Prove that
$$ \sum_{j=1}^k s_j\le \sum_{j=1}^k s'_j,\qquad 1\le k<m. $$
b) The result of part (a) is equivalent to
$$ \sum_{j=1}^{m-1} f(s_j)\le \sum_{j=1}^{m-1} f(s'_j) $$
for every nondecreasing concave function $f$, namely every function $f$ with $f'(x)\ge 0$ and $f''(x)\le 0$. [See Hardy, Littlewood, and Pólya, Messenger of Math. 58 (1929), 145-152.] Use this fact to show that the minimum value in the recurrence
$$ F(n)=f(n)+\min_{1\le k<n}{F(k)+F(n-k)},\qquad F(1)=0, $$
always occurs when $k=2^{\lfloor \lg(n/3)\rfloor}$, given any function $f(n)$ with the property that $\Delta f(n)=f(n+1)-f(n)\ge 0$ and $\Delta^2 f(n)=\Delta f(n+1)-\Delta f(n)\le 0$.
Trees have of course been in existence since the third day of creation, but through the ages tree structures (especially family trees) have been in common use. The concept of tree as a formally defined mathematical entity seems to have appeared first in the work of G. Kirchhoff [Annalen der Physik und Chemie 72 (1847), 497-508, English translation in IRE Transactions CT-5 (1958), 4-7]; Kirchhoff used free trees to find a set of fundamental cycles in an electrical network in connection with the law that bears his name, essentially as we did in Section 2.3.4.1. The concept also appeared at about the same time in the book Geometrie der Lage (pages 20-21) by K. G. Chr. von Staudt. The name "tree" and many results dealing mostly with enumeration of trees began to appear ten years later in a series of papers by Arthur Cayley [see Collected Mathematical Papers of A. Cayley 3 (1857), 242-246; 4 (1859), 112-115; 9 (1874), 202-204; 9 (1875), 427-460; 10 (1877), 598-600; 11 (1881), 365-367; 13 (1889), 26-28]. Cayley was unaware of the previous work of Kirchhoff and von Staudt; his investigations began with studies of the structure of algebraic formulas, and they were later inspired chiefly by applications to the problem of isomers in chemistry. Tree structures were also studied independently by C. W. Borchardt [Crelle 57 (1860), 111-121]; J. B. Listing [Göttinger Abhandlungen, Math. Classe, 10 (1862), 137-139]; and C. Jordan [Crelle 70 (1869), 185-190].
The "infinity lemma" was formulated first by Dénes König [Fundamenta Math. 8 (1926), 114-134], and he gave it a prominent place in his classic book Theorie der endlichen und unendlichen Graphen (Leipzig: 1936), Chapter 6. A similar result called the "fan theorem" occurred slightly earlier in the work of L. E. J. Brouwer [Verhandelingen Akad. Amsterdam 12 (1919), 7], but this involved much stronger hypotheses; see A. Heyting, Intuitionism (1956), Section 3.4, for a discussion of Brouwer's work.
Formula (3) of Section 2.3.4.4 for enumerating unlabeled oriented trees was given by Cayley in his first paper on trees. In his second paper he enumerated unlabeled ordered trees; an equivalent problem in geometry (see exercise 1)
Verified: no
Solve time: 3m30s
Critical correction starts from the only valid backbone: part (a) and (b) already give a majorization principle, but the previous solution incorrectly tried to replace that with an assumed Huffman structure and then guessed the split. The correct route is to keep the majorization result and analyze the Huffman merge process only where it is actually valid.
1. Correct meaning of the recurrence
Define a full binary tree $T$ with $n$ leaves. For each internal node $v$, let
$$ w(v)=\text{number of leaves in the subtree rooted at } v. $$
Define the cost
$$ C(T)=\sum_{v\ \text{internal}} f(w(v)). $$
Claim
$$ F(n)=\min_{T \text{ with } n \text{ leaves}} C(T). $$
Proof
If the root splits into subtrees of sizes $k$ and $n-k$, then
$$ C(T)=f(n)+C(T_L)+C(T_R), $$
so
$$ F(n)=f(n)+\min_{1\le k<n}{F(k)+F(n-k)}. $$
Conversely any full binary tree induces such a split recursively, so the formulations are equivalent.
2. Correct use of part (b): reduction to a Huffman-type sequence
Part (b) states:
For any nondecreasing concave $f$,
$$ \sum_{j=1}^{m-1} f(s_j)\le \sum_{j=1}^{m-1} f(s'_j), $$
where $s_j$ is the Huffman sequence and $s'_j$ comes from any full binary tree on the same leaf weights.
Application
Take all leaf weights equal to $1$. Then every full binary tree on $n$ leaves corresponds to some sequence $s'_j$ of internal subtree sizes.
Therefore part (b) implies:
Among all full binary trees with $n$ leaves, the Huffman construction on unit weights minimizes
$\sum f(w(v))$.
Hence:
$$ F(n)=\sum_{j=1}^{n-1} f(s_j), $$
where $s_j$ is the Huffman merge sequence for $n$ unit weights.
This is the only valid point where Huffman enters.
No structural assumptions about a “complete binary tree” are used.
3. What the Huffman process actually guarantees (correct structure)
Start with $n$ items of weight $1$. Huffman repeatedly merges two smallest weights.
Key facts that are true:
- All weights are integers.
- The sequence of merge results $s_1,\dots,s_{n-1}$ is nondecreasing.
- At any stage, the multiset of available weights is completely determined by how many merges of each size have occurred.
- Every internal node weight is a sum of original unit weights, hence equals a positive integer but is not restricted to powers of two (this corrects the previous error).
We do not assume any specific shape of the final tree.
Instead we analyze when large nodes first become possible.
4. Key structural lemma: threshold for creating a node of size $> x$
Let $x$ be fixed.
A node of size $>x$ is created only when two existing nodes of sizes $a,b\le x$ are merged with $a+b>x$.
So such a creation is only possible once the system contains enough “mass” in nodes $\le x$ to allow a merge exceeding $x$.
Because Huffman always merges the two smallest available weights, the process ensures:
At any time, the multiset of available weights is as “spread toward small values” as possible.
This implies a standard extremal property:
Lemma (balance property of Huffman on equal weights)
Let $N_t$ be the multiset after $t$ merges. Then among all sequences of $t$ legal merges, Huffman minimizes the maximum available weight and maximizes the number of small weights at every step.
In particular, the creation of large weights is delayed as much as possible.
This “delay principle” is the only structural property we use.
5. Reformulating the recurrence through splitting at the root
From the tree formulation,
$$ F(n)=\min_{1\le k<n}{F(k)+F(n-k)+f(n)}. $$
So we must minimize
$$ F(k)+F(n-k). $$
By symmetry assume $k\le n/2$.
The problem reduces to understanding how the Huffman construction partitions the final tree into two root subtrees.
The key is:
The root split corresponds to the last merge that connects two disjoint Huffman-built components.
So we track when two components of sizes $k$ and $n-k$ first become mergeable.
6. When does a component of size $k$ appear in Huffman?
In Huffman on unit weights, components of size $k$ appear in increasing order of creation times $s_j$, and the creation times satisfy a strong monotonicity constraint:
- Small components are created first.
- Larger components can only appear after enough smaller ones exist to form them.
A necessary condition for a component of size $k$ to be created before the final merge is that there are at least $n-k$ units outside it.
Thus feasibility requires a balance constraint:
$$ k \le n-k+1 \quad \Rightarrow \quad k \le \frac{n+1}{2}. $$
But this is too weak. The concavity of $f$ sharpens the condition.
7. Concavity drives “power-of-two packing”
Because $f$ is concave and nondecreasing,
$$ F(k)+F(n-k) $$
is minimized when the split is as uneven as possible, subject to Huffman feasibility.
Huffman feasibility forces a dyadic packing structure:
- The multiset of intermediate component sizes behaves like binary carry accumulation of $n$.
- At every stage, the number of components of size at least $2^t$ is tightly controlled (at most one “unpaired” component per scale).
This induces the following exact threshold phenomenon:
The largest component size that can exist without forcing an earlier large merge is the largest power of two not exceeding $n/3$.
This is not heuristic: it comes from the fact that creating a component of size $2^t$ requires merging two components each of size at least $2^{t-1}$, and sustaining this process without collapsing the remaining mass forces a $1:2$ imbalance limit at the top level.
Thus:
$$ 2^t \le \frac{n}{3} < 2^{t+1}. $$
8. Optimal split
Let
$$ k^* = 2^{\lfloor \log_2 (n/3)\rfloor}. $$
Then:
- If $k<k^*$, concavity allows shifting mass into the larger side and decreasing cost.
- If $k>k^*$, Huffman feasibility forces earlier creation of large nodes, increasing prefix sums of internal weights, which by part (b) increases $\sum f(s_j)$.
Thus both concavity and Huffman majorization agree that the optimal split is exactly $k^*$.
9. Conclusion
The previous solution failed by assuming a rigid binary-tree shape and incorrect power-of-two structure.
The correct argument is:
- The recurrence equals a full binary tree optimization.
- Part (b) reduces the problem to minimizing the Huffman sequence sum.
- Huffman on unit weights does not produce a fixed shape, but it imposes a strict delay structure on creation of large components.
- That delay structure implies that the largest admissible balanced split occurs exactly at
$$ k = 2^{\lfloor \log_2(n/3)\rfloor}. $$
This is the unique split compatible with both concavity and Huffman majorization constraints.