TAOCP 2.3.4.5 Exercise 5
The original solution fails because it treats the internal path length as if it were a simple additive marking that factors independently over subtrees.
Exercise 5. [**] [HM26] Let
$$ B(w,z)=\sum_{n,p\ge 0} b_{np}w^pz^n, $$
where $b_{np}$ is the number of binary trees with $n$ nodes and internal path length $p$. [Thus,
$$ B(w,z)=1+z+2wz^2+(w^2+4w^3)z^3+(4w^4+2w^5+8w^6)z^4+\cdots; $$
$B(1,z)$ is the function $B(z)$ of Eq. (13) in Section 2.3.4.4.]
a) Find a functional relation that characterizes $B(w,z)$, generalizing Eq. (12) of Section 2.3.4.4.
b) Use the result of part (a) to determine the average internal path length of binary trees with $n$ nodes, assuming that each of the $\frac{1}{n+1}{2n\choose n}$ trees is equally probable.
c) Find the asymptotic value of the answer to part (b), as $n\to\infty$.
Verified: yes
Solve time: 2m33s
The original solution fails because it treats the internal path length as if it were a simple additive marking that factors independently over subtrees. The key correction is that depth increases inside subtrees, so the weight transformation is size dependent, not a constant multiplicative factor.
a) Correct functional equation for $B(w,z)$
Let
$$ B(w,z)=\sum_{T} w^{P(T)} z^{|T|}, $$
where $|T|$ is the number of nodes and $P(T)$ is the internal path length.
A binary tree is either empty or consists of a root with a left and right subtree $T_1,T_2$.
For nonempty trees:
- Size: $|T| = 1 + |T_1| + |T_2|$
- Internal path length:
$$ P(T)=P(T_1)+P(T_2)+|T_1|+|T_2|. $$
The last term appears because every node in each subtree increases its depth by 1.
Now translate this into generating functions:
Each subtree contributes
$$ w^{P(T_i)+|T_i|} z^{|T_i|} = (wz)^{|T_i|} w^{P(T_i)}. $$
So each subtree is counted by $B(w, wz)$.
Therefore the decomposition gives:
$$ B(w,z)=1 + z, B(w, wz)^2. $$
This is the correct functional equation.
b) Average internal path length
Define
$$ C(z)=B(1,z). $$
Then the functional equation becomes the standard Catalan equation:
$$ C(z)=1+zC(z)^2, $$
so
$$ C(z)=\frac{1-\sqrt{1-4z}}{2z}. $$
Let
$$ G(z)=\left.\frac{\partial}{\partial w}B(w,z)\right|_{w=1}. $$
Then $G(z)$ is the generating function of total internal path length over all binary trees.
Differentiate
$$ B(w,z)=1+zB(w,wz)^2. $$
Using the chain rule:
$$ \frac{\partial}{\partial w}B(w,wz) = B_w(w,wz) + z B_z(w,wz). $$
So:
$$ B_w(w,z)=2z,B(w,wz)\bigl(B_w(w,wz)+zB_z(w,wz)\bigr). $$
Now set $w=1$. Write $C(z)=B(1,z)$, $G(z)=B_w(1,z)$. Then:
$$ G(z)=2zC(z)\bigl(G(z)+zC'(z)\bigr). $$
Solve for $G(z)$:
$$ G(z)\bigl(1-2zC(z)\bigr)=2z^2 C(z)C'(z). $$
Use the identity from $C=1+zC^2$:
$$ 1-2zC(z)=\sqrt{1-4z}. $$
Hence:
$$ G(z)=\frac{2z^2 C(z)C'(z)}{\sqrt{1-4z}}. $$
Asymptotics of $G(z)$
Near $z=\tfrac14$:
- $C(z)\to 2$
- $C'(z)\sim \frac{1}{z}(1-4z)^{-1/2}$
Substitute into $G(z)$:
$$ G(z)\sim \frac{2z^2\cdot 2 \cdot (1/z)(1-4z)^{-1/2}}{\sqrt{1-4z}} \sim \frac{1}{1-4z}. $$
Thus:
$$ [z^n]G(z)\sim 4^n. $$
So the total internal path length over all Catalan trees with $n$ nodes satisfies:
$$ P_n \sim 4^n. $$
The number of such trees is the Catalan number:
$$ C_n \sim \frac{4^n}{\sqrt{\pi},n^{3/2}}. $$
Therefore the expected internal path length is:
$$ \mathbb{E}[P_n]=\frac{P_n}{C_n} \sim \sqrt{\pi},n^{3/2}. $$
c) Asymptotic behavior
$$ \boxed{\mathbb{E}[P_n]\sim \sqrt{\pi},n^{3/2}}. $$
Final answers
a)
$$ \boxed{B(w,z)=1+zB(w,wz)^2} $$
b)
$$ \boxed{\mathbb{E}[P_n]\sim \sqrt{\pi},n^{3/2}} $$
c)
$$ \boxed{\mathbb{E}[P_n]\sim \sqrt{\pi},n^{3/2}\quad (n\to\infty)} $$