TAOCP 1.2.10 Exercise 17
Let f(z)=\sum_{k\ge 0} a_k z^k, \qquad g(z)=\sum_{n\ge 0} b_n z^n be generating functions of probability distributions, so $a_k \ge 0$, $b_n \ge 0$, and
Section 1.2.10: Analysis of an Algorithm
Exercise 17. [M27] Let $f(z)$ and $g(z)$ be generating functions that represent probability distributions.
a) Show that $h(z)=g(f(z))$ is also a generating function representing a probability distribution.
b) Interpret the significance of $h(z)$ in terms of $f(z)$ and $g(z)$.
c) Give formulas for the mean and variance of $h$ in terms of those for $f$ and $g$.
Verified: yes
Solve time: 1m06s
Setup
Let
$$ f(z)=\sum_{k\ge 0} a_k z^k, \qquad g(z)=\sum_{n\ge 0} b_n z^n $$
be generating functions of probability distributions, so $a_k \ge 0$, $b_n \ge 0$, and
$$ f(1)=1, \qquad g(1)=1. $$
Define
$$ h(z)=g(f(z)). $$
Let $X$ be the random variable with distribution $(a_k)$ and let $N$ be the random variable with distribution $(b_n)$. Let $X_1,X_2,\ldots$ be independent copies of $X$, independent of $N$. Define
$$ S = X_1 + X_2 + \cdots + X_N. $$
The task is to show that $h(z)$ is a generating function of a probability distribution, interpret $h$, and express the mean and variance of $h$ in terms of those of $f$ and $g$.
Solution
(a) $h(z)$ is a generating function
Since $f(1)=1$ and $g(1)=1$, substitution yields
$$ h(1)=g(f(1))=g(1)=1. $$
Write
$$ g(w)=\sum_{n\ge 0} b_n w^n. $$
Then
$$ h(z)=\sum_{n\ge 0} b_n (f(z))^n. $$
Each coefficient of $(f(z))^n$ is a finite sum of products of the nonnegative numbers $a_k$, hence nonnegative. Therefore each coefficient of $h(z)$ is nonnegative.
Expanding $(f(z))^n$ corresponds to the distribution of a sum of $n$ independent copies of $X$. Hence $h(z)$ has a power-series expansion with nonnegative coefficients summing to $1$, so it represents a probability distribution.
(b) Interpretation of $h(z)$
From the expansion
$$ h(z)=\sum_{n\ge 0} b_n (f(z))^n, $$
the coefficient of $z^k$ in $(f(z))^n$ is the probability that
$$ X_1+\cdots+X_n = k. $$
Multiplying by $b_n$ weights this event by the probability that $N=n$. Hence the coefficient of $z^k$ in $h(z)$ is
$$ \Pr(S=k). $$
Thus $h(z)$ is the generating function of the compound random variable
$$ S=X_1+\cdots+X_N, $$
where $N$ is random and determines the number of summands.
(c) Mean and variance
Let
$$ \mu_f = f'(1), \qquad \mu_g = g'(1). $$
Let
$$ \sigma_f^2 = f''(1)+f'(1)-f'(1)^2, \qquad \sigma_g^2 = g''(1)+g'(1)-g'(1)^2. $$
Mean
Condition on $N$:
$$ \mathbb{E}[S \mid N=n] = n \mu_f. $$
Taking expectation over $N$,
$$ \mathbb{E}[S] = \mathbb{E}[N]\mu_f = \mu_g \mu_f. $$
Hence
$$ h'(1)=\mu_g \mu_f. $$
Variance
Using conditional variance,
$$ \mathrm{Var}(S)=\mathbb{E}[\mathrm{Var}(S\mid N)] + \mathrm{Var}(\mathbb{E}[S\mid N]). $$
For fixed $N=n$,
$$ \mathrm{Var}(S\mid N=n)=n\sigma_f^2. $$
Thus
$$ \mathbb{E}[\mathrm{Var}(S\mid N)] = \mathbb{E}[N]\sigma_f^2 = \mu_g \sigma_f^2. $$
Also,
$$ \mathbb{E}[S\mid N]=N\mu_f, $$
so
$$ \mathrm{Var}(\mathbb{E}[S\mid N])=\mathrm{Var}(N\mu_f)=\mu_f^2 \sigma_g^2. $$
Therefore
$$ \mathrm{Var}(S)=\mu_g \sigma_f^2 + \mu_f^2 \sigma_g^2. $$
In terms of derivatives,
$$ \mathrm{Var}(h)=\left(g'(1)\right)\left(f''(1)+f'(1)-f'(1)^2\right) +\left(f'(1)\right)^2\left(g''(1)+g'(1)-g'(1)^2\right). $$
Verification
The identity $h(1)=1$ follows directly from $f(1)=g(1)=1$ under substitution.
Nonnegativity of coefficients follows from closure of nonnegative coefficient series under multiplication and composition, since $(f(z))^n$ expands into finite sums of products of $a_k$.
The conditional expectation satisfies linearity because each $X_i$ has mean $f'(1)$ and independence ensures additivity over fixed $n$.
The variance decomposition
$$ \mathrm{Var}(S)=\mathbb{E}[\mathrm{Var}(S\mid N)] + \mathrm{Var}(\mathbb{E}[S\mid N]) $$
holds by expansion of $\mathbb{E}[(S-\mathbb{E}S)^2]$ and grouping terms by conditioning on $N$.
Notes
The construction identifies $h$ as the probability generating function of a compound distribution, where $g$ governs the number of summands and $f$ governs their common distribution. The mean and variance formulas match the general compound-sum identities:
$$ \mathbb{E}[S]=\mathbb{E}[N]\mathbb{E}[X], \qquad \mathrm{Var}(S)=\mathbb{E}[N]\mathrm{Var}(X)+\mathrm{Var}(N)(\mathbb{E}[X])^2. $$
This completes the proof. ∎