TAOCP 1.2.10 Exercise 17

Let f(z)=\sum_{k\ge 0} a_k z^k, \qquad g(z)=\sum_{n\ge 0} b_n z^n be generating functions of probability distributions, so $a_k \ge 0$, $b_n \ge 0$, and

Section 1.2.10: Analysis of an Algorithm

Exercise 17. [M27] Let $f(z)$ and $g(z)$ be generating functions that represent probability distributions.

a) Show that $h(z)=g(f(z))$ is also a generating function representing a probability distribution.

b) Interpret the significance of $h(z)$ in terms of $f(z)$ and $g(z)$.

c) Give formulas for the mean and variance of $h$ in terms of those for $f$ and $g$.

Verified: yes
Solve time: 1m06s


Setup

Let

$$ f(z)=\sum_{k\ge 0} a_k z^k, \qquad g(z)=\sum_{n\ge 0} b_n z^n $$

be generating functions of probability distributions, so $a_k \ge 0$, $b_n \ge 0$, and

$$ f(1)=1, \qquad g(1)=1. $$

Define

$$ h(z)=g(f(z)). $$

Let $X$ be the random variable with distribution $(a_k)$ and let $N$ be the random variable with distribution $(b_n)$. Let $X_1,X_2,\ldots$ be independent copies of $X$, independent of $N$. Define

$$ S = X_1 + X_2 + \cdots + X_N. $$

The task is to show that $h(z)$ is a generating function of a probability distribution, interpret $h$, and express the mean and variance of $h$ in terms of those of $f$ and $g$.

Solution

(a) $h(z)$ is a generating function

Since $f(1)=1$ and $g(1)=1$, substitution yields

$$ h(1)=g(f(1))=g(1)=1. $$

Write

$$ g(w)=\sum_{n\ge 0} b_n w^n. $$

Then

$$ h(z)=\sum_{n\ge 0} b_n (f(z))^n. $$

Each coefficient of $(f(z))^n$ is a finite sum of products of the nonnegative numbers $a_k$, hence nonnegative. Therefore each coefficient of $h(z)$ is nonnegative.

Expanding $(f(z))^n$ corresponds to the distribution of a sum of $n$ independent copies of $X$. Hence $h(z)$ has a power-series expansion with nonnegative coefficients summing to $1$, so it represents a probability distribution.

(b) Interpretation of $h(z)$

From the expansion

$$ h(z)=\sum_{n\ge 0} b_n (f(z))^n, $$

the coefficient of $z^k$ in $(f(z))^n$ is the probability that

$$ X_1+\cdots+X_n = k. $$

Multiplying by $b_n$ weights this event by the probability that $N=n$. Hence the coefficient of $z^k$ in $h(z)$ is

$$ \Pr(S=k). $$

Thus $h(z)$ is the generating function of the compound random variable

$$ S=X_1+\cdots+X_N, $$

where $N$ is random and determines the number of summands.

(c) Mean and variance

Let

$$ \mu_f = f'(1), \qquad \mu_g = g'(1). $$

Let

$$ \sigma_f^2 = f''(1)+f'(1)-f'(1)^2, \qquad \sigma_g^2 = g''(1)+g'(1)-g'(1)^2. $$

Mean

Condition on $N$:

$$ \mathbb{E}[S \mid N=n] = n \mu_f. $$

Taking expectation over $N$,

$$ \mathbb{E}[S] = \mathbb{E}[N]\mu_f = \mu_g \mu_f. $$

Hence

$$ h'(1)=\mu_g \mu_f. $$

Variance

Using conditional variance,

$$ \mathrm{Var}(S)=\mathbb{E}[\mathrm{Var}(S\mid N)] + \mathrm{Var}(\mathbb{E}[S\mid N]). $$

For fixed $N=n$,

$$ \mathrm{Var}(S\mid N=n)=n\sigma_f^2. $$

Thus

$$ \mathbb{E}[\mathrm{Var}(S\mid N)] = \mathbb{E}[N]\sigma_f^2 = \mu_g \sigma_f^2. $$

Also,

$$ \mathbb{E}[S\mid N]=N\mu_f, $$

so

$$ \mathrm{Var}(\mathbb{E}[S\mid N])=\mathrm{Var}(N\mu_f)=\mu_f^2 \sigma_g^2. $$

Therefore

$$ \mathrm{Var}(S)=\mu_g \sigma_f^2 + \mu_f^2 \sigma_g^2. $$

In terms of derivatives,

$$ \mathrm{Var}(h)=\left(g'(1)\right)\left(f''(1)+f'(1)-f'(1)^2\right) +\left(f'(1)\right)^2\left(g''(1)+g'(1)-g'(1)^2\right). $$

Verification

The identity $h(1)=1$ follows directly from $f(1)=g(1)=1$ under substitution.

Nonnegativity of coefficients follows from closure of nonnegative coefficient series under multiplication and composition, since $(f(z))^n$ expands into finite sums of products of $a_k$.

The conditional expectation satisfies linearity because each $X_i$ has mean $f'(1)$ and independence ensures additivity over fixed $n$.

The variance decomposition

$$ \mathrm{Var}(S)=\mathbb{E}[\mathrm{Var}(S\mid N)] + \mathrm{Var}(\mathbb{E}[S\mid N]) $$

holds by expansion of $\mathbb{E}[(S-\mathbb{E}S)^2]$ and grouping terms by conditioning on $N$.

Notes

The construction identifies $h$ as the probability generating function of a compound distribution, where $g$ governs the number of summands and $f$ governs their common distribution. The mean and variance formulas match the general compound-sum identities:

$$ \mathbb{E}[S]=\mathbb{E}[N]\mathbb{E}[X], \qquad \mathrm{Var}(S)=\mathbb{E}[N]\mathrm{Var}(X)+\mathrm{Var}(N)(\mathbb{E}[X])^2. $$

This completes the proof. ∎