TAOCP 1.2.10 Exercise 13

Section 1.2.10: Analysis of an Algorithm

Exercise 13. [HM38] A sequence of probability generating functions $G_n(z)$ with means $\mu_n$ and deviations $\sigma_n$ is said to approach a normal distribution if

$$ \lim_{n\to\infty} e^{-it\mu_n/\sigma_n}G_n(e^{it/\sigma_n}) = e^{-t^2/2} $$

for all real values of $t$. Using $G_n(z)$ as given by Eq. (8), show that $G_n(z)$ approaches a normal distribution.

Verified: yes
Solve time: 7m56s

Setup

Let

$$ G_n(z)=\frac1{n!}(z+1)(z+2)\cdots(z+n-1), $$

as given by Eq. (8). Let

$$ \mu_n=\operatorname{mean}(G_n)=H_n-1, \qquad \sigma_n^2=\operatorname{var}(G_n)=H_n-H_n^{(2)}, $$

by Eq. (16).

The exercise asks us to prove that

$$ \lim_{n\to\infty} e^{-it\mu_n/\sigma_n} G_n!\left(e^{it/\sigma_n}\right) = e^{-t^2/2} \tag{1} $$

for every real $t$.

Since

$$ G_n(z)=\prod_{k=2}^{n}\frac{z+k-1}{k}, $$

it is enough to analyze the logarithm of the left-hand side of (1).

Solution

Define

$$ u=\frac{it}{\sigma_n}, \qquad z=e^u. $$

Then

$$ e^{-it\mu_n/\sigma_n}G_n(e^u) = \exp!\Bigl( -\mu_nu+\log G_n(e^u) \Bigr), $$

so we study

$$ L_n = -\mu_nu+\log G_n(e^u). \tag{2} $$

Using Eq. (8),

$$ L_n = -\mu_nu + \sum_{k=2}^{n} \log!\left(1+\frac{e^u-1}{k}\right). \tag{3} $$

Write

$$ \delta=e^u-1. $$

Since $\sigma_n^2=H_n-H_n^{(2)}$ and $H_n\to\infty$,

$$ \sigma_n\to\infty, $$

hence $u\to0$ and

$$ \delta = u+\frac{u^2}{2}+O(u^3). \tag{4} $$

Also,

$$ \log(1+x) = x-\frac{x^2}{2}+O(x^3), \qquad x\to0. \tag{5} $$

Substituting $x=\delta/k$ into (5),

$$ \log!\left(1+\frac{\delta}{k}\right) = \frac{\delta}{k} -\frac{\delta^2}{2k^2} + O!\left(\frac{|\delta|^3}{k^3}\right). \tag{6} $$

Summing (6) from $k=2$ to $n$ gives

$$ \log G_n(e^u) = \delta\sum_{k=2}^{n}\frac1k -\frac{\delta^2}{2} \sum_{k=2}^{n}\frac1{k^2} + R_n, \tag{7} $$

where

$$ |R_n| \le C|\delta|^3 \sum_{k=2}^{\infty}\frac1{k^3} = O(|u|^3). \tag{8} $$

Because $u=it/\sigma_n$ and $\sigma_n\to\infty$,

$$ R_n=o(1). \tag{9} $$

Now

$$ \sum_{k=2}^{n}\frac1k = H_n-1 = \mu_n, \tag{10} $$

hence (7) and (3) yield

$$ L_n = \mu_n(\delta-u) -\frac{\delta^2}{2} \sum_{k=2}^{n}\frac1{k^2} + o(1). \tag{11} $$

From (4),

$$ \delta-u = \frac{u^2}{2}+O(u^3), $$

therefore

$$ \mu_n(\delta-u) = \frac{\mu_nu^2}{2} + o(1). \tag{12} $$

Since

$$ u^2=-\frac{t^2}{\sigma_n^2}, $$

and

$$ \frac{\mu_n}{\sigma_n^2} = \frac{H_n-1}{H_n-H_n^{(2)}} \longrightarrow 1, \tag{13} $$

equation (12) becomes

$$ \mu_n(\delta-u) = -\frac{t^2}{2} \frac{\mu_n}{\sigma_n^2} + o(1) = -\frac{t^2}{2}+o(1). \tag{14} $$

Next,

$$ \delta^2=u^2+O(u^3), $$

$$ -\frac{\delta^2}{2} \sum_{k=2}^{n}\frac1{k^2} = -\frac{u^2}{2} \sum_{k=2}^{n}\frac1{k^2} + o(1). \tag{15} $$

Since

$$ \sum_{k=2}^{\infty}\frac1{k^2}<\infty, $$

and

$$ u^2=-\frac{t^2}{\sigma_n^2}, $$

while $\sigma_n^2\to\infty$, the right-hand side of (15) tends to $0$. Thus

$$ -\frac{\delta^2}{2} \sum_{k=2}^{n}\frac1{k^2} = o(1). \tag{16} $$

Combining (11), (14), and (16),

$$ L_n = -\frac{t^2}{2}+o(1). \tag{17} $$

Exponentiating,

$$ e^{-it\mu_n/\sigma_n} G_n!\left(e^{it/\sigma_n}\right) = e^{L_n} \longrightarrow e^{-t^2/2}. $$

This is precisely the condition defining convergence to a normal distribution. Therefore the probability generating functions $G_n(z)$ approach a normal distribution.

This completes the proof.

∎

Verification

Starting from

$$ \log G_n(e^u) = \sum_{k=2}^{n} \log!\left(1+\frac{e^u-1}{k}\right), $$

the coefficient of $u$ equals

$$ \sum_{k=2}^{n}\frac1k = \mu_n. $$

The factor $e^{-it\mu_n/\sigma_n}=e^{-\mu_nu}$ removes this linear term.

The coefficient of $u^2$ is

$$ \frac12 \left( \sum_{k=2}^{n}\frac1k - \sum_{k=2}^{n}\frac1{k^2} \right) = \frac{\sigma_n^2}{2}, $$

because

$$ \sigma_n^2 = H_n-H_n^{(2)} = \sum_{k=2}^{n} \left(\frac1k-\frac1{k^2}\right). $$

After substituting $u=it/\sigma_n$,

$$ \frac{\sigma_n^2u^2}{2} = -\frac{t^2}{2}. $$

All higher-order terms contain powers $u^m$ with $m\ge3$ and therefore vanish as $\sigma_n\to\infty$. The limiting exponent is thus $-t^2/2$, confirming the result independently.

Notes

The proof is an instance of the general method of cumulants. By Exercise 12, the logarithm of a probability generating function encodes the semi-invariants. For

$$ G_n(z)=\prod_{k=2}^{n}\frac{z+k-1}{k}, $$

the first semi-invariant is $\mu_n$, the second is $\sigma_n^2$, and every higher normalized semi-invariant tends to $0$ because $\sigma_n^2\sim H_n\to\infty$ while the higher cumulants remain bounded by convergent series $\sum k^{-r}$ for $r\ge2$. Hence the standardized distribution tends to the normal law.