TAOCP 1.2.10 Exercise 6

For a probability generating function G(z)=\sum_{k\ge0}p_k z^k, the third semi-invariant (third cumulant) is

Section 1.2.10: Analysis of an Algorithm

Exercise 6. [HM27] We've computed the mean and the variance of the important probability distributions (8), (18), (20). What is the third semi-invariant, $\kappa_3$, in each of those cases?

Verified: yes
Solve time: 2m03s


Setup

For a probability generating function

$$ G(z)=\sum_{k\ge0}p_k z^k, $$

the third semi-invariant (third cumulant) is

$$ \kappa_3 = \left.\frac{d^3}{dt^3}\log G(e^t)\right|_{t=0}. $$

Since

$$ K(t)=\log G(e^t), $$

we have

$$ K'(0)=\mu, \qquad K''(0)=\sigma^2, \qquad K'''(0)=\kappa_3. $$

For a product of generating functions,

$$ G(z)=G_1(z)G_2(z), $$

the cumulants add, because

$$ \log G(e^t) = \log G_1(e^t)+\log G_2(e^t). $$

Hence it suffices to compute $\kappa_3$ for the elementary factors appearing in (8) and (18).

For the distribution (20), we compute directly from its generating function.

Solution

Distribution (8)

Equation (8) gives

$$ G_n(z)=\prod_{k=2}^{n}Q_k(z), \qquad Q_k(z)=\frac{z+k-1}{k}. $$

Let

$$ F_k(t)=\log Q_k(e^t) = \log(e^t+k-1)-\log k. $$

Write

$$ a=k-1. $$

Then

$$ F_k(t)=\log(a+e^t)-\log k. $$

Differentiation yields

$$ F_k'(t)=\frac{e^t}{a+e^t}, $$

$$ F_k''(t)=\frac{ae^t}{(a+e^t)^2}, $$

$$ F_k'''(t) = \frac{ae^t(a-e^t)}{(a+e^t)^3}. $$

At $t=0$,

$$ F_k'''(0) = \frac{a(a-1)}{(a+1)^3} = \frac{(k-1)(k-2)}{k^3}. $$

Therefore

$$ \kappa_3(G_n) = \sum_{k=2}^{n} \frac{(k-1)(k-2)}{k^3}. $$

Since

$$ \frac{(k-1)(k-2)}{k^3} = \frac1k-\frac3{k^2}+\frac2{k^3}, $$

we obtain

$$ \kappa_3 = \sum_{k=2}^{n} \left( \frac1k-\frac3{k^2}+\frac2{k^3} \right). $$

Using harmonic numbers,

$$ H_n=\sum_{k=1}^{n}\frac1k, \qquad H_n^{(r)} = \sum_{k=1}^{n}\frac1{k^r}, $$

this becomes

$$ \kappa_3 = (H_n-1) -3(H_n^{(2)}-1) +2(H_n^{(3)}-1). $$

Hence

$$ \boxed{\kappa_3 = H_n-3H_n^{(2)}+2H_n^{(3)}.} $$

Distribution (18)

Equation (18) gives

$$ G_n(z)=(q+pz)^n. $$

Since cumulants add under products,

$$ \kappa_3(G_n) = n,\kappa_3(q+pz). $$

Let

$$ F(t)=\log(q+pe^t). $$

Differentiation gives

$$ F'(t) = \frac{pe^t}{q+pe^t}, $$

$$ F''(t) = \frac{pq,e^t}{(q+pe^t)^2}, $$

$$ F'''(t) = \frac{pq,e^t(q-pe^t)} {(q+pe^t)^3}. $$

At $t=0$,

$$ F'''(0) = pq(q-p). $$

Therefore

$$ \boxed{\kappa_3 = npq(q-p).} $$

Distribution (20)

Equation (20) corresponds to the discrete uniform distribution on

$$ {1,2,\ldots,n}. $$

Its generating function is

$$ G(z) = \frac1n \sum_{j=1}^{n}z^j. $$

The distribution is symmetric about

$$ \frac{n+1}{2}, $$

because the probabilities of $j$ and $n+1-j$ are equal.

Hence every odd central moment vanishes:

$$ \sum_{j=1}^{n} \left(j-\frac{n+1}{2}\right)^3 \frac1n = 0. $$

For the third semi-invariant,

$$ \kappa_3 = E!\left[(X-\mu)^3\right], $$

since the third cumulant equals the third central moment. Therefore

$$ \boxed{\kappa_3=0.} $$

Verification

For distribution (8), the elementary factor $Q_k(z)$ has

$$ \kappa_1=\frac1k, \qquad \kappa_2=\frac1k-\frac1{k^2}, \qquad \kappa_3=\frac1k-\frac3{k^2}+\frac2{k^3}. $$

Summing over $k=2,\ldots,n$ reproduces

$$ \kappa_1=H_n-1, \qquad \kappa_2=H_n-H_n^{(2)}, $$

which agrees with equation (16), confirming the calculation.

For distribution (18), when $p=q=\tfrac12$,

$$ \kappa_3=npq(q-p)=0, $$

as required by the symmetry of the binomial distribution.

For distribution (20), the symmetry

$$ j-\frac{n+1}{2} = -\left((n+1-j)-\frac{n+1}{2}\right) $$

pairs every term in the third central moment with its negative, yielding $0$.

Notes

The three third semi-invariants are

$$ \boxed{ \kappa_3 = H_n-3H_n^{(2)}+2H_n^{(3)} } \qquad\text{for (8),} $$

$$ \boxed{ \kappa_3 = npq(q-p) } \qquad\text{for (18),} $$

$$ \boxed{ \kappa_3 = 0 } \qquad\text{for (20).} $$