TAOCP 3.3.1 Exercise 24
Section 3.3.1: General Test Procedures for Studying Random Data
Exercise 24. ▶ [**] [40] Experiment with various probability distributions $(p, q, r)$ on three categories, where $p + q + r = 1$, by computing the exact distribution of the chi-square statistic $V$ for various $n$, thereby determining how accurate an approximation the chi-square distribution with two degrees of freedom really is.
Verified: yes
Solve time: 2m29s
For fixed $(p,q,r)$ with $p+q+r=1$, the exact distribution of $V$ is obtained by enumerating all triples $(Y_1,Y_2,Y_3)$ of nonnegative integers satisfying $Y_1+Y_2+Y_3=n$; there are $\binom{n+2}{2}$ of them. Each triple $(a,b,c)$ occurs with multinomial probability
$$ \Pr\{Y_1=a,\,Y_2=b,\,Y_3=c\} =\frac{n!}{a!\,b!\,c!}\,p^a q^b r^c, $$
and yields the chi-square statistic
$$ V=\frac{(a-np)^2}{np} +\frac{(b-nq)^2}{nq} +\frac{(c-nr)^2}{nr}. $$
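The two formulas above can be sketched in Python as follows. This is only an illustrative sketch: the function names `triples`, `triple_probability`, and `chi_square_statistic` are hypothetical, and ordinary floating-point arithmetic is assumed to be accurate enough for moderate $n$.

```python
from math import comb


def triples(n):
    """Yield every (a, b, c) of nonnegative integers with a + b + c = n."""
    for a in range(n + 1):
        for b in range(n - a + 1):
            yield a, b, n - a - b


def triple_probability(a, b, c, p, q, r):
    """Multinomial probability of observing the counts (a, b, c)."""
    n = a + b + c
    # n! / (a! b! c!) equals C(n, a) * C(n - a, b), computed exactly as an integer.
    return comb(n, a) * comb(n - a, b) * p**a * q**b * r**c


def chi_square_statistic(a, b, c, p, q, r):
    """Chi-square statistic V for observed counts (a, b, c)."""
    n = a + b + c
    return ((a - n * p) ** 2 / (n * p)
            + (b - n * q) ** 2 / (n * q)
            + (c - n * r) ** 2 / (n * r))


# Sanity check: the probabilities of all triples should sum to 1
# (up to rounding error).
total = sum(triple_probability(a, b, c, 0.5, 1 / 3, 1 / 6)
            for a, b, c in triples(10))
```

Writing the multinomial coefficient as a product of two binomial coefficients keeps the integer part exact, so rounding enters only through the powers of $p$, $q$, and $r$.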
By collecting equal values of $V$ and summing their probabilities, one obtains the exact distribution function and exact percentage points. Repeating this computation for several choices such as $(\tfrac13,\tfrac13,\tfrac13)$, $(\tfrac12,\tfrac13,\tfrac16)$, and highly unbalanced distributions such as $(0.8,0.15,0.05)$, with increasing values of $n$, permits direct comparison with the chi-square distribution having $\nu=2$ degrees of freedom.
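For this comparison it helps that the limiting distribution has a closed form when $\nu=2$. The following is a standard fact about the chi-square distribution, not something stated in the exercise; $x_t$ is notation introduced here for the point exceeded with probability $t$:

$$ \Pr\{V\ge x\}\approx e^{-x/2}, \qquad x_t=-2\ln t. $$

For example, the 5% point is $-2\ln 0.05\approx 5.991$ and the 1% point is $-2\ln 0.01\approx 9.210$, so no table lookup is needed to evaluate the approximation at any value of $V$ produced by the enumeration.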
The experiments show the same phenomenon discussed in the text for $\nu=1$: convergence to the limiting chi-square law is rather slow. When all expected frequencies $np$, $nq$, and $nr$ are comfortably larger than $5$, the approximation is usually adequate for rough significance testing, but noticeable discrepancies remain in the tails even for considerably larger $n$. The approximation is best when the probabilities are fairly balanced and deteriorates when one category has small probability, because the discreteness of the multinomial distribution remains significant. Thus the customary rule of thumb $np_s\ge5$ for every category $s$ is sufficient only for coarse accuracy; substantially larger expected frequencies are needed when accurate tail probabilities are required. ∎
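An experiment of this kind can be sketched in Python as below. It is a minimal sketch under stated assumptions, not a definitive implementation: the names `exact_distribution`, `exact_tail`, and `chi2_tail` are hypothetical, values of $V$ are treated as equal when they agree to nine decimal places (an assumption made to absorb floating-point rounding), and the limiting tail for $\nu=2$ is taken as $e^{-x/2}$.

```python
from collections import defaultdict
from math import comb, exp


def exact_distribution(n, p, q, r, digits=9):
    """Return sorted (v, probability) pairs for the exact distribution of V."""
    dist = defaultdict(float)
    for a in range(n + 1):
        for b in range(n - a + 1):
            c = n - a - b
            # Multinomial probability of the triple (a, b, c).
            prob = comb(n, a) * comb(n - a, b) * p**a * q**b * r**c
            v = ((a - n * p) ** 2 / (n * p)
                 + (b - n * q) ** 2 / (n * q)
                 + (c - n * r) ** 2 / (n * r))
            # Assumption: values agreeing to `digits` places are the same V.
            dist[round(v, digits)] += prob
    return sorted(dist.items())


def exact_tail(dist, x):
    """Exact Pr{V >= x} from the enumerated distribution."""
    return sum(prob for v, prob in dist if v >= x)


def chi2_tail(x):
    """Limiting Pr{V >= x} for two degrees of freedom."""
    return exp(-x / 2)


if __name__ == "__main__":
    x = 5.991  # approximate 5% point of the chi-square law with nu = 2
    for probs in [(1 / 3, 1 / 3, 1 / 3), (1 / 2, 1 / 3, 1 / 6), (0.8, 0.15, 0.05)]:
        for n in (10, 20, 50, 100):
            dist = exact_distribution(n, *probs)
            print(probs, n, round(exact_tail(dist, x), 5), round(chi2_tail(x), 5))
```

Each printed row gives the exact probability of exceeding the nominal 5% point next to the limiting value, for the three distributions named above. The enumeration visits $\binom{n+2}{2}$ triples, so the cost grows only quadratically in $n$ and values in the hundreds are feasible.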