TAOCP 3.3.1 Exercise 10

Section 3.3.1: General Test Procedures for Studying Random Data

Exercise 10. [**] [20] Suppose a chi-square test is done by making $n$ observations, and the value $V$ is obtained. Now we repeat the test on these same $n$ observations over again (getting, of course, the same results), and we put together the data from both tests, regarding it as a single chi-square test with $2n$ observations. (This procedure violates the text's stipulation that all of the observations must be independent of one another.) How is the second value of $V$ related to the first one?

Verified: yes
Solve time: 3m15s


Let the original observations produce counts $Y_1,\ldots,Y_k$ with $Y_1+\cdots+Y_k=n$, where category $s$ has probability $p_s$. The original chi-square statistic is

$$ V=\sum_{s=1}^k \frac{(Y_s-np_s)^2}{np_s}. $$

When the same data are repeated and combined, the new experiment has $2n$ observations and the count in category $s$ becomes $2Y_s$, while the probabilities $p_s$ are unchanged, so the expected count is $2np_s$. Hence the new statistic is

$$ V'=\sum_{s=1}^k \frac{(2Y_s-2np_s)^2}{2np_s}. $$

Since

$$ (2Y_s-2np_s)^2=4(Y_s-np_s)^2, $$

we obtain

$$ V' =\sum_{s=1}^k \frac{4(Y_s-np_s)^2}{2np_s} =2\sum_{s=1}^k \frac{(Y_s-np_s)^2}{np_s} =2V. $$

Therefore the second value of the chi-square statistic is exactly twice the first:

$$ \boxed{V'=2V.} $$
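
The identity can be checked numerically. The following is a minimal sketch, not part of the original solution: the helper name `chi_square` and the die-roll counts are made up for illustration, and exact rational arithmetic is used so the comparison is not affected by floating-point rounding.

```python
from fractions import Fraction

def chi_square(counts, probs):
    """Return V = sum over s of (Y_s - n p_s)^2 / (n p_s), with n = sum of counts."""
    n = sum(counts)
    return sum((Fraction(y) - n * p) ** 2 / (n * p) for y, p in zip(counts, probs))

# Hypothetical data: 60 rolls of a fair die (illustrative values only).
counts = [8, 12, 9, 11, 13, 7]
probs = [Fraction(1, 6)] * 6

v = chi_square(counts, probs)                            # original test, n = 60
v_doubled = chi_square([2 * y for y in counts], probs)   # same data counted twice, 2n = 120

print(v, v_doubled, v_doubled == 2 * v)
```

Replacing the factor 2 by any positive integer $m$ in the list comprehension shows the same scaling: by the same algebra, counting the data $m$ times multiplies the statistic by $m$.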