TAOCP 3.3.1 Exercise 4
Section 3.3.1: General Test Procedures for Studying Random Data
Exercise 4. ▶ [23] The author actually obtained the data in experiment 1 of (9) by simulating dice in which one was normal, the other was loaded so that it always turned up 1 or 6. (The latter two possibilities were equally probable.) Compute the probabilities that replace (1) in this case, and by using a chi-square test decide if the results of that experiment are consistent with the dice being loaded in this way.
Verified: yes
Solve time: 5m53s
Solution
Let the first die be fair, with outcomes $1,2,3,4,5,6$ equally likely, each with probability $\frac{1}{6}$. Let the second die be loaded so that only the outcomes $1$ and $6$ are possible, each with probability $\frac{1}{2}$. Denote the sum of the two dice by $s$, with $2 \le s \le 12$. We compute the probabilities $p_s = \Pr(\text{sum} = s)$ by enumerating all possible combinations of the dice.
If the first die shows $i$ and the second die shows $j$, then $s = i+j$. The possible pairs and probabilities are:
- Second die shows $1$ with probability $\frac{1}{2}$: the sum $s = i + 1$ occurs with probability $\frac{1}{6} \cdot \frac{1}{2} = \frac{1}{12}$ for $i = 1,\dots,6$. This gives $s = 2,3,4,5,6,7$ with probability $\frac{1}{12}$ each.
- Second die shows $6$ with probability $\frac{1}{2}$: the sum $s = i + 6$ occurs with probability $\frac{1}{6} \cdot \frac{1}{2} = \frac{1}{12}$ for $i = 1,\dots,6$. This gives $s = 7,8,9,10,11,12$ with probability $\frac{1}{12}$ each.
Only one sum appears in both lists:
- $s = 7$ arises from $(i,j) = (6,1)$ and $(1,6)$, each with probability $\frac{1}{12}$, so $p_7 = \frac{1}{12} + \frac{1}{12} = \frac{1}{6}$.
All other sums appear exactly once, so $p_2 = p_3 = p_4 = p_5 = p_6 = p_8 = p_9 = p_{10} = p_{11} = p_{12} = \frac{1}{12}$.
Thus the complete table of probabilities is
$$ \begin{array}{c|ccccccccccc} s & 2 & 3 & 4 & 5 & 6 & 7 & 8 & 9 & 10 & 11 & 12 \\ \hline p_s & \tfrac{1}{12} & \tfrac{1}{12} & \tfrac{1}{12} & \tfrac{1}{12} & \tfrac{1}{12} & \tfrac{1}{6} & \tfrac{1}{12} & \tfrac{1}{12} & \tfrac{1}{12} & \tfrac{1}{12} & \tfrac{1}{12} \end{array}. $$
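The enumeration above can be checked mechanically. The following is a minimal sketch in Python using exact rational arithmetic; the names `fair`, `loaded`, and `sum_distribution` are illustrative and do not come from the text.

```python
from collections import defaultdict
from fractions import Fraction

# Face probabilities: die 1 is fair; die 2 shows only 1 or 6, each with probability 1/2.
fair = {face: Fraction(1, 6) for face in range(1, 7)}
loaded = {1: Fraction(1, 2), 6: Fraction(1, 2)}


def sum_distribution(die_a, die_b):
    """Return {s: Pr(sum = s)} for two independent dice given as {face: probability}."""
    dist = defaultdict(Fraction)  # missing keys start at Fraction(0)
    for i, p_i in die_a.items():
        for j, p_j in die_b.items():
            dist[i + j] += p_i * p_j
    return dict(dist)


p = sum_distribution(fair, loaded)
for s in sorted(p):
    print(s, p[s])
```

Each pair $(i,j)$ contributes $\frac{1}{12}$, and only $s = 7$ receives two contributions, so the printed values should agree with the table.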
The observed counts in experiment 1 of (9) are
$$ Y_s = 4, 10, 10, 13, 20, 18, 18, 11, 13, 14, 13, \quad 2 \le s \le 12. $$
The total number of throws is
$$ n = \sum_{s=2}^{12} Y_s = 4 + 10 + 10 + 13 + 20 + 18 + 18 + 11 + 13 + 14 + 13 = 144. $$
The expected counts under the loaded-die hypothesis are $E_s = n p_s$:
- $E_s = 144 \cdot \frac{1}{12} = 12$ for $s \ne 7$
- $E_7 = 144 \cdot \frac{1}{6} = 24$
We compute the chi-square statistic
$$ V = \sum_{s=2}^{12} \frac{(Y_s - E_s)^2}{E_s}. $$
The individual terms are
$$ \begin{array}{c|ccccccccccc} s & 2 & 3 & 4 & 5 & 6 & 7 & 8 & 9 & 10 & 11 & 12 \\ \hline Y_s & 4 & 10 & 10 & 13 & 20 & 18 & 18 & 11 & 13 & 14 & 13 \\ E_s & 12 & 12 & 12 & 12 & 12 & 24 & 12 & 12 & 12 & 12 & 12 \\ (Y_s - E_s)^2 & 64 & 4 & 4 & 1 & 64 & 36 & 36 & 1 & 1 & 4 & 1 \end{array} $$
Summing these contributions, with the $s = 7$ term divided by $24$ and all others by $12$, gives
$$ V = \frac{64 + 4 + 4 + 1 + 64 + 36 + 1 + 1 + 4 + 1}{12} + \frac{36}{24} = \frac{180}{12} + \frac{3}{2} = 16\tfrac{1}{2}. $$
The number of degrees of freedom is $\nu = k - 1 = 11 - 1 = 10$. In Table 1, the 75% and 95% entries for $\nu = 10$ are 12.55 and 18.31, and the 99% entry is 23.21. Since $12.55 < V = 16.5 < 18.31$, the observed value is somewhat on the high side, but a value this large or larger would occur more than 5% of the time under the hypothesis. Thus the chi-square test does not indicate a significant departure from the hypothesis that the dice are loaded as described.
The discrepancies between the observed and expected counts are within the range of random variation for $n = 144$. The experiment is therefore consistent with one die being normal and the other being restricted to the outcomes 1 and 6.
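The statistic $V$ can also be computed mechanically. The sketch below is one possible Python helper, not something given in the text: `chi_square`, `loaded_probs`, and `toy_counts` are illustrative names, and the counts in the demonstration are hypothetical values chosen so that the result is easy to verify by hand.

```python
from fractions import Fraction


def chi_square(observed, probs):
    """Return V = sum((Y_s - n*p_s)^2 / (n*p_s)) for counts Y_s and probabilities p_s."""
    if len(observed) != len(probs):
        raise ValueError("observed and probs must have the same length")
    n = sum(observed)
    total = Fraction(0)
    for y, p in zip(observed, probs):
        expected = n * p  # E_s = n * p_s
        total += (y - expected) ** 2 / expected
    return total


# Probabilities for s = 2..12 under the loaded-die hypothesis.
loaded_probs = [Fraction(1, 12)] * 5 + [Fraction(1, 6)] + [Fraction(1, 12)] * 5

# Hypothetical counts (NOT the data from (9)): n = 24, so E_s = 2 for s != 7 and E_7 = 4.
toy_counts = [3, 1, 2, 2, 2, 4, 2, 2, 2, 2, 2]
print(chi_square(toy_counts, loaded_probs))  # (3-2)^2/2 + (1-2)^2/2 = 1
```

Passing the eleven counts $Y_s$ from (9) in place of `toy_counts` should reproduce the hand-computed value of $V$, up to rounding. Exact `Fraction` arithmetic keeps rounding error out of the sum.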
This completes the solution.
∎