TAOCP 3.3.2 Exercise 4
Let p=\beta-\alpha .
Section 3.3.2: Empirical Tests
Exercise 4. [M12] Prove that the probabilities in (4) are correct for the gap test.
Verified: yes
Solve time: 4m39s
Solution
Let
$$ p=\beta-\alpha . $$
For a random sequence, each value $U_j$ is independently distributed on $[0,1)$, hence
$$ \Pr(\alpha \le U_j < \beta)=p, \qquad \Pr(U_j<\alpha \text{ or } U_j\ge\beta)=1-p. $$
A gap of length $r$ is defined to be a subsequence
$$ U_j,U_{j+1},\ldots,U_{j+r} $$
such that
$$ U_j,\ldots,U_{j+r-1}\notin[\alpha,\beta), \qquad U_{j+r}\in[\alpha,\beta). $$
Therefore a gap of length $r$ occurs precisely when the first $r$ observations fail to lie in the interval $[\alpha,\beta)$ and the next observation lies in that interval.
Since the observations are independent, the probability of this event is the product of the corresponding probabilities:
$$ p_r = (1-p)^r,p, \qquad r\ge 0. $$
For the categories used by Algorithm G, the lengths $0,1,\ldots,t-1$ are recorded separately. Hence, for $0\le r<t$,
$$ p_r=p(1-p)^r. $$
The final category combines all gaps of length at least $t$. Its probability is
$$ p_t = \sum_{r=t}^{\infty} p(1-p)^r = p(1-p)^t\sum_{k=0}^{\infty}(1-p)^k. $$
Since
$$ \sum_{k=0}^{\infty}(1-p)^k = \frac1p, $$
we obtain
$$ p_t=(1-p)^t. $$
Thus the probabilities used in the gap test are
$$ p_r=p(1-p)^r, \qquad 0\le r\le t-1, $$
and
$$ p_t=(1-p)^t, $$
which are exactly the probabilities given in equation (4).
This completes the proof.
∎