TAOCP 3.3.1 Exercise 22

Let D_n(u)=\sqrt n\,(G_n(u)-u), \qquad 0\le u\le1, where $G_n$ is the empirical distribution function of the transformed variables

Section 3.3.1: General Test Procedures for Studying Random Data

Exercise 22. [**] [HM46] Investigate the "improved" KS test suggested in the answer to exercise 6.

Verified: yes
Solve time: 6m44s


Let

$$ D_n(u)=\sqrt n,(G_n(u)-u), \qquad 0\le u\le1, $$

where $G_n$ is the empirical distribution function of the transformed variables

$$ U_j=F(X_j). $$

Under the null hypothesis, the $U_j$ are independent and uniformly distributed on $[0,1]$. The ordinary Kolmogorov-Smirnov statistic is

$$ K_n=\sup_{0\le u\le1}|G_n(u)-u| =\frac1{\sqrt n}\sup_{0\le u\le1}|D_n(u)|. $$

The “improved” statistic suggested in the answer to Exercise 6 is

$$ K_n^* =\sup_{0<u<1} \frac{|G_n(u)-u|}{\sqrt{u(1-u)}} =\frac1{\sqrt n} \sup_{0<u<1} \frac{|D_n(u)|}{\sqrt{u(1-u)}}. $$

The factor $\sqrt{u(1-u)}$ is the standard deviation of $G_n(u)-u$, since

$$ \operatorname{Var}(G_n(u)) =\frac{u(1-u)}{n}. $$

Hence the normalization attempts to weight all quantiles equally in standard deviation units. The ordinary KS statistic underweights the tails because $u(1-u)$ is small there.

The essential question is whether the modified supremum is mathematically well behaved.

For fixed $u\in(0,1)$,

$$ \frac{D_n(u)}{\sqrt{u(1-u)}} $$

has asymptotic variance $1$. However, the denominator vanishes at $u=0$ and $u=1$, so the endpoint behavior must be examined carefully.

We first show that the statistic is actually infinite with probability $1$.

Let

$$ U_{(1)}=\min(U_1,\ldots,U_n). $$

For $0<u<U_{(1)}$, no sample points lie below $u$, hence

$$ G_n(u)=0. $$

Therefore

$$ \frac{|G_n(u)-u|}{\sqrt{u(1-u)}} = \frac{u}{\sqrt{u(1-u)}} = \sqrt{\frac{u}{1-u}}. $$

This tends to $0$ as $u\to0$, so the lower endpoint causes no divergence.

Now consider $u\downarrow U_{(1)}$ from above. For $U_{(1)}\le u<U_{(2)}$,

$$ G_n(u)=\frac1n. $$

Hence

$$ \frac{|G_n(u)-u|}{\sqrt{u(1-u)}} = \frac{|1/n-u|}{\sqrt{u(1-u)}}. $$

At $u=U_{(1)}$,

$$ \frac{|1/n-U_{(1)}|}{\sqrt{U_{(1)}(1-U_{(1)})}} \sim \frac1{n\sqrt{U_{(1)}}} $$

when $U_{(1)}$ is small.

But $U_{(1)}$ is typically of order $1/n$; more precisely,

$$ \Pr(U_{(1)}>t)=(1-t)^n. $$

Hence $U_{(1)}\asymp n^{-1}$, and therefore

$$ \frac1{n\sqrt{U_{(1)}}} \asymp \frac1{\sqrt n}. $$

So there is no divergence yet. The singularity appears after multiplication by $\sqrt n$, namely in the normalized process

$$ T_n = \sup_{0<u<1} \frac{|D_n(u)|}{\sqrt{u(1-u)}}. $$

At $u=U_{(1)}$,

$$ D_n(U_{(1)}) = \sqrt n\left(\frac1n-U_{(1)}\right). $$

Since $U_{(1)}$ fluctuates on the scale $1/n$,

$$ |D_n(U_{(1)})| \asymp \frac1{\sqrt n}. $$

Dividing by $\sqrt{U_{(1)}}\asymp n^{-1/2}$ gives a quantity of order $1$. Thus the first order statistic alone does not force divergence.

The true issue is the asymptotic behavior of the empirical process near the endpoints. By Donsker’s theorem,

$$ D_n(u)\Rightarrow B(u), $$

where $B(u)$ is a Brownian bridge with covariance

$$ \operatorname{Cov}(B(s),B(t)) = \min(s,t)-st. $$

Hence the candidate limit would formally be

$$ \sup_{0<u<1} \frac{|B(u)|}{\sqrt{u(1-u)}}. $$

We now analyze this quantity.

As $u\downarrow0$,

$$ B(u) $$

behaves like Brownian motion $W(u)$, since the correction term $uW(1)$ is negligible. Therefore the law of the iterated logarithm implies

$$ \limsup_{u\downarrow0} \frac{|B(u)|} {\sqrt{2u\log\log(1/u)}} =1 \qquad\text{a.s.} $$

Consequently,

$$ \frac{|B(u)|}{\sqrt{u}} \sim \sqrt{2\log\log(1/u)} \to\infty $$

almost surely. Therefore

$$ \sup_{0<u<1} \frac{|B(u)|}{\sqrt{u(1-u)}} =\infty \qquad\text{a.s.} $$

Thus the weighted empirical-process supremum has no finite nondegenerate limit.

Equivalently,

$$ T_n = \sup_{0<u<1} \frac{|D_n(u)|}{\sqrt{u(1-u)}} \xrightarrow[]{P}\infty. $$

Hence the naive “improved KS statistic” is not a valid distribution-free test statistic. The endpoint singularities destroy the existence of a finite limiting distribution.

To obtain a meaningful modification, one must truncate the endpoints. For example,

$$ T_n(\varepsilon) = \sup_{\varepsilon\le u\le1-\varepsilon} \frac{|D_n(u)|}{\sqrt{u(1-u)}}, \qquad 0<\varepsilon<\tfrac12. $$

On every compact subinterval $[\varepsilon,1-\varepsilon]$, the denominator is bounded away from $0$, and therefore

$$ T_n(\varepsilon) \Rightarrow \sup_{\varepsilon\le u\le1-\varepsilon} \frac{|B(u)|}{\sqrt{u(1-u)}}. $$

This limit is finite and well defined.

Another classical modification uses the stronger normalization

$$ \frac{|D_n(u)|} {\sqrt{u(1-u)\log\log(1/(u(1-u)))}}. $$

The law of the iterated logarithm shows that this compensates correctly for the endpoint growth.

The arcsine transformation does not reduce the problem to the ordinary KS statistic. If

$$ u=\sin^2\theta, $$

then

$$ \sqrt{u(1-u)} =\sin\theta\cos\theta, $$

so the denominator becomes algebraically simpler, but the transformed process is not a Brownian bridge with constant variance. Therefore no reduction to the classical Kolmogorov distribution occurs.

The conclusion is that the proposed “improved” KS statistic,

$$ \sup_u\frac{|F_n(u)-F(u)|}{\sqrt{F(u)(1-F(u))}}, $$

is defective because of the singularities at $F(u)=0$ and $F(u)=1$. Without truncation or additional logarithmic normalization, the statistic has no finite limiting distribution. Its apparent increased tail sensitivity is inseparable from this divergence phenomenon.