TAOCP 3.3.1 Exercise 7
Let $F_n(x)$ be the empirical distribution function based on $n$ independent observations $X_1, \dots, X_n$ from a **continuous** distribution $F(x)$.
Section 3.3.1: General Test Procedures for Studying Random Data
Exercise 7. [M16] Prove that $K_n^+$ and $K_n^-$ can never be negative. What is the largest possible value $K_n^-$ can have?
Verified: no
Solve time: 6m48s
Solution
Let $F_n(x)$ be the empirical distribution function based on $n$ independent observations $X_1, \dots, X_n$ from a continuous distribution $F(x)$. Recall the definitions of the Kolmogorov statistics:
$$ K_n^+ = \sup_x \bigl(F_n(x) - F(x)\bigr), \qquad K_n^- = \sup_x \bigl(F(x) - F_n(x)\bigr). $$
Step 1: Nonnegativity of $K_n^+$ and $K_n^-$
By definition, $F_n(x)$ and $F(x)$ are nondecreasing functions with values in $[0,1]$. Both satisfy
$$ \lim_{x \to -\infty} F_n(x) = 0, \qquad \lim_{x \to -\infty} F(x) = 0. $$
Consider $K_n^+ = \sup_x (F_n(x) - F(x))$. For any fixed $x$,
$$ F_n(x) - F(x) \in [-1,1]. $$
In particular, as $x \to -\infty$, we have
$$ F_n(x) - F(x) = 0 - 0 = 0. $$
Since the supremum of a set of real numbers is always at least as large as any element of the set, we conclude
$$ K_n^+ = \sup_x \bigl(F_n(x) - F(x)\bigr) \ge 0. $$
Similarly, for $K_n^- = \sup_x (F(x) - F_n(x))$, we have
$$ F(x) - F_n(x) = 0 - 0 = 0 \quad \text{as } x \to -\infty, $$
hence
$$ K_n^- = \sup_x \bigl(F(x) - F_n(x)\bigr) \ge 0. $$
Therefore, both $K_n^+$ and $K_n^-$ are never negative.
Step 2: Largest possible value of $K_n^-$
By definition,
$$ K_n^- = \sup_x \bigl(F(x) - F_n(x)\bigr). $$
Let $X_{(1)} = \min{X_1, \dots, X_n}$ denote the smallest observation. For all $x < X_{(1)}$,
$$ F_n(x) = 0, $$
so
$$ F(x) - F_n(x) = F(x) - 0 = F(x). $$
Since $F$ is continuous, we have
$$ \sup_{x < X_{(1)}} F(x) = F(X_{(1)}). $$
Thus, the Kolmogorov statistic can be expressed as
$$ K_n^- = \sup_x \bigl(F(x) - F_n(x)\bigr) \ge F(X_{(1)}). $$
Because $F$ is continuous and can take any value in $(0,1)$, we can choose all $n$ observations far into the right tail of the distribution. Then $X_{(1)}$ is arbitrarily large in terms of $F$-quantiles, and
$$ F(X_{(1)}) \approx 1. $$
Hence, for any $\epsilon > 0$, there exists a sample such that
$$ K_n^- > 1 - \epsilon. $$
However, since $F$ is continuous, no finite sample can achieve $F(X_{(1)}) = 1$, so the supremum of possible values of $K_n^-$ is
$$ \sup K_n^- = 1, $$
but the maximum is not attained.
Step 3: Conclusion
- Nonnegativity
$$ K_n^+ \ge 0, \qquad K_n^- \ge 0. $$
- Largest possible value of $K_n^-$
$$ \sup K_n^- = 1, $$
which can be approached arbitrarily closely but not exactly attained for a finite sample.
Answer (boxed)
$$ \begin{aligned} &K_n^+ \ge 0, \quad K_n^- \ge 0,\ &\sup K_n^- = 1. \end{aligned} $$
This corrects the earlier error: the previously claimed maximum $\frac{n-1}{n}$ is incorrect. The supremum of $K_n^-$ is 1, not attained, and the argument uses the continuity of $F$ rigorously.