TAOCP 3.3.1 Exercise 6

Let $F_n(x)$ be the empirical distribution function defined by equation (10) of Section 3.

Section 3.3.1: General Test Procedures for Studying Random Data

Exercise 6. [M20] Consider $F_n(x)$, as given in Eq. (10), for fixed $x$. What is the probability that $F_n(x) = s/n$, given an integer $s$? What is the mean value of $F_n(x)$? What is the standard deviation?

Verified: yes
Solve time: 2m41s


Solution

Let $F_n(x)$ be the empirical distribution function defined by equation (10) of Section 3.3.1:

$F_n(x) = \frac{1}{n} \sum_{i=1}^{n} I{X_i \le x},$

where $I{X_i \le x}$ is the indicator function equal to $1$ if $X_i \le x$ and $0$ otherwise.

For fixed $x$, the quantity $F_n(x)$ counts the fraction of the $n$ independent observations that do not exceed $x$. Let $s$ be the number of observations satisfying $X_i \le x$. Then $F_n(x) = s/n$. Each observation $X_i$ is independent and satisfies $\Pr(X_i \le x) = F(x)$, the true cumulative distribution function at $x$.

Consequently, the probability that exactly $s$ of the $n$ observations satisfy $X_i \le x$ is the binomial probability:

$\Pr\bigl(F_n(x) = s/n\bigr) = \binom{n}{s} ,[F(x)]^s ,[1 - F(x)]^{n-s}, \qquad s = 0,1,\dots,n.$

The mean value of $F_n(x)$ is the expected value of the fraction of successes in $n$ independent Bernoulli trials, each with probability $F(x)$. By linearity of expectation:

$\mathbb{E}[F_n(x)] = \frac{1}{n} \sum_{i=1}^{n} \mathbb{E}[I{X_i \le x}] = \frac{1}{n} \sum_{i=1}^{n} F(x) = F(x).$

The variance of $F_n(x)$ is obtained using the independence of the indicators:

$$ \begin{aligned} \operatorname{Var}[F_n(x)] &= \operatorname{Var}\left(\frac{1}{n} \sum_{i=1}^{n} I{X_i \le x}\right) \ &= \frac{1}{n^2} \sum_{i=1}^{n} \operatorname{Var}[I{X_i \le x}] \ &= \frac{1}{n^2} \sum_{i=1}^{n} F(x),[1-F(x)] \ &= \frac{F(x),[1-F(x)]}{n}. \end{aligned} $$

Hence the standard deviation of $F_n(x)$ is

$\sigma[F_n(x)] = \sqrt{\frac{F(x),[1-F(x)]}{n}}.$

This completes the proof.

Summary of results

$$ \Pr(F_n(x) = s/n) = \binom{n}{s} [F(x)]^s [1-F(x)]^{n-s}, \quad \mathbb{E}[F_n(x)] = F(x), \quad \sigma[F_n(x)] = \sqrt{\frac{F(x)[1-F(x)]}{n}}. $$

$$ \boxed{\text{Probability, mean, and standard deviation as above.}} $$