TAOCP 1.2.10 Exercise 22

Let $X$ be a sum of independent Bernoulli random variables $X_1,X_2,\ldots,X_n$, where $X_k$ takes the value $1$ with probability $p_k$ and $0$ with probability $q_k=1-p_k$.

Section 1.2.10: Analysis of an Algorithm

Exercise 22. [HM22] Suppose $X$ has the generating function $(q_1+p_1z)(q_2+p_2z)\cdots(q_n+p_nz)$, where $p_k+q_k=1$ for $1 \le k \le n$. Let $\mu=EX=p_1+\cdots+p_n$.

a) Prove that

$$ \Pr(X \le \mu r) \le (r^{-r}e^{r-1})^\mu, \qquad 0<r\le 1; $$

$$ \Pr(X \ge \mu r) \le (r^{-r}e^{r-1})^\mu, \qquad r\ge 1. $$

Verified: yes
Solve time: 1m09s


Solution

Let $X$ be a sum of independent Bernoulli random variables $X_1,X_2,\ldots,X_n$, where $X_k$ takes the value $1$ with probability $p_k$ and $0$ with probability $q_k=1-p_k$. Then $X = X_1+\cdots+X_n$ has generating function

$$ G_X(z) = \prod_{k=1}^{n} (q_k + p_k z), $$

as stated in the problem. Let $\mu = EX = \sum_{k=1}^{n} p_k$.

We wish to obtain exponential bounds on the tail probabilities $\Pr(X \ge t)$ and $\Pr(X \le t)$ using the method of generating functions, sometimes called the Chernoff bound technique.

Let $r>0$ be arbitrary. For any nonnegative integer-valued random variable $X$, Markov's inequality gives

$$ \Pr(X \ge t) = \Pr(r^X \ge r^t) \le \frac{E[r^X]}{r^t} = \frac{G_X(r)}{r^t}. \tag{1} $$

Similarly, for $0<r\le 1$, we have

$$ \Pr(X \le t) = \Pr(r^X \ge r^t) \le \frac{G_X(r)}{r^t}. \tag{2} $$

Thus the key is to choose $r$ to minimize the bound.

Step 1. Upper tail bound ($X \ge \mu r$, $r\ge 1$)

By (1) with $t=\mu r$ and $r>1$, we have

$$ \Pr(X \ge \mu r) \le \frac{G_X(r)}{r^{\mu r}} = \frac{\prod_{k=1}^{n} (q_k + p_k r)}{r^{\mu r}}. \tag{3} $$

We now use the inequality $1 + x \le e^x$, valid for all real $x$. For each factor,

$$ q_k + p_k r = 1 - p_k + p_k r = 1 + p_k(r-1) \le e^{p_k(r-1)}. \tag{4} $$

Multiplying over $k=1,\ldots,n$ gives

$$ G_X(r) = \prod_{k=1}^{n} (q_k + p_k r) \le \prod_{k=1}^{n} e^{p_k(r-1)} = e^{\mu (r-1)}. \tag{5} $$

Combining (3) and (5), we find

$$ \Pr(X \ge \mu r) \le \frac{e^{\mu(r-1)}}{r^{\mu r}} = (r^{-r} e^{r-1})^\mu. \tag{6} $$

Step 2. Lower tail bound ($X \le \mu r$, $0<r\le 1$)

For $0<r\le 1$, we use (2) with $t = \mu r$:

$$ \Pr(X \le \mu r) \le \frac{G_X(r)}{r^{\mu r}}. \tag{7} $$

Now $r-1 \le 0$, so $1 + p_k(r-1) \le e^{p_k(r-1)}$ still holds. Therefore, as in Step 1,

$$ G_X(r) = \prod_{k=1}^{n} (1 + p_k(r-1)) \le e^{\sum_{k=1}^{n} p_k (r-1)} = e^{\mu (r-1)}. \tag{8} $$

Substituting into (7) gives

$$ \Pr(X \le \mu r) \le \frac{e^{\mu(r-1)}}{r^{\mu r}} = (r^{-r} e^{r-1})^\mu. \tag{9} $$

Equations (6) and (9) establish the stated inequalities for $r\ge 1$ and $0<r\le 1$, respectively.

This completes the proof.