TAOCP 3.4.1 Exercise 5

We are asked to generate a random variable $X$ with distribution function F(x) = p x + q x^2 + r x^3, \qquad 0 \le x \le 1, where $p \ge 0$, $q \ge 0$, $r \ge 0$, and $p + q + r = 1$.

Section 3.4.1: Numerical Distributions

Exercise 5. ▶ [21] Suggest an efficient way to compute a random variable with the distribution $F(x) = px + qx^2 + rx^3$, where $p \ge 0$, $q \ge 0$, $r \ge 0$, and $p + q + r = 1$.

Verified: yes
Solve time: 2m54s


Solution

We are asked to generate a random variable $X$ with distribution function

$$ F(x) = p x + q x^2 + r x^3, \qquad 0 \le x \le 1, $$

where $p \ge 0$, $q \ge 0$, $r \ge 0$, and $p + q + r = 1$.

Observe that $F$ can be written as a convex combination of simpler distribution functions:

$$ F(x) = p F_1(x) + q F_2(x) + r F_3(x), $$

where

$$ F_1(x) = x, \qquad F_2(x) = x^2, \qquad F_3(x) = x^3. $$

Each $F_k(x)$ is a valid distribution function on $[0,1]$. Indeed, $F_k(0) = 0$, $F_k(1) = 1$, and $F_k$ is nondecreasing.

This representation suggests a mixture method for generating $X$: first select one of the three components according to the probabilities $p, q, r$, then generate a random variable from that component.

Step 1: Choose the component

Let $U_1$ be a uniform $[0,1]$ random variable. Define an integer $K \in {1,2,3}$ by

$$ K = \begin{cases} 1 & \text{if } U_1 < p,\[2mm] 2 & \text{if } p \le U_1 < p+q,\[1mm] 3 & \text{if } p+q \le U_1 \le 1. \end{cases} $$

Then $P(K=1) = p$, $P(K=2) = q$, and $P(K=3) = r$.

Step 2: Generate the corresponding variable

Let $U_2$ be an independent uniform $[0,1]$ random variable. Define

$$ X = U_2^{1/K}. $$

Then for each fixed $K=k$,

$$ P(X \le x \mid K=k) = P(U_2^{1/k} \le x) = P(U_2 \le x^k) = x^k = F_k(x), \qquad 0 \le x \le 1. $$

By the law of total probability, the unconditional distribution of $X$ is

$$ P(X \le x) = \sum_{k=1}^3 P(K=k) P(X \le x \mid K=k) = p x + q x^2 + r x^3 = F(x), $$

as required.

Step 3: Efficiency

This method is efficient because it requires:

  1. Generation of two uniform random variables $U_1$ and $U_2$,
  2. One comparison to select $K$,
  3. One fractional power computation $U_2^{1/K}$.

No cubic equation solving, no square roots or cube roots beyond $1/2$ or $1/3$, and no evaluation of Cardano's formula is necessary. This is much simpler and faster than the generic inversion method.

Step 4: Special cases

  • If $r = 0$, then $K \in {1,2}$ with probabilities $p, q$, and $X = U_2^{1/K}$ reduces accordingly.
  • If $r = q = 0$, then $K=1$ with probability 1, so $X = U_2$.
  • These cases are automatically handled by the mixture construction.

This construction produces a random variable with the desired distribution $F(x)$ efficiently.