TAOCP 3.4.1 Exercise 5
We are asked to generate a random variable $X$ with distribution function F(x) = p x + q x^2 + r x^3, \qquad 0 \le x \le 1, where $p \ge 0$, $q \ge 0$, $r \ge 0$, and $p + q + r = 1$.
Section 3.4.1: Numerical Distributions
Exercise 5. ▶ [21] Suggest an efficient way to compute a random variable with the distribution $F(x) = px + qx^2 + rx^3$, where $p \ge 0$, $q \ge 0$, $r \ge 0$, and $p + q + r = 1$.
Verified: yes
Solve time: 2m54s
Solution
We are asked to generate a random variable $X$ with distribution function
$$ F(x) = p x + q x^2 + r x^3, \qquad 0 \le x \le 1, $$
where $p \ge 0$, $q \ge 0$, $r \ge 0$, and $p + q + r = 1$.
Observe that $F$ can be written as a convex combination of simpler distribution functions:
$$ F(x) = p F_1(x) + q F_2(x) + r F_3(x), $$
where
$$ F_1(x) = x, \qquad F_2(x) = x^2, \qquad F_3(x) = x^3. $$
Each $F_k(x)$ is a valid distribution function on $[0,1]$. Indeed, $F_k(0) = 0$, $F_k(1) = 1$, and $F_k$ is nondecreasing.
This representation suggests a mixture method for generating $X$: first select one of the three components according to the probabilities $p, q, r$, then generate a random variable from that component.
Step 1: Choose the component
Let $U_1$ be a uniform $[0,1]$ random variable. Define an integer $K \in {1,2,3}$ by
$$ K = \begin{cases} 1 & \text{if } U_1 < p,\[2mm] 2 & \text{if } p \le U_1 < p+q,\[1mm] 3 & \text{if } p+q \le U_1 \le 1. \end{cases} $$
Then $P(K=1) = p$, $P(K=2) = q$, and $P(K=3) = r$.
Step 2: Generate the corresponding variable
Let $U_2$ be an independent uniform $[0,1]$ random variable. Define
$$ X = U_2^{1/K}. $$
Then for each fixed $K=k$,
$$ P(X \le x \mid K=k) = P(U_2^{1/k} \le x) = P(U_2 \le x^k) = x^k = F_k(x), \qquad 0 \le x \le 1. $$
By the law of total probability, the unconditional distribution of $X$ is
$$ P(X \le x) = \sum_{k=1}^3 P(K=k) P(X \le x \mid K=k) = p x + q x^2 + r x^3 = F(x), $$
as required.
Step 3: Efficiency
This method is efficient because it requires:
- Generation of two uniform random variables $U_1$ and $U_2$,
- One comparison to select $K$,
- One fractional power computation $U_2^{1/K}$.
No cubic equation solving, no square roots or cube roots beyond $1/2$ or $1/3$, and no evaluation of Cardano's formula is necessary. This is much simpler and faster than the generic inversion method.
Step 4: Special cases
- If $r = 0$, then $K \in {1,2}$ with probabilities $p, q$, and $X = U_2^{1/K}$ reduces accordingly.
- If $r = q = 0$, then $K=1$ with probability 1, so $X = U_2$.
- These cases are automatically handled by the mixture construction.
This construction produces a random variable with the desired distribution $F(x)$ efficiently.
∎