Project Euler Problem 697
Given a fixed real number c, define a random sequence (Xn){nge 0} by the following random process: - X0 = c (with probab
Solution
Answer: 4343871.06
Let
$$X_n = c\prod_{k=1}^n U_k,$$
where the $U_k$ are i.i.d. uniform random variables on $(0,1)$.
We want $c$ such that
$$\Pr(X_n<1)=0.25$$
for $n=10,000,000$.
Step 1: Transform the product into a sum
Take logarithms:
$$\ln X_n=\ln c+\sum_{k=1}^n \ln U_k.$$
Since $0<U_k<1$, define
$$Y_k=-\ln U_k.$$
A standard fact is:
$$Y_k \sim \text{Exp}(1),$$
because
$$\Pr(Y_k\le y)=\Pr(-\ln U_k\le y) =\Pr(U_k\ge e^{-y}) =1-e^{-y}.$$
Thus
$$-\sum_{k=1}^n \ln U_k = \sum_{k=1}^n Y_k.$$
So if we define
$$S_n=\sum_{k=1}^n Y_k,$$
then
$$S_n \sim \Gamma(n,1),$$
with mean $n$ and variance $n$.
Also,
$$X_n<1 \iff \ln c - S_n <0 \iff S_n>\ln c.$$
Hence the condition becomes
$$\Pr(S_n>\ln c)=0.25.$$
Equivalently,
$$\Pr(S_n\le \ln c)=0.75.$$
Therefore, $\ln c$ is the 75th percentile of the distribution $\Gamma(n,1)$.
Step 2: Approximate the gamma quantile
For very large $n$,
$$\Gamma(n,1)\approx N(n,n)$$
by the Central Limit Theorem.
Let
$$z=\Phi^{-1}(0.75)=0.674489750196\ldots$$
Then the Cornish–Fisher expansion gives an accurate approximation for the 75th percentile:
$$q \approx n + z\sqrt n + \frac{z^2-1}{3}.$$
With
$$n=10,000,000, \qquad \sqrt n = 3162.277660\ldots$$
we get
$$q \approx 10,000,000 + 0.674489750196(3162.277660) + \frac{0.674489750196^2-1}{3}.$$
Numerically,
$$q \approx 10,000,002.411477.$$
Since $q=\ln c$,
$$\log_{10} c = \frac{\ln c}{\ln 10} = \frac{q}{\ln 10}.$$
Thus
$$\log_{10} c \approx 4343871.057176\ldots$$
which rounds to
$$4343871.06.$$
Python implementation
import math
from statistics import NormalDist
# Problem parameter
n = 10_000_000
# 75th percentile of the standard normal distribution
z = NormalDist().inv_cdf(0.75)
# Cornish-Fisher approximation for Gamma(n,1) quantile
q = n + z * math.sqrt(n) + (z*z - 1) / 3
# Convert ln(c) to log10(c)
answer = q / math.log(10)
print(round(answer, 2))
Code walkthrough
1. Compute the normal quantile
z = NormalDist().inv_cdf(0.75)
This gives
$$z=\Phi^{-1}(0.75)\approx0.67448975.$$
2. Approximate the gamma quantile
q = n + z * math.sqrt(n) + (z*z - 1) / 3
This is the Cornish–Fisher correction to the normal approximation for a gamma distribution.
3. Convert to base-10 logarithm
answer = q / math.log(10)
Because
$$\log_{10} c = \frac{\ln c}{\ln 10}.$$
Verification
For the smaller example $n=100$,
$$\log_{10} c \approx 46.27,$$
matching the statement in the problem.
For $n=10^7$, the result should be near
$$\frac{10^7}{\ln 10}\approx 4.3429448\times10^6,$$
and our answer differs only by the expected fluctuation size $O(\sqrt n)$, so the magnitude is correct.
Answer: 4343871.06