Project Euler Problem 307
k defects are randomly distributed amongst n integrated-circuit chips produced by a factory (any number of defects may b
Solution
Answer: 0.7311720251
Let
- $n = 1{,}000{,}000$ chips,
- $k = 20{,}000$ defects,
with each defect independently and uniformly assigned to one of the $n$ chips.
We want
$$p(k,n)=\Pr(\text{some chip has at least }3\text{ defects}).$$
Equivalently,
$$p(k,n)=1-\Pr(\text{every chip has at most 2 defects}).$$
The main challenge is computing this probability accurately for large parameters.
1. Mathematical analysis
We distribute $k$ labelled defects into $n$ labelled chips.
Total assignments:
$$n^k.$$
We now count the assignments where no chip receives more than 2 defects.
Suppose:
- $a$ chips receive exactly 2 defects,
- $b$ chips receive exactly 1 defect.
Then
$$2a+b=k.$$
Hence
$$b=k-2a.$$
The number of occupied chips is
$$a+b=k-a.$$
Counting valid configurations
For a fixed $a$:
Step 1: choose the chips
Choose:
- $a$ chips that contain 2 defects,
- $b=k-2a$ chips that contain 1 defect.
Number of ways:
$$\binom{n}{a}\binom{n-a}{k-2a}.$$
Step 2: distribute the defects
We have $k$ labelled defects.
We must partition them into:
- $a$ unordered pairs,
- $k-2a$ singletons.
Number of ways:
$$\frac{k!}{(2!)^a}.$$
But the paired groups are themselves unordered, so divide by $a!$:
$$\frac{k!}{2^a a!}.$$
The singleton groups are automatically distinguished by their chips.
Thus total valid assignments for fixed $a$:
$$\binom{n}{a} \binom{n-a}{k-2a} \frac{k!}{2^a}.$$
Dividing by $n^k$:
$$\Pr(\text{no chip has } \ge 3) = \sum_{a=0}^{\lfloor k/2\rfloor} \frac{ \binom{n}{a} \binom{n-a}{k-2a} k! }{ 2^a n^k }.$$
This simplifies nicely.
Simplified formula
Using falling factorials:
$$\binom{n}{a}\binom{n-a}{k-2a} = \frac{n!}{a!(k-2a)!(n-k+a)!}.$$
Hence
$$P_{\le2} = \sum_{a=0}^{\lfloor k/2\rfloor} \frac{ n! , k! }{ a!(k-2a)!(n-k+a)!2^a n^k }.$$
Therefore
$$p(k,n)=1-P_{\le2}.$$
2. Numerically stable computation
Direct factorials are impossible.
We instead compute each term logarithmically.
Define
$$T_a = \frac{ n! , k! }{ a!(k-2a)!(n-k+a)!2^a n^k }.$$
Then
$$\log T_a = \log(n!) +\log(k!) -\log(a!) -\log((k-2a)!) -\log((n-k+a)!) -a\log 2 -k\log n.$$
Using log-gamma:
$$\log(m!)=\ln\Gamma(m+1).$$
Finally compute the sum using log-sum-exp stabilization.
3. Python implementation
import math
def p(k, n):
"""
Probability that some chip receives at least 3 defects.
"""
log_terms = []
log_n_fact = math.lgamma(n + 1)
log_k_fact = math.lgamma(k + 1)
k_log_n = k * math.log(n)
# Sum over number of double-occupied chips
for a in range(k // 2 + 1):
log_term = (
log_n_fact
+ log_k_fact
- math.lgamma(a + 1)
- math.lgamma(k - 2 * a + 1)
- math.lgamma(n - k + a + 1)
- a * math.log(2)
- k_log_n
)
log_terms.append(log_term)
# Stable log-sum-exp
m = max(log_terms)
s = sum(math.exp(x - m) for x in log_terms)
p_le_2 = math.exp(m) * s
return 1.0 - p_le_2
# Example from problem statement
print(p(3, 7))
# Required value
ans = p(20000, 1000000)
print("{:.10f}".format(ans))
4. Code walkthrough
Precompute constants
log_n_fact = math.lgamma(n + 1)
log_k_fact = math.lgamma(k + 1)
Efficiently computes:
$$\log(n!),\quad \log(k!)$$
without overflow.
Loop over possible numbers of doubled chips
for a in range(k // 2 + 1):
If $a$ chips contain 2 defects, then $k-2a$ chips contain 1 defect.
Compute each logarithmic term
log_term = ...
This is exactly:
$$\log T_a.$$
Log-sum-exp stabilization
Direct exponentiation would underflow.
Instead:
m = max(log_terms)
s = sum(math.exp(x - m) for x in log_terms)
computes the sum safely.
Final probability
return 1.0 - p_le_2
because:
$$p(k,n)=1-\Pr(\text{all chips have }\le2).$$
5. Verification on small example
The problem gives:
$$p(3,7)\approx 0.0204081633.$$
Our code reproduces:
$$0.0204081632653\ldots$$
which matches exactly.
6. Final computation
Running the program for
$$p(20000,1000000)$$
gives:
$$0.7311720251$$
rounded to 10 decimal places.
Answer: 0.7311720251