Project Euler Problem 307

k defects are randomly distributed amongst n integrated-circuit chips produced by a factory (any number of defects may b

Project Euler Problem 307

Solution

Answer: 0.7311720251

Let

  • $n = 1{,}000{,}000$ chips,
  • $k = 20{,}000$ defects,

with each defect independently and uniformly assigned to one of the $n$ chips.

We want

$$p(k,n)=\Pr(\text{some chip has at least }3\text{ defects}).$$

Equivalently,

$$p(k,n)=1-\Pr(\text{every chip has at most 2 defects}).$$

The main challenge is computing this probability accurately for large parameters.


1. Mathematical analysis

We distribute $k$ labelled defects into $n$ labelled chips.

Total assignments:

$$n^k.$$

We now count the assignments where no chip receives more than 2 defects.

Suppose:

  • $a$ chips receive exactly 2 defects,
  • $b$ chips receive exactly 1 defect.

Then

$$2a+b=k.$$

Hence

$$b=k-2a.$$

The number of occupied chips is

$$a+b=k-a.$$


Counting valid configurations

For a fixed $a$:

Step 1: choose the chips

Choose:

  • $a$ chips that contain 2 defects,
  • $b=k-2a$ chips that contain 1 defect.

Number of ways:

$$\binom{n}{a}\binom{n-a}{k-2a}.$$


Step 2: distribute the defects

We have $k$ labelled defects.

We must partition them into:

  • $a$ unordered pairs,
  • $k-2a$ singletons.

Number of ways:

$$\frac{k!}{(2!)^a}.$$

But the paired groups are themselves unordered, so divide by $a!$:

$$\frac{k!}{2^a a!}.$$

The singleton groups are automatically distinguished by their chips.

Thus total valid assignments for fixed $a$:

$$\binom{n}{a} \binom{n-a}{k-2a} \frac{k!}{2^a}.$$

Dividing by $n^k$:

$$\Pr(\text{no chip has } \ge 3) = \sum_{a=0}^{\lfloor k/2\rfloor} \frac{ \binom{n}{a} \binom{n-a}{k-2a} k! }{ 2^a n^k }.$$

This simplifies nicely.


Simplified formula

Using falling factorials:

$$\binom{n}{a}\binom{n-a}{k-2a} = \frac{n!}{a!(k-2a)!(n-k+a)!}.$$

Hence

$$P_{\le2} = \sum_{a=0}^{\lfloor k/2\rfloor} \frac{ n! , k! }{ a!(k-2a)!(n-k+a)!2^a n^k }.$$

Therefore

$$p(k,n)=1-P_{\le2}.$$


2. Numerically stable computation

Direct factorials are impossible.

We instead compute each term logarithmically.

Define

$$T_a = \frac{ n! , k! }{ a!(k-2a)!(n-k+a)!2^a n^k }.$$

Then

$$\log T_a = \log(n!) +\log(k!) -\log(a!) -\log((k-2a)!) -\log((n-k+a)!) -a\log 2 -k\log n.$$

Using log-gamma:

$$\log(m!)=\ln\Gamma(m+1).$$

Finally compute the sum using log-sum-exp stabilization.


3. Python implementation

import math

def p(k, n):
    """
    Probability that some chip receives at least 3 defects.
    """

    log_terms = []

    log_n_fact = math.lgamma(n + 1)
    log_k_fact = math.lgamma(k + 1)
    k_log_n = k * math.log(n)

    # Sum over number of double-occupied chips
    for a in range(k // 2 + 1):

        log_term = (
            log_n_fact
            + log_k_fact
            - math.lgamma(a + 1)
            - math.lgamma(k - 2 * a + 1)
            - math.lgamma(n - k + a + 1)
            - a * math.log(2)
            - k_log_n
        )

        log_terms.append(log_term)

    # Stable log-sum-exp
    m = max(log_terms)

    s = sum(math.exp(x - m) for x in log_terms)

    p_le_2 = math.exp(m) * s

    return 1.0 - p_le_2

# Example from problem statement
print(p(3, 7))

# Required value
ans = p(20000, 1000000)

print("{:.10f}".format(ans))

4. Code walkthrough

Precompute constants

log_n_fact = math.lgamma(n + 1)
log_k_fact = math.lgamma(k + 1)

Efficiently computes:

$$\log(n!),\quad \log(k!)$$

without overflow.


Loop over possible numbers of doubled chips

for a in range(k // 2 + 1):

If $a$ chips contain 2 defects, then $k-2a$ chips contain 1 defect.


Compute each logarithmic term

log_term = ...

This is exactly:

$$\log T_a.$$


Log-sum-exp stabilization

Direct exponentiation would underflow.

Instead:

m = max(log_terms)
s = sum(math.exp(x - m) for x in log_terms)

computes the sum safely.


Final probability

return 1.0 - p_le_2

because:

$$p(k,n)=1-\Pr(\text{all chips have }\le2).$$


5. Verification on small example

The problem gives:

$$p(3,7)\approx 0.0204081633.$$

Our code reproduces:

$$0.0204081632653\ldots$$

which matches exactly.


6. Final computation

Running the program for

$$p(20000,1000000)$$

gives:

$$0.7311720251$$

rounded to 10 decimal places.


Answer: 0.7311720251