Project Euler Problem 187

A composite is a number containing at least two prime factors.

Project Euler Problem 187

Solution

Answer: 17427258

We are counting the integers

$$n < 10^8$$

that have exactly two prime factors, counted with multiplicity.

Such numbers are called semiprimes.

Examples:

  • $15 = 3\cdot 5$
  • $9 = 3^2$
  • $12 = 2^2\cdot 3$ → not allowed (three prime factors counting multiplicity)

So we want all numbers of the form

$$n = pq$$

where $p,q$ are primes and $p \le q$.


Mathematical analysis

We need to count pairs of primes $(p,q)$ such that

$$pq < 10^8.$$

To avoid double-counting, enforce

$$p \le q.$$

Thus for each prime $p$, we count primes $q$ satisfying

$$q \ge p,\qquad q < \frac{10^8}{p}.$$


Key observation

If $p > \sqrt{10^8} = 10^4$, then

$$p^2 > 10^8,$$

so no valid $q \ge p$ exists.

Therefore we only need primes

$$p \le 10^4.$$

For each such prime $p$:

  1. Let

$$M = \left\lfloor \frac{10^8-1}{p} \right\rfloor.$$ 2. Count primes $q$ with

$$p \le q \le M.$$

If $\pi(x)$ denotes the number of primes $\le x$, and if $p$ is the $i$-th prime (0-indexed), then the number of admissible $q$ is

$$\pi(M) - i.$$

Summing over all valid $p$ gives the answer.


Efficient algorithm

We need:

  1. All primes up to $10^8 / 2 = 5\times10^7$.
  2. Fast prime counting $\pi(x)$.

A sieve of Eratosthenes up to $5\times10^7$ is feasible in optimized Python/PyPy, but conceptually the method is straightforward.

Pseudo-process:

  • Generate all primes up to $5\times10^7$.

  • For each prime $p \le 10^4$:

  • Find how many primes are $\le (10^8-1)/p$.

  • Subtract the index of $p$ to enforce $q \ge p$.

Complexity is dominated by the sieve.


Python implementation

from bisect import bisect_right
from math import isqrt

LIMIT = 10**8

# We only need primes up to LIMIT // 2
MAXP = LIMIT // 2

# Sieve of Eratosthenes
sieve = bytearray(b"\x01") * (MAXP + 1)
sieve[0:2] = b"\x00\x00"

for i in range(2, isqrt(MAXP) + 1):
    if sieve[i]:
        step = i
        start = i * i
        sieve[start:MAXP + 1:step] = b"\x00" * (((MAXP - start) // step) + 1)

# Extract primes
primes = [i for i in range(2, MAXP + 1) if sieve[i]]

answer = 0

# Only need p <= sqrt(LIMIT)
root = isqrt(LIMIT)

for i, p in enumerate(primes):
    if p > root:
        break

    max_q = (LIMIT - 1) // p

    # Number of primes <= max_q
    count = bisect_right(primes, max_q)

    # Enforce q >= p
    answer += count - i

print(answer)

Code walkthrough

1. Define the limit

LIMIT = 10**8

We seek semiprimes below $10^8$.


2. Build the sieve

MAXP = LIMIT // 2

If $p,q \ge 2$, then one factor cannot exceed $10^8/2$.

The sieve marks primes up to $5\times10^7$.


3. Standard sieve of Eratosthenes

for i in range(2, isqrt(MAXP) + 1):

For each prime $i$, mark multiples composite beginning at $i^2$.


4. Extract all primes

primes = [i for i in range(2, MAXP + 1) if sieve[i]]

Now primes is an ordered list of all primes up to $5\times10^7$.


5. Iterate over the smaller factor

for i, p in enumerate(primes):
    if p > root:
        break

Only primes $p \le 10^4$ matter.


6. Largest valid partner prime

max_q = (LIMIT - 1) // p

We need

$$pq < 10^8.$$


7. Count eligible $q$

count = bisect_right(primes, max_q)

This gives the number of primes $\le max_q$.

Subtract i because we require $q \ge p$.

answer += count - i

Check against the small example

Below 30, semiprimes are:

$$4,6,9,10,14,15,21,22,25,26$$

Exactly 10 numbers.

The algorithm reproduces this count for LIMIT = 30.


Final verification

The count should be large:

  • Density of semiprimes up to $x$ is roughly

$$\frac{x \log\log x}{\log x},$$

so for $10^8$ a value around $10^7$ is plausible.

Running the computation gives:

$$17427258$$

which matches the known Project Euler result.

Answer: 17427258