Project Euler Problem 50

The prime 41, can be written as the sum of six consecutive primes: This is the longest sum of consecutive primes that ad

Project Euler Problem 50

Solution

Answer: 997651

1. Mathematical analysis

We want the prime below $10^6$ that can be written as the sum of the most consecutive primes.

For example:

$$41 = 2+3+5+7+11+13$$

which uses 6 consecutive primes.

A brute force approach would be:

  • Generate all primes below one million.
  • Try every consecutive subsequence.
  • Check whether the sum is prime.

But naively this is too slow because there are tens of thousands of primes below $10^6$.

Step 1: Generate all primes efficiently

We use the Sieve of Eratosthenes to generate every prime below one million.

There are only:

$$\pi(10^6)=78498$$

primes below one million, which is manageable.

We also store them in a set for $O(1)$ primality lookup.


Step 2: Consecutive sums using prefix sums

Suppose the primes are:

$$p_0,p_1,p_2,\dots$$

Define a prefix sum array:

$$S_k = \sum_{i=0}^{k-1} p_i$$

with:

$$S_0=0$$

Then any consecutive prime sum becomes:

$$p_i+p_{i+1}+\cdots+p_{j-1} = S_j-S_i$$

This is the key optimization.

Instead of recomputing sums repeatedly, we can obtain any interval sum in constant time.


Step 3: Search for the longest valid chain

We seek the largest length $L$ such that:

$$S_j - S_i$$

is:

  1. Less than $10^6$
  2. Prime

for some $j-i=L$.

Optimization ideas:

(a) Only check longer sequences than current best

If we already found a sequence of length 100, there is no point checking lengths below 100.

(b) Stop when sums exceed one million

Because primes are increasing, once a sum exceeds the limit, extending the interval only makes it larger.

This keeps the search efficient.


2. Python implementation

LIMIT = 1_000_000

# -----------------------------
# Sieve of Eratosthenes
# -----------------------------
is_prime = [True] * LIMIT
is_prime[0] = is_prime[1] = False

for i in range(2, int(LIMIT ** 0.5) + 1):
    if is_prime[i]:
        for multiple in range(i * i, LIMIT, i):
            is_prime[multiple] = False

# List of primes
primes = [i for i, prime in enumerate(is_prime) if prime]

# Fast primality lookup
prime_set = set(primes)

# -----------------------------
# Prefix sums
# prefix[i] = sum of first i primes
# -----------------------------
prefix = [0]

for p in primes:
    prefix.append(prefix[-1] + p)

# -----------------------------
# Find longest consecutive sum
# -----------------------------
best_length = 0
best_prime = 0

n = len(primes)

for start in range(n):
    # No need to check lengths shorter than current best
    for end in range(start + best_length + 1, n + 1):

        current_sum = prefix[end] - prefix[start]

        # Stop if over limit
        if current_sum >= LIMIT:
            break

        # Check if prime
        if current_sum in prime_set:
            best_length = end - start
            best_prime = current_sum

print(best_prime)

3. Code walkthrough

Sieve construction

is_prime = [True] * LIMIT

We create a boolean array.

Then eliminate composite numbers:

for i in range(2, int(LIMIT ** 0.5) + 1):

We only need to sieve up to $\sqrt{10^6}$.

For each prime $i$, mark multiples as composite.


Prime list

primes = [i for i, prime in enumerate(is_prime) if prime]

This gives all primes below one million.

We also create:

prime_set = set(primes)

for fast membership testing.


Prefix sums

We define:

prefix = [0]

Then accumulate:

prefix.append(prefix[-1] + p)

Example:

For primes:

$$2,3,5,7,11$$

prefix becomes:

$$[0,2,5,10,17,28]$$

So:

$$3+5+7 = 17-2 = 15$$

using:

prefix[4] - prefix[1]

Searching for the longest sequence

For every starting position:

for start in range(n):

we try ending positions.

The crucial optimization:

start + best_length + 1

means we only test sequences longer than the current best.

We compute the sum instantly:

current_sum = prefix[end] - prefix[start]

If:

current_sum >= LIMIT

we stop immediately because longer intervals only increase the sum.

If the sum is prime:

if current_sum in prime_set:

we update the best result.


Check against examples

Below 100

The algorithm finds:

$$41$$

with 6 terms:

$$2+3+5+7+11+13$$

Correct.

Below 1000

It finds:

$$953$$

with 21 terms.

Also matches the problem statement.


4. Final verification

The result should:

  • Be a prime below $10^6$ ✓
  • Be expressible as a long consecutive prime sum ✓
  • Exceed the smaller examples substantially ✓
  • Match the known Project Euler benchmark ✓

The longest chain below one million produces:

Answer: 997651