Project Euler Problem 874

Let p(t) denote the (t+1)th prime number.

Project Euler Problem 874

Solution

Answer: 4992775389

Let

$$p(t)=\text{the }(t+1)\text{th prime},$$

so

$$p(0)=2,\quad p(1)=3,\quad \dots$$

We must compute

$$M(7000,p(7000)).$$


1. Mathematical analysis

We are given:

  • $k=7000$,
  • $n=p(7000)$.

Since $p(t)$ is the $(t+1)$-th prime,

$$n=p(7000)=\text{7001st prime}=70663.$$

Also,

$$p(6999)=70657.$$

We seek the maximum possible value of

$$\sum_{i=1}^n p(a_i)$$

subject to:

  • $0\le a_i<7000$,
  • $\sum a_i\equiv 0\pmod{7000}$.

Step 1: Start from the obvious maximum

Without the divisibility condition, the best choice is

$$a_i=6999 \quad \forall i,$$

giving score

$$70663\cdot 70657.$$

But then

$$\sum a_i = 70663\cdot 6999.$$

Since $6999\equiv -1\pmod{7000}$,

$$70663\cdot 6999 \equiv -70663 \equiv 6337 \pmod{7000}.$$

So we must reduce the total sum by an amount congruent to $6337\pmod{7000}$.


Step 2: Convert to a minimization problem

Define deficits

$$d_i = 6999-a_i.$$

Then $d_i\ge 0$, and

$$\sum d_i \equiv 6337 \pmod{7000}.$$

The score becomes

$$\sum p(a_i) = n,p(6999)-\sum \bigl(p(6999)-p(6999-d_i)\bigr).$$

Define the penalty function

$$f(d)=p(6999)-p(6999-d).$$

Then maximizing the score is equivalent to minimizing

$$\sum f(d_i)$$

subject to

$$\sum d_i \equiv 6337 \pmod{7000}.$$

Because every prime gap is at least $2$,

$$f(d)\ge 2d.$$

Adding another full $7000$ to the deficit would increase the penalty by at least $14000$, so the optimal total deficit is the smallest admissible one:

$$\sum d_i = 6337.$$

Thus we only need:

Minimize $\sum f(d_i)$ subject to $\sum d_i=6337$.


Step 3: Compute the small penalties

The relevant primes near the top are:

$$\begin{aligned} p(6999)&=70657,\ p(6998)&=70639,\ p(6997)&=70627,\ p(6996)&=70621,\ p(6995)&=70619. \end{aligned}$$

Hence:

$$\begin{aligned} f(1)&=18,\ f(2)&=30,\ f(3)&=36,\ f(4)&=38. \end{aligned}$$

The crucial observation is:

$$\frac{f(4)}4=\frac{38}4=9.5$$

is the best average cost per unit deficit.

A computer check for all $1\le d<7000$ shows:

$$f(d)\ge \frac{19d}{2},$$

with equality only when $d$ is a multiple of $4$.

Since

$$6337\equiv 1\pmod 4,$$

we cannot use only $4$'s.

The best residue-$1$ option is

$$f(9)=86,$$

because

$$86-\frac{19\cdot 9}{2}=0.5$$

is minimal among all $d\equiv 1\pmod 4$.

Thus the optimum decomposition is:

$$6337 = 1582\cdot 4 + 9.$$

Total penalty:

$$1582\cdot 38 + 86 = 60116+86 = 60202.$$


Step 4: Final score

Therefore

$$M = 70663\cdot 70657 - 60202.$$

Compute:

$$70663\cdot 70657 = 4992835591.$$

Hence

$$M = 4992835591 - 60202 = 4992775389.$$


2. Python implementation

# Project Euler 874

def sieve(limit):
    """Return list of all primes <= limit."""
    is_prime = [True] * (limit + 1)
    is_prime[0] = is_prime[1] = False

    p = 2
    while p * p <= limit:
        if is_prime[p]:
            for x in range(p * p, limit + 1, p):
                is_prime[x] = False
        p += 1

    return [i for i, v in enumerate(is_prime) if v]

# We need primes up to around the 7001st prime (~70663)
primes = sieve(80000)

# p(t) = (t+1)th prime
def p(t):
    return primes[t]

k = 7000
n = p(7000)       # 7001st prime
top = p(6999)     # largest allowed prime score term

# Required deficit
D = (-n) % k

# Penalty function
# f(d) = p(6999) - p(6999-d)
f = [0] * (k)

for d in range(1, k):
    f[d] = top - p(6999 - d)

# Unbounded knapsack DP:
# dp[s] = minimum penalty to obtain total deficit s
INF = 10**18
dp = [INF] * (D + 1)
dp[0] = 0

for s in range(1, D + 1):
    best = INF
    for d in range(1, s + 1):
        best = min(best, dp[s - d] + f[d])
    dp[s] = best

penalty = dp[D]

answer = n * top - penalty

print(answer)

3. Code walkthrough

Prime generation

primes = sieve(80000)

Generates enough primes so that the 7001st prime is included.


Prime indexing

def p(t):
    return primes[t]

Because Python lists are 0-indexed, primes[t] is exactly the $(t+1)$-th prime.


Required deficit

D = (-n) % k

Since we initially set all $a_i=6999$,

$$\sum a_i \equiv -n \pmod{k}.$$

So we must subtract exactly $D$.


Penalty table

f[d] = top - p(6999 - d)

If we reduce an index by $d$, this is the amount the prime score decreases.


Dynamic programming

dp[s]

stores the minimum penalty to achieve total deficit $s$.

Transition:

dp[s] = min(dp[s-d] + f[d])

because we may add one more block of deficit $d$.


4. Small-example verification

For the example $M(2,5)$:

  • allowed indices are $0,1$,
  • primes are $2,3$,
  • best list is $[0,1,1,1,1]$,
  • sum of indices $=4$, divisible by $2$,
  • score $=2+3+3+3+3=14$.

Matches the statement.


5. Final verification

Our optimal decomposition:

$$6337 = 1582\cdot 4 + 9$$

uses:

  • 1582 variables reduced by $4$,
  • 1 variable reduced by $9$,
  • all remaining variables unchanged.

Penalty:

$$1582\cdot 38 + 86 = 60202.$$

Final score:

$$70663\cdot 70657 - 60202 = 4992775389.$$

Everything is consistent.


Answer: 4992775389