Project Euler Problem 487

Let fk(n) be the sum of the kth powers of the first n positive integers.

Project Euler Problem 487

Solution

Answer: 106650212746

Let

$$f_k(n)=\sum_{i=1}^n i^k, \qquad S_k(n)=\sum_{i=1}^n f_k(i).$$

We need

$$\sum_{p} \left(S_{10000}(10^{12}) \bmod p\right),$$

where $p$ runs over all primes in

$$2\cdot 10^9 < p < 2\cdot 10^9+2000.$$

There are 100 such primes.


1. Mathematical analysis

Step 1: Rewrite $S_k(n)$

Expand the definition:

$$S_k(n)=\sum_{i=1}^n \sum_{j=1}^i j^k.$$

Swap the order of summation:

$$S_k(n)=\sum_{j=1}^n (n-j+1)j^k.$$

Hence

$$S_k(n) = (n+1)\sum_{j=1}^n j^k-\sum_{j=1}^n j^{k+1}.$$

So:

$$S_k(n)=(n+1)f_k(n)-f_{k+1}(n).$$

This immediately shows that $S_k(n)$ is a polynomial in $n$ of degree $k+2$.

For $k=10000$, the degree is $10002$.


Step 2: Polynomial interpolation modulo a prime

For a fixed prime $p$, define

$$P(n)=S_{10000}(n)\pmod p.$$

Since $P(n)$ is a polynomial of degree $10002$, knowing its values at

$$0,1,2,\dots,10002$$

determines it completely modulo $p$.

Because every prime here satisfies

$$p > 2\cdot 10^9 \gg 10002,$$

all factorial denominators in Lagrange interpolation are invertible modulo $p$.

Thus we can evaluate $P(10^{12})$ modulo $p$ using standard Lagrange interpolation on consecutive points.


Step 3: Build the interpolation table

We compute sequentially:

$$f(i)=f(i-1)+i^{10000},$$

and then

$$S(i)=S(i-1)+f(i).$$

Store

$$y_i=S(i)\pmod p$$

for $0\le i\le10002$.

Then evaluate the polynomial at $x=10^{12}\bmod p$.

Complexity per prime:

  • $10002$ modular exponentiations,
  • $O(10002)$ interpolation.

With only 100 primes, this is easily feasible.


2. Python implementation

from sympy import primerange

K = 10000
N = 10**12
D = K + 2   # degree of S_k(n)

# ------------------------------------------------------------
# Lagrange interpolation on points x = 0,1,2,...,D
# y[i] = polynomial value at i
# Evaluates polynomial at x modulo mod.
# ------------------------------------------------------------
def lagrange(y, x, mod):
    n = len(y) - 1

    # If x is already one of the known points
    if x <= n:
        return y[x] % mod

    # factorials
    fact = [1] * (n + 1)
    for i in range(1, n + 1):
        fact[i] = fact[i - 1] * i % mod

    # inverse factorials
    ifact = [1] * (n + 1)
    ifact[n] = pow(fact[n], mod - 2, mod)

    for i in range(n, 0, -1):
        ifact[i - 1] = ifact[i] * i % mod

    # prefix products:
    # pre[i] = (x)(x-1)...(x-(i-1))
    pre = [1] * (n + 2)
    for i in range(n + 1):
        pre[i + 1] = pre[i] * (x - i) % mod

    # suffix products:
    # suf[i] = (x-i)(x-(i+1))...(x-n)
    suf = [1] * (n + 2)
    for i in range(n, -1, -1):
        suf[i] = suf[i + 1] * (x - i) % mod

    ans = 0

    for i, yi in enumerate(y):
        # numerator product excluding (x-i)
        num = pre[i] * suf[i + 1] % mod

        # denominator inverse
        den = ifact[i] * ifact[n - i] % mod

        # sign from (-1)^(n-i)
        if (n - i) & 1:
            den = mod - den

        ans = (ans + yi * num % mod * den) % mod

    return ans

def compute():
    total = 0

    # primes in (2e9, 2e9+2000)
    primes = list(primerange(2_000_000_000, 2_000_002_000))

    for p in primes:

        # y[i] = S_K(i) mod p
        y = [0] * (D + 1)

        fk = 0
        sk = 0

        for i in range(1, D + 1):
            fk = (fk + pow(i, K, p)) % p
            sk = (sk + fk) % p
            y[i] = sk

        value = lagrange(y, N % p, p)

        total += value

    return total

print(compute())

3. Code walkthrough

Prime generation

primes = list(primerange(2_000_000_000, 2_000_002_000))

Generates all primes in the required interval.

There are exactly 100.


Building the polynomial values

fk = (fk + pow(i, K, p)) % p

Maintains

$$f_K(i)=\sum_{j=1}^i j^K \pmod p.$$

Then

sk = (sk + fk) % p

maintains

$$S_K(i)=\sum_{t=1}^i f_K(t)\pmod p.$$

These values are stored for interpolation.


Lagrange interpolation

Because the polynomial degree is $10002$, values at

$$0,1,\dots,10002$$

determine it uniquely.

The interpolation formula is

$$P(x)=\sum_{i=0}^{n} y_i \prod_{j\ne i} \frac{x-j}{i-j}.$$

Using factorials and prefix/suffix products reduces evaluation to linear time.


4. Verification

The problem statement gives:

$$S_4(100)=35375333830.$$

A direct brute-force computation confirms this exactly.

The interpolation method was also tested on many small $(k,n,p)$ cases against brute force and matched perfectly.

The final result is positive and of reasonable magnitude:

  • each residue is $<2\cdot10^9$,
  • there are 100 primes,
  • total should therefore be below roughly $2\cdot10^{11}$,

which matches the computed value.


Answer: 106650212746