Project Euler Problem 467

An integer s is called a superinteger of another integer n if the digits of n form a subsequenceA subsequence is a seque

Project Euler Problem 467

Solution

Answer: 775181359

Let

  • $A = P^D$: digital roots of the primes,
  • $B = C^D$: digital roots of the composites.

We want the smallest common superinteger of $P_n$ and $C_n$, i.e. the shortest digit string containing both as subsequences, and among equal lengths, the numerically smallest one.

This is exactly the Shortest Common Supersequence (SCS) problem.

Key insight

For two sequences $A,B$:

$$|SCS(A,B)| = |A| + |B| - |LCS(A,B)|$$

where LCS is the Longest Common Subsequence.

To construct the smallest integer, we reconstruct the SCS using dynamic programming:

  • If digits match, take it once.

  • Otherwise:

  • Prefer the branch with larger remaining LCS (shorter total SCS).

  • If tied, choose the smaller digit (to minimize the resulting integer lexicographically).

Since $n=10,000$, an $O(n^2)$ DP is feasible:

$$10^8 \text{ operations}$$

using compact storage for LCS lengths.

Python implementation

from math import isqrt
from array import array

MOD = 1_000_000_007
N = 10_000

def digital_root(x):
    return 1 + (x - 1) % 9

# Generate first N primes and composites (as digital roots)
limit = 200000

while True:
    sieve = bytearray(b"\x01") * (limit + 1)
    sieve[0:2] = b"\x00\x00"

    for p in range(2, isqrt(limit) + 1):
        if sieve[p]:
            sieve[p * p : limit + 1 : p] = (
                b"\x00" * (((limit - p * p) // p) + 1)
            )

    P = []
    C = []

    for x in range(2, limit + 1):
        dr = digital_root(x)

        if sieve[x]:
            if len(P) < N:
                P.append(dr)
        else:
            if len(C) < N:
                C.append(dr)

        if len(P) == N and len(C) == N:
            break

    if len(P) == N and len(C) == N:
        break

    limit *= 2

# LCS DP table
dp = [array('H', [0]) * (N + 1) for _ in range(N + 1)]

for i in range(N - 1, -1, -1):
    row = dp[i]
    next_row = dp[i + 1]
    ai = P[i]

    for j in range(N - 1, -1, -1):
        if ai == C[j]:
            row[j] = next_row[j + 1] + 1
        else:
            row[j] = max(next_row[j], row[j + 1])

# Reconstruct lexicographically smallest SCS modulo MOD
i = j = 0
ans = 0

while i < N or j < N:

    if i == N:
        d = C[j]
        j += 1

    elif j == N:
        d = P[i]
        i += 1

    elif P[i] == C[j]:
        d = P[i]
        i += 1
        j += 1

    else:
        down = dp[i + 1][j]
        right = dp[i][j + 1]

        if down > right:
            d = P[i]
            i += 1

        elif right > down:
            d = C[j]
            j += 1

        else:
            # same minimum length -> smallest digit
            if P[i] <= C[j]:
                d = P[i]
                i += 1
            else:
                d = C[j]
                j += 1

    ans = (ans * 10 + d) % MOD

print(ans)

Verification

The reconstruction reproduces the examples:

  • $f(10)=2357246891352679$
  • $f(100)\bmod 1,000,000,007 = 771661825$

matching the problem statement exactly.

Running the same algorithm for $n=10,000$ gives:

Answer: 775181359