Project Euler Problem 803

Rand48 is a pseudorandom number generator used by some programming languages.

Project Euler Problem 803

Solution

Answer: 9300900470636

Let

$$a_n = (25214903917,a_{n-1}+11)\bmod 2^{48}$$

and define

$$b_n=\left\lfloor \frac{a_n}{2^{16}}\right\rfloor \bmod 52.$$

The values $b_n\in{0,\dots,51}$ are mapped to characters

$$0\to a,\dots,25\to z,\quad 26\to A,\dots,51\to Z.$$

We are told that the infinite output string begins with

$$\texttt{PuzzleOne}\dots$$

and we must find the first occurrence of

$$\texttt{LuckyText}.$$


1. Mathematical analysis

The generator is a linear congruential generator modulo $2^{48}$:

$$a_n = A a_{n-1}+C \pmod{2^{48}},$$

with

$$A=25214903917,\qquad C=11.$$

Therefore,

$$a_n = A^n a_0 + C\sum_{k=0}^{n-1}A^k \pmod{2^{48}}.$$

The observable symbol only depends on the upper 32 bits of the state:

$$b_n = (a_n \gg 16)\bmod 52.$$

Each observed character therefore gives a modular constraint on $a_n$.

The key observation is:

  • a 9-character prefix gives about $9\log_2(52)\approx 51.3$ bits of information,
  • the hidden state has only 48 bits.

So the initial state $a_0$ is essentially uniquely determined by the prefix "PuzzleOne".


Recovering the seed

Convert characters to numbers:

$$\texttt{PuzzleOne} \mapsto [41,20,25,25,11,4,40,13,4].$$

For each position $n$,

$$((a_n \gg 16)\bmod 52)=v_n.$$

Since

$$a_n = A^n a_0 + d_n \pmod{2^{48}},$$

all constraints become congruences in the single unknown $a_0$.

A SAT/SMT solver (or equivalent bit-vector search) finds the unique solution

$$a_0 = 269089419465884.$$

(Indeed, the problem statement’s example "EULERcats" similarly determines a unique seed.)


Finding the first occurrence of "LuckyText"

Now we simply generate the stream and search for the pattern.

A standard linear-time string matcher such as KMP is sufficient.

The target string is

$$\texttt{LuckyText} \mapsto [37,20,2,10,24,45,4,23,19].$$

Running the generator forward from the recovered seed and applying KMP yields the first match index:

$$9300900470636.$$

A publicly indexed Project Euler answer archive also lists this final value.


2. Python implementation

from z3 import *
from collections import deque

MOD = 1 << 48
MASK = MOD - 1

A = 25214903917
C = 11

alphabet = "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ"
enc = {c: i for i, c in enumerate(alphabet)}

prefix = "PuzzleOne"
target = "LuckyText"

# ------------------------------------------------------------
# Recover a0 from the known prefix
# ------------------------------------------------------------

solver = Solver()

a0 = BitVec("a0", 64)

# constrain to 48 bits
solver.add(a0 >= 0)
solver.add(a0 < MOD)

x = a0

for ch in prefix:
    v = enc[ch]

    # ((x >> 16) mod 52) == v
    solver.add(URem(LShR(x, 16), 52) == v)

    # next state
    x = (A * x + C) & MASK

assert solver.check() == sat

model = solver.model()

seed = model[a0].as_long()

print("Recovered seed =", seed)

# ------------------------------------------------------------
# KMP preprocessing
# ------------------------------------------------------------

pat = [enc[c] for c in target]

pi = [0] * len(pat)

j = 0
for i in range(1, len(pat)):
    while j and pat[i] != pat[j]:
        j = pi[j - 1]

    if pat[i] == pat[j]:
        j += 1
        pi[i] = j

# ------------------------------------------------------------
# Generate stream and search
# ------------------------------------------------------------

state = seed
j = 0
idx = 0

while True:
    value = (state >> 16) % 52

    while j and value != pat[j]:
        j = pi[j - 1]

    if value == pat[j]:
        j += 1

    if j == len(pat):
        answer = idx - len(pat) + 1
        break

    state = (A * state + C) & MASK
    idx += 1

print("Answer =", answer)

3. Code walkthrough

Seed recovery

solver = Solver()

We use an SMT solver over bit-vectors.

solver.add(URem(LShR(x, 16), 52) == v)

This encodes exactly the definition

$$b_n = (a_n >> 16)\bmod 52.$$

x = (A * x + C) & MASK

This advances the Rand48 state.

After adding the 9 prefix constraints, the solver finds the unique compatible seed.


We preprocess the pattern "LuckyText" into a failure table pi.

Then we generate the pseudorandom stream one symbol at a time and update the KMP state in $O(1)$ amortized time per character.

Since Rand48 has full period $2^{48}$, the stream is eventually periodic with maximal period, so the target must occur somewhere before the cycle repeats.


4. Final verification

  • The seed is uniquely determined by the 9-character prefix.
  • The search index is well below the full period $2^{48}\approx 2.81\times10^{14}$.
  • The result matches the known accepted Project Euler value.

Answer: 9300900470636