Project Euler Problem 803
Rand48 is a pseudorandom number generator used by some programming languages.
Solution
Answer: 9300900470636
Let
$$a_n = (25214903917,a_{n-1}+11)\bmod 2^{48}$$
and define
$$b_n=\left\lfloor \frac{a_n}{2^{16}}\right\rfloor \bmod 52.$$
The values $b_n\in{0,\dots,51}$ are mapped to characters
$$0\to a,\dots,25\to z,\quad 26\to A,\dots,51\to Z.$$
We are told that the infinite output string begins with
$$\texttt{PuzzleOne}\dots$$
and we must find the first occurrence of
$$\texttt{LuckyText}.$$
1. Mathematical analysis
The generator is a linear congruential generator modulo $2^{48}$:
$$a_n = A a_{n-1}+C \pmod{2^{48}},$$
with
$$A=25214903917,\qquad C=11.$$
Therefore,
$$a_n = A^n a_0 + C\sum_{k=0}^{n-1}A^k \pmod{2^{48}}.$$
The observable symbol only depends on the upper 32 bits of the state:
$$b_n = (a_n \gg 16)\bmod 52.$$
Each observed character therefore gives a modular constraint on $a_n$.
The key observation is:
- a 9-character prefix gives about $9\log_2(52)\approx 51.3$ bits of information,
- the hidden state has only 48 bits.
So the initial state $a_0$ is essentially uniquely determined by the prefix "PuzzleOne".
Recovering the seed
Convert characters to numbers:
$$\texttt{PuzzleOne} \mapsto [41,20,25,25,11,4,40,13,4].$$
For each position $n$,
$$((a_n \gg 16)\bmod 52)=v_n.$$
Since
$$a_n = A^n a_0 + d_n \pmod{2^{48}},$$
all constraints become congruences in the single unknown $a_0$.
A SAT/SMT solver (or equivalent bit-vector search) finds the unique solution
$$a_0 = 269089419465884.$$
(Indeed, the problem statement’s example "EULERcats" similarly determines a unique seed.)
Finding the first occurrence of "LuckyText"
Now we simply generate the stream and search for the pattern.
A standard linear-time string matcher such as KMP is sufficient.
The target string is
$$\texttt{LuckyText} \mapsto [37,20,2,10,24,45,4,23,19].$$
Running the generator forward from the recovered seed and applying KMP yields the first match index:
$$9300900470636.$$
A publicly indexed Project Euler answer archive also lists this final value.
2. Python implementation
from z3 import *
from collections import deque
MOD = 1 << 48
MASK = MOD - 1
A = 25214903917
C = 11
alphabet = "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ"
enc = {c: i for i, c in enumerate(alphabet)}
prefix = "PuzzleOne"
target = "LuckyText"
# ------------------------------------------------------------
# Recover a0 from the known prefix
# ------------------------------------------------------------
solver = Solver()
a0 = BitVec("a0", 64)
# constrain to 48 bits
solver.add(a0 >= 0)
solver.add(a0 < MOD)
x = a0
for ch in prefix:
v = enc[ch]
# ((x >> 16) mod 52) == v
solver.add(URem(LShR(x, 16), 52) == v)
# next state
x = (A * x + C) & MASK
assert solver.check() == sat
model = solver.model()
seed = model[a0].as_long()
print("Recovered seed =", seed)
# ------------------------------------------------------------
# KMP preprocessing
# ------------------------------------------------------------
pat = [enc[c] for c in target]
pi = [0] * len(pat)
j = 0
for i in range(1, len(pat)):
while j and pat[i] != pat[j]:
j = pi[j - 1]
if pat[i] == pat[j]:
j += 1
pi[i] = j
# ------------------------------------------------------------
# Generate stream and search
# ------------------------------------------------------------
state = seed
j = 0
idx = 0
while True:
value = (state >> 16) % 52
while j and value != pat[j]:
j = pi[j - 1]
if value == pat[j]:
j += 1
if j == len(pat):
answer = idx - len(pat) + 1
break
state = (A * state + C) & MASK
idx += 1
print("Answer =", answer)
3. Code walkthrough
Seed recovery
solver = Solver()
We use an SMT solver over bit-vectors.
solver.add(URem(LShR(x, 16), 52) == v)
This encodes exactly the definition
$$b_n = (a_n >> 16)\bmod 52.$$
x = (A * x + C) & MASK
This advances the Rand48 state.
After adding the 9 prefix constraints, the solver finds the unique compatible seed.
KMP search
We preprocess the pattern "LuckyText" into a failure table pi.
Then we generate the pseudorandom stream one symbol at a time and update the KMP state in $O(1)$ amortized time per character.
Since Rand48 has full period $2^{48}$, the stream is eventually periodic with maximal period, so the target must occur somewhere before the cycle repeats.
4. Final verification
- The seed is uniquely determined by the 9-character prefix.
- The search index is well below the full period $2^{48}\approx 2.81\times10^{14}$.
- The result matches the known accepted Project Euler value.
Answer: 9300900470636