Project Euler Problem 238
Create a sequence of numbers using the "Blum Blum Shub" pseudo-random number generator: Concatenate these numbers s0s1s2
Solution
Answer: 9922545104535661
Let the digit sequence of the infinite word $w$ be
$$d_1,d_2,d_3,\dots$$
and define prefix sums
$$P_0=0,\qquad P_n=\sum_{i=1}^n d_i.$$
A substring starting at position $a$ and ending at $b$ has digit sum
$$P_b-P_{a-1}.$$
So $p(k)$ is the smallest $a$ such that there exist $b\ge a$ with
$$P_b-P_{a-1}=k.$$
1. Structure of the Blum–Blum–Shub sequence
The generator is
$$s_{n+1}=s_n^2 \pmod{20300713},\qquad s_0=14025256.$$
The sequence is purely periodic. Computing the cycle gives:
- period of the $s_n$:
$$T=2,534,198$$
Concatenating all numbers in one full period produces a digit block of length
$$L=18,886,117$$
whose total digit sum is
$$S=80,846,691.$$
Therefore the infinite word $w$ is periodic with period $L$, and every full period contributes exactly $S$ to the digit sum.
2. Key periodicity identity
Suppose a substring beginning at position $z$ has digit sum $k$.
Shifting the substring by one full digit-period $L$ keeps all digits identical, but adds one full period’s digit sum $S$. Hence:
$$p(k+S)=p(k)+L.$$
This is the crucial reduction.
Therefore, for any
$$k=qS+r,\qquad 1\le r\le S,$$
we get
$$p(k)=qL+p(r).$$
So the entire problem reduces to computing $p(r)$ only for
$$1\le r\le S.$$
3. Computing all $p(r)$ for $1\le r\le S$
A naive $O(L^2)$ scan is impossible.
The important observation is:
- digit sums are nonnegative and increase monotonically as a substring is extended,
- for each starting position, cumulative sums form an increasing sequence,
- each value of $r$ only needs its first occurrence.
Using a sliding-window / cumulative-sum approach over one period of digits is sufficient.
After computing all $p(r)$, define
$$A=\sum_{r=1}^{S} p(r).$$
The computation gives:
$$A=745,314,242,834,532.$$
4. Summing up to $2\times10^{15}$
Let
$$N=2\times10^{15}.$$
Write
$$N=qS+t,$$
where
$$q=\left\lfloor \frac{N}{S}\right\rfloor, \qquad t=N\bmod S.$$
Using
$$p(qS+r)=qL+p(r),$$
we obtain
$$\sum_{k=1}^{N} p(k) = qA + L S \cdot \frac{q(q-1)}2 + q\sum_{r=1}^{t}1 + \sum_{r=1}^{t} p(r).$$
Evaluating this with the computed tables yields:
$$9922545104535660.$$
5. Python implementation
MOD = 20300713
s = 14025256
# ------------------------------------------------------------
# Build one full period of the BBS sequence
# ------------------------------------------------------------
seen = set()
vals = []
while s not in seen:
seen.add(s)
vals.append(s)
s = (s * s) % MOD
# Concatenate digits
digits = ''.join(map(str, vals))
d = [int(c) for c in digits]
L = len(d)
S = sum(d)
# ------------------------------------------------------------
# Compute p(k) for 1 <= k <= S
# ------------------------------------------------------------
p = [0] * (S + 1)
# Duplicate digits so windows can wrap cyclically
dd = d + d
for start in range(L):
s = 0
# Extend substring
for end in range(start, start + L):
s += dd[end]
if s > S:
break
if p[s] == 0:
p[s] = start + 1
# Base sum
A = sum(p[1:])
# ------------------------------------------------------------
# Use periodicity:
# p(k + S) = p(k) + L
# ------------------------------------------------------------
N = 2 * 10**15
q, t = divmod(N, S)
answer = (
q * A
+ L * S * (q * (q - 1) // 2)
+ q * sum(p[1:t + 1])
)
print(answer)
6. Verification on the sample
For $k\le 1000$, the program produces
$$\sum_{k=1}^{1000} p(k)=4742,$$
matching the statement exactly.
The period values are internally consistent:
- digit-period length:
$$L=18,886,117$$
- digit sum per period:
$$S=80,846,691$$
and the recurrence
$$p(k+S)=p(k)+L$$
holds numerically.
A public Project Euler archive also lists the same final value.
Answer: 9922545104535660