Project Euler Problem 316
Let p = p1 p2 p3 cdots be an infinite sequence of random digits, selected from 0,1,2,3,4,5,6,7,8,9 with equal probabilit
Solution
Answer: 542934735751917735
Let the target decimal string be $w=w_1w_2\dots w_m$, where $m$ is the number of digits of $n$.
We scan the random digit stream until the first occurrence of $w$.
This is the classical “pattern matching in a random stream” problem, solved using the automaton/KMP overlap structure.
1. Mathematical analysis
For a word $w$, define a border to be a nonempty string that is simultaneously:
- a prefix of $w$,
- and a suffix of $w$.
If a border has length $b$, then after partially matching the pattern and failing, we can reuse that overlap.
A standard result from automata / renewal theory gives:
$$E(w)=\sum_{b \in B(w)} 10^b,$$
where:
- $E(w)$ is the expected number of digits needed until the pattern is completed,
- $B(w)$ is the set of border lengths including the full length $m$.
However, the problem defines $g(n)$ as the expected starting position $k$, not the finishing position.
If the match finishes at position $t$, then
$$k=t-m+1.$$
Hence
$$g(n)=E(w)-m+1.$$
Therefore:
$$\boxed{ g(n)=\sum_{b \in B(w)} 10^b - m + 1 }$$
where $w$ is the decimal expansion of $n$.
Example: $n=535$
Borders of "535":
"5"(length $1$)"535"(length $3$)
Thus
$$g(535)=10^1+10^3-3+1 =10+1000-2 =1008.$$
This matches the statement.
2. Efficient computation
We must evaluate
$$\sum_{n=2}^{999999} g!\left(\left\lfloor\frac{10^{16}}{n}\right\rfloor\right).$$
For each value:
- Convert to decimal string.
- Compute all border lengths using the KMP prefix-function.
- Sum $10^b$ over all borders.
- Subtract $m-1$.
The total work is tiny because each string has at most 16 digits.
3. Python implementation
def g_value(x):
s = str(x)
m = len(s)
# KMP prefix function
pi = [0] * m
for i in range(1, m):
j = pi[i - 1]
while j > 0 and s[i] != s[j]:
j = pi[j - 1]
if s[i] == s[j]:
j += 1
pi[i] = j
# Full word always contributes 10^m
ans = 10 ** m - m + 1
# Add contributions from proper borders
j = pi[-1]
while j > 0:
ans += 10 ** j
j = pi[j - 1]
return ans
total = 0
for n in range(2, 1_000_000):
total += g_value(10 ** 16 // n)
print(total)
4. Code walkthrough
g_value(x)
-
Converts the number into its decimal string.
-
Computes the KMP prefix table:
-
pi[i]= length of the longest proper prefix which is also a suffix fors[:i+1].
Example for "535":
- prefix-function ends with value
1, - corresponding to border
"5".
We then:
- add $10^m$ for the full pattern,
- walk backward through the border chain using the prefix table,
- add $10^b$ for every proper border length $b$,
- subtract $m-1$.
5. Verification on the given test
Running
sum(g_value(10**6 // n) for n in range(2, 1000))
produces
$$27280188,$$
exactly matching the problem statement.
So the method is correct.
The final computation gives:
$$\boxed{542934735751917735}$$
Answer: 542934735751917735