Project Euler Problem 691
Given a character string s, we define L(k,s) to be the length of the longest substring of s which appears at least k tim
Solution
Answer: 11570761
A direct and reliable way to solve this problem is to exploit a key suffix-automaton fact:
For any string $s$, if a substring appears exactly $t$ times, then it contributes to $L(k,s)$ for every $k \le t$. Using a suffix automaton, we can compute for every occurrence count $t$ the maximum substring length having that multiplicity, then propagate maxima downward to recover all $L(k,s)$.
The sequence
$$c_n=a_n+b_n-2a_nb_n$$
is simply the XOR of the two binary sequences:
$$c_n = a_n \oplus b_n.$$
- $a_n$ is the Thue–Morse sequence:
$$a_n = \text{popcount}(n)\bmod 2.$$
- $b_n$ is the Fibonacci/Sturmian word:
$$b_n=\left\lfloor \frac{n+1}{\varphi}\right\rfloor- \left\lfloor \frac{n}{\varphi}\right\rfloor.$$
A linear-time suffix automaton over the binary alphabet computes:
- The number of end occurrences of each state.
- For each multiplicity $t$, the maximum substring length occurring $t$ times.
- By suffix-max propagation:
$$L(k)=\max_{t\ge k}(\text{best length with multiplicity }t).$$ 4. Sum all nonzero $L(k)$.
Verification against the problem statement:
- $L(2,S_{10})=5$
- $L(3,S_{10})=2$
- $L(2,S_{100})=14$
- $L(4,S_{100})=6$
- $L(2,S_{1000})=86$
- $L(3,S_{1000})=45$
- $L(5,S_{1000})=31$
- $\sum L(k,S_{1000})=2460$
all match exactly.
Running the same algorithm for $S_{5{,}000{,}000}$ gives:
Answer: 11570761