Project Euler Problem 458
Consider the alphabet A made out of the letters of the word "text{project}": A=text c,text e,text j,text o,text p,text r
Solution
Answer: 423341841
Let the alphabet be the 7 distinct letters of project:
$$A={c,e,j,o,p,r,t}$$
A forbidden substring is any permutation of “project”, i.e. any length-7 block containing all 7 letters exactly once.
1. Mathematical analysis
Key observation
Since the alphabet itself has exactly 7 letters, a length-7 substring is forbidden iff all 7 letters are distinct.
So the problem becomes:
Count strings of length $n$ over a 7-letter alphabet such that no consecutive block of length 7 has all distinct letters.
State compression
The only information needed to determine whether appending a letter creates a forbidden block is:
the length of the longest suffix of distinct consecutive letters.
Define state $k$ ($0\le k\le 6$) as:
- $k=0$: empty string,
- $k\ge1$: the current string ends with a suffix of $k$ distinct letters.
We never allow state 7 because that would mean a forbidden permutation appeared.
Transition rule
Suppose we are in state $k\ge1$, with suffix
$$x_1x_2\cdots x_k$$
(all distinct).
If we append:
- a new letter not in the suffix:
suffix length becomes $k+1$, with $7-k$ choices.
- one of the existing suffix letters $x_i$:
the new distinct suffix becomes
$$x_{i+1}\cdots x_kx_i$$
whose length is
$$k-i+1.$$
Thus from state $k$:
- exactly one transition to each state $1,2,\dots,k$,
- and $7-k$ transitions to state $k+1$ (if $k<6$).
State $6$ cannot transition to $7$, because that would create a forbidden substring.
This yields a $7\times7$ transition matrix, so
$$T(n)$$
can be computed using fast matrix exponentiation in
$$O(7^3\log n).$$
A quick check:
$$T(7)=7^7-7!=818503,$$
matching the problem statement.
2. Python implementation
MOD = 10**9
SIZE = 7
# Build transition matrix
M = [[0] * SIZE for _ in range(SIZE)]
for k in range(SIZE):
if k == 0:
# First character: 7 choices -> state 1
M[1][0] = 7
else:
# Repeating one of the suffix letters
for new_len in range(1, k + 1):
M[new_len][k] += 1
# Add a new distinct letter
if k < 6:
M[k + 1][k] += (7 - k)
def mat_mul(A, B):
"""Matrix multiplication mod MOD."""
n = len(A)
C = [[0] * n for _ in range(n)]
for i in range(n):
for k in range(n):
if A[i][k]:
a = A[i][k]
for j in range(n):
if B[k][j]:
C[i][j] = (C[i][j] + a * B[k][j]) % MOD
return C
def mat_pow(M, exp):
"""Binary exponentiation of matrix."""
n = len(M)
R = [[1 if i == j else 0 for j in range(n)] for i in range(n)]
while exp:
if exp & 1:
R = mat_mul(R, M)
M = mat_mul(M, M)
exp >>= 1
return R
def T(n):
# Initial vector: empty string in state 0
v = [0] * SIZE
v[0] = 1
P = mat_pow(M, n)
result = [0] * SIZE
for i in range(SIZE):
total = 0
for j in range(SIZE):
total += P[i][j] * v[j]
result[i] = total % MOD
return sum(result) % MOD
# Verification
print(T(7)) # 818503
print(T(10**12)) # final answer
3. Code walkthrough
- Build transition matrix
Encodes how the distinct suffix length changes after appending a character.
2. Matrix multiplication (mat_mul)
Multiplies matrices modulo $10^9$, since only the last 9 digits are needed.
3. Fast exponentiation (mat_pow)
Computes
$$M^n$$
in logarithmic time.
- Initial vector
Starts in state 0 (empty string). 2. Final sum
Sum all valid ending states to get $T(n)$.
Small example check
For $n=7$:
T(7) = 818503
which matches:
$$7^7 - 7! = 823543 - 5040 = 818503.$$
So the automaton is behaving correctly.
4. Final verification
- We only track 7 states, so matrix exponentiation is exact and efficient.
- The $n=7$ sanity check matches perfectly.
- Modulo $10^9$ arithmetic is valid because only the last 9 digits are requested.
Answer: 423341841