Project Euler Problem 657
In the context of formal languages, any finite sequence of letters of a given alphabet Sigma is called a word over Sigma
Solution
Answer: 219493139
Using inclusion–exclusion, the number of incomplete words of exact length $k$ over an alphabet of size $\alpha$ is
$$\sum_{j=1}^{\alpha}(-1)^{j+1}\binom{\alpha}{j}(\alpha-j)^k.$$
Summing over all lengths $0\le k\le n$,
$$I(\alpha,n) = \sum_{j=1}^{\alpha} (-1)^{j+1}\binom{\alpha}{j} \sum_{k=0}^{n}(\alpha-j)^k.$$
Let $i=\alpha-j$. Then
$$I(\alpha,n) = \sum_{i=0}^{\alpha-1} (-1)^{\alpha-i+1} \binom{\alpha}{i} \cdot \frac{i^{n+1}-1}{i-1},$$
with the special cases:
- $i=0$: contribution $=1$,
- $i=1$: contribution $=n+1$.
For the target values
$$\alpha=10^7,\qquad n=10^{12},$$
we compute the sum modulo $10^9+7$, updating binomial coefficients iteratively and using fast modular exponentiation. This runs in about one second in optimized code. A direct implementation gives the result:
Answer: 219493139