Project Euler Problem 480
Consider all the words which can be formed by selecting letters, in any order, from the phrase: thereisasyetinsufficient
Solution
Answer: turnthestarson
Let
$$S=\text{“thereisasyetinsufficientdataforameaningfulanswer”}$$
and let $c_\ell$ be the multiplicity of each letter in $S$.
The task is to work with all distinct words of length at most 15 that can be formed from the multiset of letters of $S$, ordered lexicographically.
1. Mathematical analysis
Counting completions of a prefix
Suppose we still have available counts
$$(c_1,c_2,\dots,c_m)$$
for the remaining letters.
We need a function:
$$T(\mathbf c, r)$$
= number of nonempty words of length at most $r$ that can be formed.
For a fixed length $k$, the number of distinct words is
$$k!\sum_{\substack{a_i\le c_i\ \sum a_i=k}} \frac{1}{a_1!a_2!\cdots a_m!}.$$
This is exactly the coefficient form of the generating function
$$\prod_i \left(\sum_{t=0}^{c_i}\frac{x^t}{t!}\right).$$
So for each reachable state we compute all counts for lengths $0\ldots 15$.
Ranking a word $P(w)$
To compute the lexicographic rank of a word $w$:
At each position:
- Try every smaller possible letter.
- Count all words beginning with that smaller prefix.
- If we continue past the current prefix, add 1 because the prefix itself is also a valid word and appears before its extensions lexicographically.
Example:
- $P(\text{aaaaaacdee})=10$
- $P(\text{euler})=115246685191495243$
matching the statement.
Unranking $W(p)$
To invert:
Given a rank $p$,
- iterate letters in alphabetical order,
- determine how many words lie in each prefix block,
- subtract blocks until the correct block is found,
- descend recursively.
This reconstructs the word at position $p$.
2. Python implementation
from collections import Counter
from functools import lru_cache
from fractions import Fraction
from math import factorial
S = "thereisasyetinsufficientdataforameaningfulanswer"
# Letter multiplicities
cnt = Counter(S)
alphabet = tuple(sorted(cnt))
initial = tuple(cnt[ch] for ch in alphabet)
index = {c: i for i, c in enumerate(alphabet)}
FACT = [factorial(i) for i in range(16)]
@lru_cache(None)
def total_words(state, rem):
"""
Number of NONEMPTY words of length <= rem
constructible from remaining multiset 'state'.
"""
# Generating-function DP:
# product (sum x^t / t!)
poly = [Fraction(1, 1)] + [Fraction(0, 1) for _ in range(rem)]
for c in state:
nxt = [Fraction(0, 1) for _ in range(rem + 1)]
for deg, val in enumerate(poly):
if not val:
continue
for t in range(min(c, rem - deg) + 1):
nxt[deg + t] += val / FACT[t]
poly = nxt
total = 0
for k in range(1, rem + 1):
total += int(poly[k] * FACT[k])
return total
def rank_word(word):
"""
Lexicographic position P(word), starting at 1.
"""
state = initial
rank = 1
for i, ch in enumerate(word):
# All smaller possible next letters
for j, c in enumerate(alphabet):
if c >= ch:
break
if state[j] == 0:
continue
ns = state[:j] + (state[j] - 1,) + state[j + 1:]
# prefix itself + all continuations
rank += 1 + total_words(ns, 14 - i)
# Current prefix itself appears before extensions
if i < len(word) - 1:
rank += 1
j = index[ch]
state = state[:j] + (state[j] - 1,) + state[j + 1:]
return rank
def unrank(rank):
"""
Inverse map W(rank).
"""
state = initial
word = ""
while True:
for j, c in enumerate(alphabet):
if state[j] == 0:
continue
ns = state[:j] + (state[j] - 1,) + state[j + 1:]
block = 1 + total_words(ns, 15 - len(word) - 1)
if rank > block:
rank -= block
else:
word += c
state = ns
if rank == 1:
return word
rank -= 1
break
target = (
rank_word("legionary")
+ rank_word("calorimeters")
- rank_word("annihilate")
+ rank_word("orchestrated")
- rank_word("fluttering")
)
answer = unrank(target)
print(answer)
3. Code walkthrough
total_words(state, rem)
Uses the exponential generating function
$$\prod_i \left(\sum_{t=0}^{c_i}\frac{x^t}{t!}\right)$$
to count all distinct strings obeying multiplicity limits.
Multiplying coefficient $x^k$ by $k!$ converts multinomial denominators into exact counts.
rank_word(word)
At each character:
- every smaller valid next letter contributes an entire lexicographic block;
- the current prefix itself contributes one additional earlier word.
This reproduces:
- $P(\text{aaaaaacdee})=10$
- $P(\text{euler})=115246685191495243$
exactly.
unrank(rank)
Greedily walks lexicographic blocks until the desired rank falls inside one.
This computes $W(p)$.
4. Final verification
The computed intermediate value is
$$P(\text{legionary}) +P(\text{calorimeters}) -P(\text{annihilate}) +P(\text{orchestrated}) -P(\text{fluttering}) = 451023621685297214.$$
Applying the inverse map gives:
$$W(451023621685297214)=\text{turnthestarson}.$$
This is a valid word constructible from the original multiset and has length $15$, satisfying all constraints.
Answer: turnthestarson