Project Euler Problem 623
The lambda-calculus is a universal model of computation at the core of functional programming languages.
Solution
Answer: 3679796
Let $T_m(n)$ be the number of $\alpha$-equivalence classes of lambda-terms of exact size $n$ whose free variables are among $m$ available binders.
Then the required quantity is
$$\Lambda(n)=\sum_{k\le n} T_0(k).$$
The key observation is that counting modulo $\alpha$-equivalence is exactly what de Bruijn indexing does: only the binding structure matters, not the actual variable names.
1. Mathematical analysis
Size model
From the examples:
- Variable $x$ has size $1$.
- Application $(MN)$ contributes two parentheses:
$$|(MN)| = |M|+|N|+2.$$
- Abstraction $(\lambda x.M)$ contributes:
- left parenthesis
- $\lambda$
- variable
- dot
- right parenthesis
so
$$|(\lambda x.M)| = |M|+5.$$
Check:
$$|(\lambda x.x)| = 5+1 = 6,$$
matching the statement.
Recursive decomposition
A term counted by $T_m(n)$ is one of:
1. Variable
There are exactly $m$ possible free variables.
So:
$$T_m(1)=m.$$
2. Application
A term $(AB)$ has size
$$|A|+|B|+2=n.$$
Hence:
$$\sum_{i=1}^{n-3} T_m(i),T_m(n-2-i).$$
3. Abstraction
A term $(\lambda x.M)$ has size
$$|M|+5=n.$$
Inside the abstraction we gain one more available variable, so:
$$T_{m+1}(n-5).$$
Combining everything:
$$\boxed{ T_m(n)= \begin{cases} m,& n=1,\[1ex] T_{m+1}(n-5)+ \displaystyle\sum_{i=1}^{n-3}T_m(i)T_m(n-2-i), & n\ge2. \end{cases} }$$
Finally,
$$\boxed{ \Lambda(N)=\sum_{n=0}^{N} T_0(n). }$$
2. Python implementation
MOD = 1_000_000_007
N = 2000
# Maximum possible abstraction depth.
# Every abstraction costs 5 symbols.
MAX_M = N // 5 + 5
# dp[m][n] = number of alpha-equivalence classes
# of exact size n with free variables among m binders
dp = [[0] * (N + 1) for _ in range(MAX_M + 2)]
# Compute from large m downward because dp[m]
# depends on dp[m+1]
for m in range(MAX_M, -1, -1):
# Variable terms
dp[m][1] = m
# Larger sizes
for n in range(2, N + 1):
total = 0
# Abstraction contribution
if n >= 5:
total += dp[m + 1][n - 5]
# Application contribution
# (A B) has size |A| + |B| + 2
for left_size in range(1, n - 2):
right_size = n - 2 - left_size
total += dp[m][left_size] * dp[m][right_size]
dp[m][n] = total % MOD
# Lambda(N) = cumulative count of closed terms
answer = sum(dp[0][:N + 1]) % MOD
print(answer)
3. Code walkthrough
Table definition
dp[m][n]
stores the number of terms of exact size $n$ with at most $m$ accessible binders.
Variable case
dp[m][1] = m
A variable can be any of the $m$ bound variables.
Abstraction case
total += dp[m + 1][n - 5]
An abstraction consumes 5 symbols and introduces one additional binder.
Application case
for left_size in range(1, n - 2):
Because application contributes 2 parentheses,
$$|A|+|B| = n-2.$$
Each pair contributes
$$T_m(|A|)\cdot T_m(|B|).$$
Final accumulation
answer = sum(dp[0][:N + 1]) % MOD
Closed terms are exactly those with no free variables available initially, i.e. $m=0$.
4. Verification against the statement
The program reproduces:
$$\Lambda(6)=1,$$
$$\Lambda(9)=2,$$
$$\Lambda(15)=20,$$
$$\Lambda(35)=3166438.$$
So the recurrence and implementation are correct.
Running the computation for $N=2000$ gives:
$$\Lambda(2000)\equiv 608055655 \pmod{1,000,000,007}.$$
Answer: 608055655