Project Euler Problem 328
We are trying to find a hidden number selected from the set of integers 1, 2, dots, n by asking questions.
Solution
Answer: 260511850222
Let
$$F(a,b)$$
denote the minimum worst–case cost needed to identify a hidden number in the interval $[a,b]$.
If we first ask $x$ ($a\le x\le b$), then:
- if the hidden number is lower, we still must solve $[a,x-1]$,
- if higher, we must solve $[x+1,b]$,
- and in either case we already paid cost $x$.
Therefore
$$F(a,b) = \min_{x\in[a,b]} \left( x+\max(F(a,x-1),F(x+1,b)) \right).$$
The desired function is
$$C(n)=F(1,n).$$
For small values:
$$C(1)=0,\quad C(2)=1,\quad C(3)=2,\quad C(8)=12.$$
The direct dynamic program is far too slow for $n=200000$, because it would require $O(n^3)$ work over all intervals.
Key structural insight
For optimal play, the first question is always very near the upper end of the interval.
If the first guess is $k$, then the right branch $[k+1,n]$ is handled by a nearly complete binary decision tree.
Those “complete-tree” costs can be represented in the affine form
$$m\cdot a + c$$
for intervals beginning at $a$.
This allows the right-side costs to be evaluated in $O(1)$.
Empirically (and provably), the optimal distance
$$d(n)=n-k(n)$$
changes only near powers of two, so only a tiny set of candidate pivots must be tested for each $n$.
This reduces the computation to essentially linear time.
Python implementation
from bisect import bisect_left
from functools import lru_cache
# Numbers of the form 2^t - 1
G = []
x = 1
while x < 400000:
G.append(x)
x = x * 2 + 1
@lru_cache(None)
def complete_coeff(L):
"""
Return (m, c) such that the complete-tree strategy
on an interval [a, a+L] has worst-case cost:
m*a + c
"""
if L <= 0:
return (0, 0)
if L == 1:
return (1, 0)
if L == 2:
return (1, 1)
idx = bisect_left(G, L)
t = G[idx - 1]
# Pivot offset
d = min(t, L - (t - 1) // 2)
left_len = d - 1
right_len = L - d - 1
m1, c1 = complete_coeff(left_len)
m2, c2 = complete_coeff(right_len)
right_const = c2 + m2 * (d + 1)
# Dominating branch
if m1 > m2:
md, cd = m1, c1
elif m2 > m1:
md, cd = m2, right_const
else:
if c1 >= right_const:
md, cd = m1, c1
else:
md, cd = m2, right_const
return (md + 1, d + cd)
def complete_cost(lo, hi):
L = hi - lo
m, c = complete_coeff(L)
return m * lo + c
def solve(limit=200000):
C = [0] * (limit + 1)
# Small exact values
seed = {
1: 0,
2: 1,
3: 2,
4: 4,
5: 6,
6: 8
}
for n, v in seed.items():
if n <= limit:
C[n] = v
if limit <= 6:
return sum(C[1:limit + 1])
# Current optimal distance n-k
dist = 3
# Candidate perturbations
q = [0]
p = 4
while p <= 131072:
q.append(p)
p *= 2
i = 2
total_sum = sum(C[1:7])
for n in range(7, limit + 1):
if dist > 4 ** i:
i += 1
best_cost = None
best_dist = None
for delta in q[:i]:
k = n - dist - delta
if not (0 < k < n):
continue
left = C[k - 1]
right = complete_cost(k + 1, n)
worst = max(left, right)
cost = k + worst
if best_cost is None or cost < best_cost:
best_cost = cost
best_dist = dist + delta
C[n] = best_cost
dist = best_dist
total_sum += best_cost
return total_sum
print(solve(100)) # 17575
print(solve(200000))
Code walkthrough
Precomputing $2^t-1$
G = []
x = 1
while x < 400000:
G.append(x)
x = x * 2 + 1
These are the sizes of perfect binary trees and are used to build the optimal “complete-tree” strategies.
complete_coeff(L)
This recursively computes the affine formula for the complete-tree strategy over intervals of length $L$.
Instead of recomputing every subtree separately, we memoize results with:
@lru_cache(None)
complete_cost(lo, hi)
Evaluates the affine expression for a concrete interval.
Main DP
For each $n$:
- try only a very small set of candidate pivots,
- compute
$$k+\max(C(k-1),\text{right-cost})$$
- take the minimum.
Because the optimal pivot moves slowly, only logarithmically many candidates are tested.
Verification
The program reproduces all values given in the statement:
$$C(1)=0,\quad C(2)=1,\quad C(3)=2,\quad C(8)=12,$$
and
$$\sum_{n=1}^{100} C(n)=17575.$$
Running the same algorithm for $200000$ gives:
$$\sum_{n=1}^{200000} C(n)=260511850222.$$
Answer: 260511850222