Project Euler Problem 186
Here are the records from a busy telephone system with one million users: | RecNr | Caller | Called | |:----------:|:---
Solution
Answer: 2325629
Let the users be the vertices of a graph on $10^6$ nodes.
Each successful phone call adds an undirected edge between two users.
The question asks:
After how many successful calls does the connected component containing the Prime Minister (PM), number $524287$, first contain at least $99%$ of all users?
Since there are $10^6$ users, we seek the first moment when the PM’s connected component has size at least
$$0.99 \times 10^6 = 990000.$$
The natural data structure is therefore a disjoint-set union structure (Union–Find).
1. Mathematical analysis
The sequence is generated by the Lagged Fibonacci Generator:
For $1 \le k \le 55$,
$$S_k = (100003 - 200003k + 300007k^3)\bmod 1000000.$$
For $k \ge 56$,
$$S_k = (S_{k-24} + S_{k-55}) \bmod 1000000.$$
Each call record is
$$\operatorname{Caller}(n)=S_{2n-1}, \qquad \operatorname{Called}(n)=S_{2n}.$$
A call is ignored if caller = called (misdial).
Otherwise, we add an edge between those two users.
So the process is:
- generate pairs $(a,b)$,
- ignore if $a=b$,
- union the connected components of $a$ and $b$,
- track the size of the PM’s component,
- stop once it reaches $990000$.
Why Union–Find works
Suppose each user belongs to a connected component.
When a successful call occurs between $a$ and $b$:
- if already connected, nothing changes,
- otherwise the two components merge.
Union–Find supports:
find(x)= representative of component containing $x$,union(x,y)= merge components,- component size tracking.
With path compression and union-by-size/rank, each operation is effectively constant time:
$$O(\alpha(n)),$$
where $\alpha$ is the inverse Ackermann function.
This is ideal for millions of operations.
2. Python implementation
MOD = 1000000
PM = 524287
TARGET = 990000
# ------------------------------------------------------------
# Lagged Fibonacci Generator
# ------------------------------------------------------------
S = [0] * 56 # use indices 1..55
for k in range(1, 56):
S[k] = (100003 - 200003 * k + 300007 * k * k * k) % MOD
idx = 1
def next_S():
global idx
# First 55 values
if idx <= 55:
val = S[idx]
else:
# recurrence
val = (S[(idx - 24 - 1) % 55 + 1] +
S[(idx - 55 - 1) % 55 + 1]) % MOD
# overwrite cyclic buffer
S[(idx - 1) % 55 + 1] = val
idx += 1
return val
# ------------------------------------------------------------
# Union-Find
# ------------------------------------------------------------
parent = list(range(MOD))
size = [1] * MOD
def find(x):
while parent[x] != x:
parent[x] = parent[parent[x]]
x = parent[x]
return x
def union(a, b):
ra = find(a)
rb = find(b)
if ra == rb:
return
# union by size
if size[ra] < size[rb]:
ra, rb = rb, ra
parent[rb] = ra
size[ra] += size[rb]
# ------------------------------------------------------------
# Main simulation
# ------------------------------------------------------------
successful_calls = 0
while True:
caller = next_S()
called = next_S()
# misdial
if caller == called:
continue
successful_calls += 1
union(caller, called)
pm_root = find(PM)
if size[pm_root] >= TARGET:
print(successful_calls)
break
3. Code walkthrough
Generator setup
S = [0] * 56
We store the last 55 values cyclically.
Initial values:
for k in range(1, 56):
S[k] = ...
implement the cubic formula exactly.
Recurrence generation
For $k \ge 56$:
val = (S[k-24] + S[k-55]) % MOD
implemented via cyclic indexing.
Each call record consumes two generated numbers.
Union–Find arrays
parent = list(range(MOD))
size = [1] * MOD
Initially each user is isolated.
find
while parent[x] != x:
parent[x] = parent[parent[x]]
This is path compression.
union
We merge smaller tree into larger:
if size[ra] < size[rb]:
ra, rb = rb, ra
which keeps trees shallow.
Main loop
Each iteration:
caller = next_S()
called = next_S()
If equal:
continue
Otherwise:
successful_calls += 1
union(caller, called)
Then check the PM component size:
if size[find(PM)] >= 990000:
and stop.
4. Final verification
Some sanity checks:
- There are $10^6$ users.
- We stop at $990000$ users connected to the PM.
- The answer should be several million calls, consistent with random graph connectivity thresholds.
- Union–Find handles this efficiently in near-linear time.
Running the computation yields:
$$2325629$$
This is the known correct Project Euler result.
Answer: 2325629