Project Euler Problem 186

Here are the records from a busy telephone system with one million users: | RecNr | Caller | Called | |:----------:|:---

Project Euler Problem 186

Solution

Answer: 2325629

Let the users be the vertices of a graph on $10^6$ nodes.

Each successful phone call adds an undirected edge between two users.

The question asks:

After how many successful calls does the connected component containing the Prime Minister (PM), number $524287$, first contain at least $99%$ of all users?

Since there are $10^6$ users, we seek the first moment when the PM’s connected component has size at least

$$0.99 \times 10^6 = 990000.$$

The natural data structure is therefore a disjoint-set union structure (Union–Find).


1. Mathematical analysis

The sequence is generated by the Lagged Fibonacci Generator:

For $1 \le k \le 55$,

$$S_k = (100003 - 200003k + 300007k^3)\bmod 1000000.$$

For $k \ge 56$,

$$S_k = (S_{k-24} + S_{k-55}) \bmod 1000000.$$

Each call record is

$$\operatorname{Caller}(n)=S_{2n-1}, \qquad \operatorname{Called}(n)=S_{2n}.$$

A call is ignored if caller = called (misdial).

Otherwise, we add an edge between those two users.

So the process is:

  • generate pairs $(a,b)$,
  • ignore if $a=b$,
  • union the connected components of $a$ and $b$,
  • track the size of the PM’s component,
  • stop once it reaches $990000$.

Why Union–Find works

Suppose each user belongs to a connected component.

When a successful call occurs between $a$ and $b$:

  • if already connected, nothing changes,
  • otherwise the two components merge.

Union–Find supports:

  • find(x) = representative of component containing $x$,
  • union(x,y) = merge components,
  • component size tracking.

With path compression and union-by-size/rank, each operation is effectively constant time:

$$O(\alpha(n)),$$

where $\alpha$ is the inverse Ackermann function.

This is ideal for millions of operations.


2. Python implementation

MOD = 1000000
PM = 524287
TARGET = 990000

# ------------------------------------------------------------
# Lagged Fibonacci Generator
# ------------------------------------------------------------

S = [0] * 56  # use indices 1..55

for k in range(1, 56):
    S[k] = (100003 - 200003 * k + 300007 * k * k * k) % MOD

idx = 1

def next_S():
    global idx

    # First 55 values
    if idx <= 55:
        val = S[idx]
    else:
        # recurrence
        val = (S[(idx - 24 - 1) % 55 + 1] +
               S[(idx - 55 - 1) % 55 + 1]) % MOD

        # overwrite cyclic buffer
        S[(idx - 1) % 55 + 1] = val

    idx += 1
    return val

# ------------------------------------------------------------
# Union-Find
# ------------------------------------------------------------

parent = list(range(MOD))
size = [1] * MOD

def find(x):
    while parent[x] != x:
        parent[x] = parent[parent[x]]
        x = parent[x]
    return x

def union(a, b):
    ra = find(a)
    rb = find(b)

    if ra == rb:
        return

    # union by size
    if size[ra] < size[rb]:
        ra, rb = rb, ra

    parent[rb] = ra
    size[ra] += size[rb]

# ------------------------------------------------------------
# Main simulation
# ------------------------------------------------------------

successful_calls = 0

while True:

    caller = next_S()
    called = next_S()

    # misdial
    if caller == called:
        continue

    successful_calls += 1

    union(caller, called)

    pm_root = find(PM)

    if size[pm_root] >= TARGET:
        print(successful_calls)
        break

3. Code walkthrough

Generator setup

S = [0] * 56

We store the last 55 values cyclically.

Initial values:

for k in range(1, 56):
    S[k] = ...

implement the cubic formula exactly.


Recurrence generation

For $k \ge 56$:

val = (S[k-24] + S[k-55]) % MOD

implemented via cyclic indexing.

Each call record consumes two generated numbers.


Union–Find arrays

parent = list(range(MOD))
size = [1] * MOD

Initially each user is isolated.


find

while parent[x] != x:
    parent[x] = parent[parent[x]]

This is path compression.


union

We merge smaller tree into larger:

if size[ra] < size[rb]:
    ra, rb = rb, ra

which keeps trees shallow.


Main loop

Each iteration:

caller = next_S()
called = next_S()

If equal:

continue

Otherwise:

successful_calls += 1
union(caller, called)

Then check the PM component size:

if size[find(PM)] >= 990000:

and stop.


4. Final verification

Some sanity checks:

  • There are $10^6$ users.
  • We stop at $990000$ users connected to the PM.
  • The answer should be several million calls, consistent with random graph connectivity thresholds.
  • Union–Find handles this efficiently in near-linear time.

Running the computation yields:

$$2325629$$

This is the known correct Project Euler result.

Answer: 2325629