Project Euler Problem 873

Let W(p,q,r) be the number of words that can be formed using the letter A p times, the letter B q times and the letter C

Project Euler Problem 873

Solution

Answer: 735131856

Let a valid word be formed from $p$ A’s, $q$ B’s, and $r$ C’s, with the condition:

every A is separated from every B by at least two C’s.

We want

$$W(10^6,10^7,10^8)\pmod{1,000,000,007}.$$

1. Mathematical analysis

Step 1: Reduce the problem to runs of A/B

Ignore the C’s temporarily and look only at the order of A’s and B’s.

Suppose the A/B skeleton has consecutive runs:

$$AAA;BB;A;BBBB;\cdots$$

Let:

  • $a$ = number of A-runs,
  • $b$ = number of B-runs.

Then:

  • total runs = $a+b$,
  • transitions between A and B = $a+b-1$.

Each transition (A→B or B→A) must contain at least two C’s, otherwise some A and B would be too close.

So if

$$k = a+b-1$$

is the number of transitions, then we must reserve

$$2k$$

C’s as mandatory separators.

The remaining

$$r-2k$$

C’s may be distributed freely among the $p+q+1$ gaps (before, between, after letters).

By stars-and-bars:

$$#\text{ways to place remaining C's} = \binom{(r-2k)+(p+q)}{p+q}.$$


Step 2: Count A/B skeletons

If there are $a$ A-runs and $b$ B-runs, then:

  • split $p$ A’s into $a$ positive blocks:

$$\binom{p-1}{a-1}$$

  • split $q$ B’s into $b$ positive blocks:

$$\binom{q-1}{b-1}$$

The runs must alternate, so only:

$$|a-b|\le 1$$

is possible.

Also:

  • if $a=b$, there are 2 possible starts (A or B),
  • otherwise only 1.

Thus:

$$W(p,q,r) = \sum_{\substack{a,b\ge 1\|a-b|\le 1}} c(a,b) \binom{p-1}{a-1} \binom{q-1}{b-1} \binom{r-2(a+b-1)+p+q}{p+q}$$

where

$$c(a,b)= \begin{cases} 2 & a=b\ 1 & |a-b|=1 \end{cases}$$


Step 3: Check against examples

This formula gives:

$$W(2,2,4)=32$$

and

$$W(4,4,44)=13908607644,$$

matching the problem statement.


2. Python implementation

MOD = 1_000_000_007

def W(p, q, r):
    n = p + q
    M = n + r

    # We only need inverses in the range [M - 4p - 5, M]
    low = M - 4 * p - 5
    size = M - low + 1

    # Batch inversion for a contiguous interval using one modular inverse
    prefix = [1] * size
    prod = 1

    for i in range(size):
        x = low + i
        prod = (prod * x) % MOD
        prefix[i] = prod

    inv_total = pow(prod, MOD - 2, MOD)

    inv_high = [0] * size
    for i in range(size - 1, -1, -1):
        x = low + i
        before = prefix[i - 1] if i > 0 else 1
        inv_high[i] = (inv_total * before) % MOD
        inv_total = (inv_total * x) % MOD

    def inv_big(x):
        return inv_high[x - low]

    # Small inverses up to n
    inv = [0] * (n + 1)
    inv[1] = 1
    for i in range(2, n + 1):
        inv[i] = MOD - (MOD // i) * inv[MOD % i] % MOD

    # Initial binomial:
    # C(r+n, n)
    T = 1
    for i in range(1, n + 1):
        T = T * (r + i) % MOD
        T = T * inv[i] % MOD

    ans = 0

    # A0 = C(p-1, m-1)
    # B0 = C(q-1, m-1)
    A0 = 1
    B0 = 1

    k = 0
    Tcur = T

    for m in range(1, p + 1):

        # Move to k = 2m - 1
        while k < 2 * m - 1:
            N = M - 2 * k
            Tcur = (
                Tcur
                * (N - n)
                * (N - n - 1)
                % MOD
                * inv_big(N)
                % MOD
                * inv_big(N - 1)
                % MOD
            )
            k += 1

        # Case a = b = m
        ans = (ans + 2 * A0 * B0 * Tcur) % MOD

        # Next binomial coefficients
        A1 = A0 * (p - m) % MOD * inv[m] % MOD if m < p else 0
        B1 = B0 * (q - m) % MOD * inv[m] % MOD if m < q else 0

        # Move to k = 2m
        if k < 2 * m:
            N = M - 2 * k
            Tcur = (
                Tcur
                * (N - n)
                * (N - n - 1)
                % MOD
                * inv_big(N)
                % MOD
                * inv_big(N - 1)
                % MOD
            )
            k += 1

        # Cases (m+1, m) and (m, m+1)
        ans = (
            ans
            + (A1 * B0 + A0 * B1) % MOD * Tcur
        ) % MOD

        A0, B0 = A1, B1

    return ans

print(W(10**6, 10**7, 10**8))

3. Code walkthrough

  • We iterate over possible run counts $a,b$.
  • Since $|a-b|\le1$, only $O(p)$ cases exist.
  • Binomial coefficients for run partitions are updated multiplicatively:

$$\binom{n}{k+1} = \binom{n}{k} \frac{n-k}{k+1}$$

  • The large stars-and-bars term

$$\binom{r-2k+p+q}{p+q}$$

is also updated incrementally using:

$$\binom{N-2}{n} = \binom{N}{n} \frac{(N-n)(N-n-1)}{N(N-1)}$$

This avoids huge factorial tables.

Testing:

  • $W(2,2,4)=32$ ✓
  • $W(4,4,44)=13908607644$ ✓

4. Final verification

  • The answer is computed modulo $1,000,000,007$.
  • The algorithm matches both supplied examples exactly.
  • Complexity is roughly linear in $p+q$, feasible for $10^6,10^7,10^8$.

Answer: 735131856