18.24 Testing Number-Theoretic Code
Number-theoretic code is compact, but its edge cases are dense.
18.24 Testing Number-Theoretic Code
Number-theoretic code is compact, but its edge cases are dense. A function may work for most random inputs and still fail on zero, one, negative values, non-coprime moduli, repeated prime factors, overflow boundaries, or composite numbers that imitate primes.
Testing these algorithms requires more than checking a few examples. Good tests combine known values, algebraic identities, brute-force oracles for small inputs, randomized comparison, and reconstruction checks.
Problem
Given implementations of number-theoretic algorithms, verify correctness across ordinary cases and boundary cases.
Examples include:
gcd(a,b)
mod_inverse(a,m)
is_prime(n)
factorize(n)
crt(remainders, moduli)
phi(n)
mobius(n)
The goal is to catch both mathematical mistakes and implementation mistakes.
Use Known Values First
Start with hand-checkable examples.
from math import gcd
def test_gcd_known_values():
assert gcd(48, 18) == 6
assert gcd(17, 13) == 1
assert gcd(0, 7) == 7
assert gcd(7, 0) == 7
Known values catch obvious errors and document expected behavior.
For Euler phi:
def test_phi_known_values():
assert phi(1) == 1
assert phi(2) == 1
assert phi(5) == 4
assert phi(8) == 4
assert phi(9) == 6
assert phi(36) == 12
For Möbius:
def test_mobius_known_values():
assert mobius(1) == 1
assert mobius(2) == -1
assert mobius(4) == 0
assert mobius(6) == 1
assert mobius(12) == 0
assert mobius(30) == -1
These tests are small but valuable. They anchor the semantics.
Test Boundary Values
Most bugs live near boundary values.
Use inputs such as:
0
1
2
m - 1
m
m + 1
negative values
prime values
prime powers
perfect squares
large composites
Example for primality:
def test_primality_boundaries():
assert not is_prime_64(-7)
assert not is_prime_64(0)
assert not is_prime_64(1)
assert is_prime_64(2)
assert is_prime_64(3)
assert not is_prime_64(4)
assert not is_prime_64(9)
assert not is_prime_64(25)
For modular inverse:
def test_inverse_boundaries():
assert mod_inverse(3, 11) == 4
assert mod_inverse(10, 17) == 12
try:
mod_inverse(2, 8)
assert False
except ValueError:
pass
Boundary tests force you to define the behavior instead of leaving it accidental.
Use Brute Force as an Oracle
For small inputs, simple brute force is often the best correctness oracle.
A production algorithm may be sophisticated, but a tiny brute-force implementation is easier to trust.
Example: primality.
def is_prime_bruteforce(n):
if n < 2:
return False
for d in range(2, n):
if n % d == 0:
return False
return True
Compare:
def test_prime_against_bruteforce():
for n in range(1, 1000):
assert is_prime_64(n) == is_prime_bruteforce(n)
Example: linear congruences.
def brute_linear_congruence(a, b, m):
return [x for x in range(m) if (a * x - b) % m == 0]
Compare:
def test_linear_congruence_against_brute_force():
for m in range(1, 50):
for a in range(-50, 51):
for b in range(-50, 51):
assert solve_linear_congruence(a, b, m) == brute_linear_congruence(a, b, m)
This style catches sign errors, normalization mistakes, and missing multiple-solution cases.
Test Algebraic Identities
Number theory gives you many invariants for free.
For GCD:
gcd(a,b) divides a and b
def test_gcd_divides_inputs():
for a in range(1, 200):
for b in range(1, 200):
g = gcd(a, b)
assert a % g == 0
assert b % g == 0
For Extended Euclid:
ax + by = gcd(a,b)
def test_extended_gcd_identity():
for a in range(-100, 101):
for b in range(-100, 101):
if a == 0 and b == 0:
continue
g, x, y = extended_gcd(a, b)
assert a * x + b * y == g
assert g == gcd(a, b)
For modular inverse:
a × a⁻¹ ≡ 1 (mod m)
def test_mod_inverse_identity():
for m in range(2, 100):
for a in range(1, m):
if gcd(a, m) == 1:
inv = mod_inverse(a, m)
assert (a * inv) % m == 1
Algebraic identities are excellent because they test behavior instead of a single expected answer.
Reconstruction Tests
For factorization, the best invariant is reconstruction.
def reconstruct(factors):
result = 1
for p in factors:
result *= p
return result
Test:
def test_factorization_reconstructs_input():
for n in range(2, 10000):
factors = factorize(n)
assert reconstruct(factors) == n
for p in factors:
assert is_prime_64(p)
For exponent-count factorization:
def reconstruct_from_powers(factors):
result = 1
for p, e in factors:
result *= p ** e
return result
Test:
def test_power_factorization_reconstructs_input():
for n in range(2, 10000):
factors = factorize_with_exponents(n)
assert reconstruct_from_powers(factors) == n
for p, e in factors:
assert is_prime_64(p)
assert e > 0
This catches missing final factors, wrong multiplicities, and accidental composite factors.
Test CRT by Substitution
A CRT solution must satisfy every original congruence.
def test_crt_substitution():
remainders = [2, 3, 2]
moduli = [3, 5, 7]
x, m = crt(remainders, moduli)
assert m == 105
for a, mod in zip(remainders, moduli):
assert x % mod == a
For randomized small cases:
def test_crt_random_small():
cases = [
([0, 1], [2, 3]),
([1, 2, 3], [5, 7, 11]),
([4, 5], [9, 10]),
]
for remainders, moduli in cases:
x, m = crt(remainders, moduli)
for a, mod in zip(remainders, moduli):
assert x % mod == a
For general CRT, also test no-solution cases:
def test_crt_general_no_solution():
assert crt_general([1, 4], [6, 10]) is None
Test Sieve Output
A prime sieve should agree with brute force on small ranges.
def test_sieve_against_bruteforce():
primes = set(sieve(1000))
for n in range(1, 1001):
assert (n in primes) == is_prime_bruteforce(n)
For SPF tables:
def test_spf_table():
spf = smallest_prime_factor(1000)
for n in range(2, 1001):
p = spf[n]
assert p >= 2
assert n % p == 0
assert is_prime_bruteforce(p)
This verifies both primality and divisibility of the stored smallest factor.
Test Modular Exponentiation Against Built-In pow
In Python, pow(base, exponent, modulus) is a reliable oracle.
def test_mod_pow_against_builtin():
for base in range(-50, 51):
for exponent in range(0, 100):
for modulus in range(1, 50):
assert mod_pow(base, exponent, modulus) == pow(base, exponent, modulus)
This catches negative-base handling, exponent-zero behavior, and modulus-one behavior.
Test Combinatorics with Identities
Binomial coefficients satisfy strong identities.
Symmetry:
C(n,k) = C(n,n-k)
Pascal identity:
C(n,k) = C(n-1,k-1) + C(n-1,k)
Implementation:
def test_comb_identities():
mod = 1_000_000_007
fact, inv_fact = factorial_and_inverse_tables(200, mod)
for n in range(0, 200):
for k in range(0, n + 1):
assert comb(n, k, fact, inv_fact, mod) == comb(n, n - k, fact, inv_fact, mod)
for n in range(2, 200):
for k in range(1, n):
left = comb(n, k, fact, inv_fact, mod)
right = (
comb(n - 1, k - 1, fact, inv_fact, mod)
+ comb(n - 1, k, fact, inv_fact, mod)
) % mod
assert left == right
These tests catch inverse-factorial errors and off-by-one mistakes.
Include Adversarial Values
Some numbers are designed to fool weak algorithms.
For primality testing, include Carmichael numbers:
def test_carmichael_numbers():
for n in [561, 1105, 1729, 2465, 2821, 6601]:
assert not is_prime_64(n)
For factorization, include:
prime powers
semiprimes
large primes
even numbers
perfect squares
Example:
def test_factorization_adversarial():
values = [
2**20,
3**12,
101 * 103,
1000003 * 1000033,
99991,
]
for n in values:
factors = factorize(n)
assert reconstruct(factors) == n
assert all(is_prime_64(p) for p in factors)
Adversarial tests prevent false confidence.
Randomized Tests
Randomized tests are useful when paired with invariants.
import random
def test_random_factorization():
for _ in range(1000):
n = random.randint(2, 10**9)
factors = factorize(n)
assert reconstruct(factors) == n
assert all(is_prime_64(p) for p in factors)
For modular inverses:
def test_random_inverses():
for _ in range(1000):
m = random.randint(2, 10**6)
a = random.randint(1, m - 1)
if gcd(a, m) == 1:
inv = mod_inverse(a, m)
assert (a * inv) % m == 1
Randomized tests expand coverage, but they should not replace known edge cases.
Reproducibility
Randomized algorithms need reproducible tests.
Set a seed:
random.seed(0)
For algorithms such as Pollard's Rho, avoid tests that require a specific factor. Accept any nontrivial factor.
def test_pollards_rho_factor():
n = 8051
d = pollards_rho_retry(n)
assert 1 < d < n
assert n % d == 0
This makes the test robust despite randomness.
Separate Mathematical Tests from Performance Tests
Correctness tests should be deterministic, small, and frequent.
Performance tests should be separate.
Correctness:
Run on every commit
Performance:
Run on demand or in benchmark suite
Example benchmark helper:
from time import perf_counter
def benchmark(fn, *args):
start = perf_counter()
result = fn(*args)
elapsed = perf_counter() - start
return result, elapsed
Do not put strict timing assertions in ordinary unit tests unless the environment is controlled.
Common Mistakes
Testing Only Happy Paths
Include no-solution cases, non-coprime cases, and boundary values.
Expecting a Specific Pollard's Rho Factor
Either factor is valid.
Ignoring Negative Values
Extended GCD, linear congruences, and modular normalization often fail on negatives.
Testing Against the Same Algorithm
A test oracle should be independent when possible.
Skipping Reconstruction
For factorization, reconstruction is the simplest and strongest check.
Minimal Test Suite Checklist
A practical test suite should include:
known values
boundary cases
brute-force comparison for small inputs
algebraic identities
randomized invariant checks
adversarial numbers
reconstruction checks
no-solution cases
For number theory, these categories matter more than the raw number of tests.
Takeaway
Testing number-theoretic code requires exploiting the structure of the mathematics. Known values catch obvious mistakes, brute-force oracles validate small inputs, algebraic identities check behavior, reconstruction verifies factorization, and adversarial examples expose weak primality tests or edge-case failures. Randomized tests add breadth, but reproducibility and invariant-based assertions are essential. A reliable implementation is not just one that works on sample inputs. It is one that survives boundary values, invalid cases, and mathematically hostile examples.