18.20 Pollard's Rho Factorization
Primality testing tells us whether a number is prime.
18.20 Pollard's Rho Factorization
Primality testing tells us whether a number is prime. Factorization asks a harder question:
If a number is composite, what are its prime factors?
For small integers, trial division is sufficient. For large integers, trial division becomes impractical because it requires checking divisors up to approximately:
√n
Pollard's Rho is one of the most important practical factorization algorithms. It is simple to implement, surprisingly effective, and often finds nontrivial factors of large composite numbers in a fraction of the time required by trial division.
The algorithm combines modular arithmetic, pseudorandom sequences, and GCD computations to discover factors without explicitly testing divisibility.
Problem
Given a composite integer:
n
find a nontrivial factor:
1 < d < n
Example:
n = 8051
Factorization:
8051 = 83 × 97
Pollard's Rho should discover either:
83
or:
97
without testing every possible divisor.
The Core Idea
Suppose:
p | n
for some unknown prime factor p.
If we generate a sequence modulo n:
x₀, x₁, x₂, ...
then the same sequence viewed modulo p must eventually repeat because there are only p possible residues.
Eventually:
xᵢ ≡ xⱼ (mod p)
for some:
i ≠ j
This implies:
p | (xᵢ - xⱼ)
Therefore:
gcd(|xᵢ - xⱼ|, n)
may reveal the hidden factor p.
The challenge is finding such pairs efficiently.
Pseudorandom Sequence
Pollard's Rho typically uses:
f(x) = x² + c mod n
where:
c ≠ 0, -2
Example:
x₀ = 2
and:
f(x)=x²+1 mod n
generate:
x₁ = 5
x₂ = 26
x₃ = 677
...
The sequence behaves somewhat randomly while remaining deterministic.
Why the Name "Rho"?
When the sequence is drawn as a functional graph, it typically resembles the Greek letter:
ρ
A path enters a cycle:
•
\\n •
\\n •
\\n •──•
|
•
|
•
|
•
The tail and loop form a rho-shaped structure.
The algorithm exploits this cyclic behavior.
Cycle Detection
We need a way to detect when two sequence values become equal modulo a hidden factor.
Pollard's Rho uses Floyd's cycle-finding algorithm.
Maintain:
tortoise
hare
The tortoise advances one step:
x ← f(x)
The hare advances two steps:
y ← f(f(y))
At each stage compute:
gcd(|x-y|, n)
If this GCD becomes a nontrivial divisor, we have found a factor.
Floyd's Cycle Algorithm
Recall the classic cycle-detection method:
x = f(x)
y = f(f(y))
If a cycle exists:
x == y
eventually occurs.
Pollard's Rho replaces exact equality with:
gcd(|x-y|, n)
which reveals factors before equality modulo n occurs.
Basic Implementation
from math import gcd
def pollards_rho(n):
if n % 2 == 0:
return 2
def f(x):
return (x * x + 1) % n
x = 2
y = 2
d = 1
while d == 1:
x = f(x)
y = f(f(y))
d = gcd(abs(x - y), n)
if d == n:
return None
return d
Example:
print(pollards_rho(8051))
Possible output:
97
or:
83
Both are valid.
Example Walkthrough
Factor:
8051
Start:
x = 2
y = 2
Function:
f(x)=x²+1 mod 8051
Iterations:
x=5
y=26
Compute:
gcd(|5-26|,8051)
=
gcd(21,8051)
=
1
Continue.
Eventually:
gcd(|x-y|,8051)
=
97
A nontrivial factor appears.
The factor emerges automatically through the modular collisions.
Why It Works
Suppose:
n = pq
with unknown prime factors.
The sequence modulo p enters a cycle.
The sequence modulo q enters a different cycle.
At some point:
x ≡ y (mod p)
but:
x ≠ y (mod q)
Then:
p | (x-y)
but:
q ∤ (x-y)
Therefore:
gcd(x-y,n)=p
The hidden factor is exposed.
Handling Failure
Sometimes:
gcd(|x-y|,n)=n
This means the algorithm failed to isolate a factor.
The usual solution is restarting with different parameters.
Example:
f(x)=x²+3
or:
f(x)=x²+5
or a different starting value.
Most failures disappear after a few retries.
Improved Implementation
import random
from math import gcd
def pollards_rho_retry(n):
if n % 2 == 0:
return 2
while True:
c = random.randrange(1, n)
x = random.randrange(2, n)
y = x
d = 1
while d == 1:
x = (x * x + c) % n
y = (y * y + c) % n
y = (y * y + c) % n
d = gcd(abs(x - y), n)
if d != n:
return d
This version is far more robust.
Complete Factorization
Pollard's Rho finds one factor.
To obtain a full prime factorization:
- Test whether
nis prime. - If prime, stop.
- Otherwise find a factor
d. - Recursively factor:
d
and:
n/d
Implementation:
def factor(n, result):
if n == 1:
return
if is_prime_64(n):
result.append(n)
return
d = pollards_rho_retry(n)
factor(d, result)
factor(n // d, result)
Wrapper:
def factorize(n):
result = []
factor(n, result)
result.sort()
return result
Example:
print(factorize(8051))
Output:
[83, 97]
Complexity
The expected running time for finding a factor p is approximately:
O(√p)
where:
p
is the smallest prime factor.
This is dramatically better than trial division:
O(√n)
For numbers with relatively small factors, Pollard's Rho is extremely fast.
Comparison with Trial Division
Consider:
n = 1000003 × 1000033
Trial division requires testing divisors up to roughly:
10^6
Pollard's Rho typically discovers a factor after roughly:
10^3
to:
10^4
operations.
The difference becomes enormous as numbers grow.
Brent's Improvement
Richard Brent proposed a refinement that reduces GCD computations.
Instead of checking every iteration:
gcd(|x-y|, n)
the algorithm accumulates products and performs fewer GCD calls.
Brent's version is often faster in practice and is commonly used in competitive programming libraries and factorization tools.
The underlying idea remains identical.
Pollard's Rho and Miller-Rabin
Modern integer factorization routines typically combine:
Miller-Rabin
for primality testing and:
Pollard's Rho
for factor discovery.
Workflow:
Input n
↓
Miller-Rabin
↓
Prime? → Done
↓
Pollard's Rho
↓
Split n
↓
Recurse
This combination efficiently factors 64-bit integers.
Applications
Integer Factorization
The primary use case.
Cryptanalysis
Factoring weak RSA moduli.
Number-Theoretic Algorithms
Many arithmetic functions require prime factorization.
Competitive Programming
Fast factorization of 64-bit integers.
Mathematical Software
Computer algebra systems frequently use Pollard's Rho as a core component.
Common Mistakes
Forgetting Prime Testing
Pollard's Rho should not be applied recursively to primes.
Use Miller-Rabin first.
Returning n
If:
d = n
the attempt failed.
Restart with different parameters.
Using Fixed Parameters
Some inputs perform poorly with:
f(x)=x²+1
Always support retries.
Ignoring Small Factors
Check:
2
3
5
and other small primes first.
This removes easy cases immediately.
Assuming Deterministic Runtime
Pollard's Rho is randomized.
Runtime varies between executions.
Testing
Basic cases:
def test_pollards_rho():
d = pollards_rho_retry(8051)
assert 8051 % d == 0
assert d not in (1, 8051)
Full factorization:
def test_factorization():
assert factorize(8051) == [83, 97]
assert factorize(60) == [
2,
2,
3,
5,
]
Prime inputs:
def test_prime_input():
assert factorize(101) == [101]
Random verification:
def reconstruct(factors):
result = 1
for x in factors:
result *= x
return result
Verify:
reconstruct(factorize(n)) == n
for a large collection of test values.
Takeaway
Pollard's Rho is a practical randomized factorization algorithm that exploits modular cycles and GCD computations to discover nontrivial factors. By generating a pseudorandom sequence and detecting collisions modulo hidden prime factors, it often finds factors in roughly O(√p) time, where p is the smallest prime factor. Combined with Miller-Rabin primality testing, Pollard's Rho forms the foundation of many modern integer-factorization routines and is the standard approach for factoring 64-bit integers efficiently.