Integer Factorization

The Factorization Problem

Integer factorization asks for the prime decomposition of a positive integer. Given

$$ n>1, $$

we seek primes

$$ p_1,\ldots,p_r $$

and positive exponents

$$ e_1,\ldots,e_r $$

such that

$$ n=p_1^{e_1}\cdots p_r^{e_r}. $$

The fundamental theorem of arithmetic guarantees that this decomposition exists and is unique up to the order of the factors.

Computationally, however, finding the factors may be difficult. This distinction between existence and efficient discovery is one of the central themes of computational number theory.

Trial Division

The simplest factorization method is trial division. One tests whether $n$ is divisible by

$$ 2,3,5,7,\ldots $$

up to

$$ \sqrt{n}. $$

If a divisor is found, the integer splits. If no divisor is found, then $n$ is prime.

Trial division is useful for removing small factors, but it is ineffective for large integers. If $n$ has hundreds or thousands of bits, checking all possible divisors is infeasible.

In practice, trial division is usually used only as a preprocessing step.

Fermat Factorization

Fermat factorization is based on writing an odd integer as a difference of squares:

$$ n=x^2-y^2=(x-y)(x+y). $$

If $n$ has two factors close to each other, then such a representation may be found quickly.

For example, one searches for an integer $x$ near $\sqrt{n}$ such that

$$ x^2-n $$

is a square.

This method is efficient when the factors of $n$ are close together, but poor when they are far apart.

Pollard’s Rho Method

Pollard’s rho method is a probabilistic algorithm designed to find nontrivial factors.

The method constructs a pseudorandom sequence modulo $n$, such as

$$ x_{i+1}=x_i^2+c \pmod n. $$

If $p$ is an unknown prime factor of $n$, the sequence modulo $p$ eventually cycles. A collision modulo $p$ may reveal itself through

$$ \gcd(x_i-x_j,n). $$

If this greatest common divisor is neither $1$ nor $n$, it is a nontrivial factor.

Pollard’s rho is especially effective for finding relatively small prime factors of large integers.

Pollard’s $p-1$ Method

Pollard’s $p-1$ method exploits smoothness.

Suppose $p\mid n$, and suppose

$$ p-1 $$

factors completely into small primes. Then one can often find $p$ by computing

$$ \gcd(a^M-1,n), $$

where $M$ is divisible by many small prime powers.

The method works because Fermat’s theorem gives

$$ a^{p-1}\equiv1\pmod p. $$

If $p-1\mid M$, then

$$ a^M\equiv1\pmod p. $$

Thus $p$ divides $a^M-1$, and a greatest common divisor may expose it.

This method is powerful when a prime factor $p$ has smooth $p-1$, but unreliable otherwise.

Smooth Numbers

A positive integer is called $B$-smooth if all of its prime factors are at most $B$.

Smooth numbers play a major role in modern factorization.

Many algorithms try to construct congruences involving smooth numbers, because smooth integers can be factored easily over a fixed factor base.

The probability that a random integer near $x$ is $B$-smooth governs the efficiency of several major factorization methods.

Congruence of Squares

Many factorization algorithms are based on the following idea.

If one can find integers $x$ and $y$ such that

$$ x^2\equiv y^2\pmod n, $$

but

$$ x\not\equiv \pm y\pmod n, $$

then

$$ n\mid (x-y)(x+y). $$

Therefore

$$ \gcd(x-y,n) $$

may produce a nontrivial factor.

This is the central mechanism behind the quadratic sieve and the number field sieve.

Dixon’s Method

Dixon’s method searches for random integers $x$ such that

$$ x^2 \bmod n $$

is smooth over a chosen factor base.

After collecting enough smooth relations, linear algebra over

$$ \mathbb{F}_2 $$

finds a subset whose product is a square.

This yields a congruence of squares modulo $n$.

Dixon’s method is conceptually important because it shows how smoothness and linear algebra combine to produce factors.

Quadratic Sieve

The quadratic sieve improves Dixon’s method by selecting values near $\sqrt{n}$.

One considers

$$ Q(x)=x^2-n. $$

If $Q(x)$ is smooth, then

$$ x^2\equiv Q(x)\pmod n. $$

By collecting many smooth values of $Q(x)$, one obtains enough relations to build a congruence of squares.

The quadratic sieve was historically the fastest general-purpose factorization algorithm for large integers before the number field sieve became dominant.

It remains effective for integers of moderate size.

Number Field Sieve

The number field sieve is the fastest known classical general-purpose algorithm for factoring large integers.

Its central idea is to construct smooth relations simultaneously in two rings:

the ordinary integer ring,
a suitable algebraic number field.

These relations are combined through large sparse linear algebra to produce a congruence of squares modulo $n$.

The number field sieve has subexponential complexity and is the method used for the largest publicly known factorizations of RSA-type integers.

Its complexity is often written informally as

$$ L_n\left[1/3,\left(\frac{64}{9}\right)^{1/3}\right], $$

where

$$ L_n[\alpha,c] = \exp\left( (c+o(1))(\log n)^\alpha(\log\log n)^{1-\alpha} \right). $$

This notation measures subexponential running time between polynomial and exponential growth.

Elliptic Curve Method

The elliptic curve method, usually called ECM, is especially effective for finding medium-sized prime factors.

It generalizes Pollard’s $p-1$ method by replacing the multiplicative group modulo $p$ with the group of points on a random elliptic curve over $\mathbb{F}_p$.

The group order varies with the curve. By trying many curves, one increases the chance that the group order is smooth.

ECM is highly effective in practice for discovering factors that are not too large, even when the full integer is enormous.

Special-Purpose Algorithms

Some integers have special forms that allow faster factorization.

Examples include:

$$ a^n-b^n, $$

$$ 2^n-1, $$

and numbers arising from algebraic sequences.

Special structure may reveal algebraic factors before general-purpose methods are needed.

For instance,

$$ a^2-b^2=(a-b)(a+b). $$

Recognizing such identities can dramatically simplify a factorization problem.

Factorization and Cryptography

The security of RSA depends on the presumed difficulty of factoring large semiprimes

$$ N=pq, $$

where $p$ and $q$ are large secret primes.

If an attacker can factor $N$, then the private key can be recovered.

This makes integer factorization one of the most important computational problems in applied number theory.

The practical difficulty of factoring large RSA moduli rests on the absence of known efficient classical algorithms for general factorization.

Quantum Factorization

Quantum computation changes the landscape.

Shor’s algorithm factors integers in polynomial time on a sufficiently large fault-tolerant quantum computer.

This does not currently make RSA instantly obsolete in all deployed settings, because building such quantum machines at cryptographic scale remains a major engineering challenge. But theoretically, it shows that factorization is easy in the quantum model.

This is one reason post-quantum cryptography studies systems based on problems other than integer factorization and discrete logarithms.

Factoring Versus Primality Testing

It is important to distinguish factorization from primality testing.

Primality testing asks whether $n$ is prime.

Factorization asks for the complete prime decomposition of $n$.

The first problem is known to be solvable in deterministic polynomial time. The second has no known polynomial-time classical algorithm.

Thus an integer can be easy to recognize as composite while still being hard to factor.

Practical Factorization Strategy

A practical factorization system usually combines several methods:

trial division to remove small factors,
Pollard rho for small to medium factors,
ECM for medium factors,
quadratic sieve for moderate composite numbers,
number field sieve for large difficult composites.

The best method depends on the size of $n$, the size of its unknown factors, and any special algebraic structure.

Conceptual Importance

Integer factorization lies at the intersection of elementary arithmetic and deep computation.

The problem begins with the fundamental theorem of arithmetic, but its efficient solution involves:

modular arithmetic,
smooth numbers,
elliptic curves,
algebraic number fields,
sparse linear algebra,
analytic estimates,
computational complexity.

Factorization illustrates a recurring pattern in modern number theory: simple questions about integers can require powerful structures from many areas of mathematics.