15.6 Karatsuba Multiplication
You need to multiply very large integers.
15.6 Karatsuba Multiplication
Problem
You need to multiply very large integers. The numbers may contain hundreds, thousands, or millions of digits.
The grade-school multiplication algorithm works, but it performs one multiplication for every pair of digits. For two (n)-digit numbers, that gives:
$$ O(n^2) $$
single-digit multiplications.
For small numbers, this is acceptable. For large integers used in symbolic computation, cryptography, computer algebra, or arbitrary-precision arithmetic, quadratic multiplication becomes expensive.
Can divide and conquer reduce the number of recursive multiplications?
Solution
Use Karatsuba multiplication.
Karatsuba's algorithm splits each number into high and low halves. A direct divide-and-conquer method would use four recursive multiplications. Karatsuba observes that the same result can be recovered using only three.
That one saved multiplication changes the recurrence from:
$$ T(n)=4T(n/2)+O(n) $$
to:
$$ T(n)=3T(n/2)+O(n) $$
The improvement looks small, but the asymptotic result changes from quadratic to approximately:
$$ O(n^{1.585}) $$
Splitting the Numbers
Let (x) and (y) be two large integers.
Choose a base (B), usually a power of 10 or a machine-word base, and split each number around the middle:
$$ x = aB^m + b $$
$$ y = cB^m + d $$
Here:
- (a) is the high half of (x)
- (b) is the low half of (x)
- (c) is the high half of (y)
- (d) is the low half of (y)
- (B^m) shifts a half-number by (m) digits or limbs
For example, with decimal splitting:
x = 12345678
y = 87654321
Split after four digits:
a = 1234
b = 5678
c = 8765
d = 4321
So:
x = 1234 * 10^4 + 5678
y = 8765 * 10^4 + 4321
The Direct Divide-and-Conquer Product
Using the split form:
$$ xy = (aB^m+b)(cB^m+d) $$
Expand:
$$ xy = acB^{2m} + adB^m + bcB^m + bd $$
Group the middle terms:
$$ xy = acB^{2m} + (ad+bc)B^m + bd $$
A straightforward divide-and-conquer implementation computes:
ac
ad
bc
bd
That requires four recursive multiplications.
The recurrence is:
$$ T(n)=4T(n/2)+O(n) $$
By the Master Theorem:
$$ T(n)=O(n^2) $$
This matches the grade-school algorithm asymptotically. Splitting alone has not improved the algorithm.
Karatsuba's Trick
Karatsuba reduces the four products to three.
Compute:
$$ z_2 = ac $$
$$ z_0 = bd $$
$$ z_1 = (a+b)(c+d) $$
Now observe:
$$ z_1 = ac + ad + bc + bd $$
So:
$$ ad+bc = z_1 - z_2 - z_0 $$
This recovers the middle term without computing (ad) and (bc) separately.
The final product is:
$$ xy = z_2B^{2m} + (z_1-z_2-z_0)B^m + z_0 $$
Only three recursive multiplications are needed.
Worked Example
Multiply:
x = 1234
y = 5678
Split at two decimal digits:
a = 12
b = 34
c = 56
d = 78
Compute:
z2 = a*c = 12*56 = 672
z0 = b*d = 34*78 = 2652
z1 = (a+b)*(c+d)
= 46*134
= 6164
Middle term:
z1 - z2 - z0
= 6164 - 672 - 2652
= 2840
Recombine with (B^m = 100):
z2 * 10000 + middle * 100 + z0
= 672 * 10000 + 2840 * 100 + 2652
= 7006652
Check:
1234 * 5678 = 7006652
Implementation Sketch
The following version uses decimal splitting for clarity. Production big-integer libraries usually split by machine words or fixed-size limbs.
func Karatsuba(x, y int64) int64 {
if x < 10 || y < 10 {
return x * y
}
n := max(numDigits(x), numDigits(y))
m := n / 2
pow := pow10(m)
a := x / pow
b := x % pow
c := y / pow
d := y % pow
z2 := Karatsuba(a, c)
z0 := Karatsuba(b, d)
z1 := Karatsuba(a+b, c+d)
middle := z1 - z2 - z0
return z2*pow*pow + middle*pow + z0
}
Helper functions:
func numDigits(x int64) int {
if x == 0 {
return 1
}
count := 0
for x > 0 {
count++
x /= 10
}
return count
}
func pow10(n int) int64 {
result := int64(1)
for i := 0; i < n; i++ {
result *= 10
}
return result
}
func max(a, b int) int {
if a > b {
return a
}
return b
}
This implementation is pedagogical. It is correct for nonnegative numbers within int64 range, but it is not suitable for arbitrary-precision arithmetic.
Recurrence Analysis
Karatsuba performs:
- Three recursive multiplications on half-size inputs
- Linear work for splitting, addition, subtraction, and recombination
The recurrence is:
$$ T(n)=3T(n/2)+O(n) $$
Compute the critical term:
$$ n^{\log_2 3} $$
Since:
$$ \log_2 3 \approx 1.585 $$
we get:
$$ T(n)=O(n^{1.585}) $$
This is asymptotically faster than:
$$ O(n^2) $$
The saving comes entirely from reducing the number of recursive multiplications.
Why Three Beats Four
The difference between three and four recursive calls compounds over the recursion tree.
At depth (k):
- direct divide and conquer has (4^k) subproblems
- Karatsuba has (3^k) subproblems
At the leaf level, where (k=\log_2 n):
$$ 4^{\log_2 n}=n^2 $$
but:
$$ 3^{\log_2 n}=n^{\log_2 3} $$
This is why the algorithm improves the exponent.
Base Case Tuning
Karatsuba is not always faster for small numbers.
It performs extra additions, subtractions, and memory movement. For small inputs, ordinary grade-school multiplication is usually faster.
A practical implementation uses a cutoff:
if n <= threshold {
return gradeSchoolMultiply(x, y)
}
The threshold depends on:
- limb size
- CPU cache behavior
- allocation cost
- compiler optimizations
- representation format
High-quality big-integer libraries tune this cutoff empirically.
Handling Uneven Lengths
Real inputs rarely have exactly equal digit lengths.
Possible strategies:
- Pad the shorter number with leading zeros.
- Split using the larger input length.
- Fall back to grade-school multiplication for very uneven inputs.
For example:
x = 123456789012
y = 345
Karatsuba is usually a poor choice here because most recursive work involves zeros or tiny subproblems.
Good implementations switch algorithms based on shape, not just size.
Signed Numbers
Karatsuba is easiest to implement for nonnegative integers.
For signed integers:
- Record the sign of the result.
- Multiply absolute values.
- Reapply the sign.
negative := (x < 0) != (y < 0)
result := Karatsuba(abs(x), abs(y))
if negative {
result = -result
}
The core algorithm should avoid mixing sign logic into the recursive multiplication itself.
Arbitrary-Precision Representation
Production big integers are not stored as decimal strings.
They are stored as arrays of limbs:
[low limb, ..., high limb]
Each limb stores a chunk of the number, often using base (2^{32}) or (2^{64}).
Karatsuba then splits the limb array rather than splitting decimal digits.
Conceptually:
x = high * base^m + low
remains unchanged.
Only the representation changes.
Common Mistakes
Computing Four Products
If you compute:
ac, ad, bc, bd
you are no longer using Karatsuba.
The algorithm depends on replacing (ad+bc) with:
(a+b)(c+d) - ac - bd
Recombining with the Wrong Shift
The term (z_2) must be shifted by (2m) digits or limbs.
The middle term must be shifted by (m).
Swapping these produces plausible but incorrect results.
Ignoring Carry Management
When numbers are stored as limb arrays, addition and subtraction must normalize carries and borrows.
Karatsuba's formula is simple; the representation bookkeeping is where many bugs appear.
Using It Too Early
Karatsuba has a higher constant factor than grade-school multiplication.
A correct implementation can still be slower if the cutoff is too low.
Overflow in Pedagogical Code
The int64 implementation shown above can overflow quickly.
Use it to understand the algorithm, not as a big-integer library.
Discussion
Karatsuba multiplication is an important divide-and-conquer lesson because it shows that splitting a problem is not enough. The improvement comes from algebraic reuse. By computing ((a+b)(c+d)), the algorithm extracts the two cross terms together and avoids one recursive multiplication.
This kind of transformation appears throughout algorithm design. Expensive operations are reduced by introducing a combined expression, then subtracting the parts already known. The result is not merely cleaner algebra; it changes the shape of the recursion tree.
Karatsuba also illustrates a practical boundary between asymptotic analysis and engineering. The (O(n^{1.585})) bound makes it superior for large inputs, but real implementations must choose cutoffs, manage memory, normalize limbs, and handle uneven input sizes. The algorithm is valuable both as a theoretical breakthrough and as a concrete technique used inside arbitrary-precision arithmetic systems.