15.6 Karatsuba Multiplication

Problem

You need to multiply very large integers. The numbers may contain hundreds, thousands, or millions of digits.

The grade-school multiplication algorithm works, but it performs one multiplication for every pair of digits. For two (n)-digit numbers, that gives:

$$ O(n^2) $$

single-digit multiplications.

For small numbers, this is acceptable. For large integers used in symbolic computation, cryptography, computer algebra, or arbitrary-precision arithmetic, quadratic multiplication becomes expensive.

Can divide and conquer reduce the number of recursive multiplications?

Solution

Use Karatsuba multiplication.

Karatsuba's algorithm splits each number into high and low halves. A direct divide-and-conquer method would use four recursive multiplications. Karatsuba observes that the same result can be recovered using only three.

That one saved multiplication changes the recurrence from:

$$ T(n)=4T(n/2)+O(n) $$

to:

$$ T(n)=3T(n/2)+O(n) $$

The improvement looks small, but the asymptotic result changes from quadratic to approximately:

$$ O(n^{1.585}) $$

Splitting the Numbers

Let (x) and (y) be two large integers.

Choose a base (B), usually a power of 10 or a machine-word base, and split each number around the middle:

$$ x = aB^m + b $$

$$ y = cB^m + d $$

Here:

(a) is the high half of (x)
(b) is the low half of (x)
(c) is the high half of (y)
(d) is the low half of (y)
(B^m) shifts a half-number by (m) digits or limbs

For example, with decimal splitting:

x = 12345678
y = 87654321

Split after four digits:

a = 1234
b = 5678

c = 8765
d = 4321

So:

x = 1234 * 10^4 + 5678
y = 8765 * 10^4 + 4321

The Direct Divide-and-Conquer Product

Using the split form:

$$ xy = (aB^m+b)(cB^m+d) $$

Expand:

$$ xy = acB^{2m} + adB^m + bcB^m + bd $$

Group the middle terms:

$$ xy = acB^{2m} + (ad+bc)B^m + bd $$

A straightforward divide-and-conquer implementation computes:

ac
ad
bc
bd

That requires four recursive multiplications.

The recurrence is:

$$ T(n)=4T(n/2)+O(n) $$

By the Master Theorem:

$$ T(n)=O(n^2) $$

This matches the grade-school algorithm asymptotically. Splitting alone has not improved the algorithm.

Karatsuba's Trick

Karatsuba reduces the four products to three.

Compute:

$$ z_2 = ac $$

$$ z_0 = bd $$

$$ z_1 = (a+b)(c+d) $$

Now observe:

$$ z_1 = ac + ad + bc + bd $$

So:

$$ ad+bc = z_1 - z_2 - z_0 $$

This recovers the middle term without computing (ad) and (bc) separately.

The final product is:

$$ xy = z_2B^{2m} + (z_1-z_2-z_0)B^m + z_0 $$

Only three recursive multiplications are needed.

Worked Example

Multiply:

x = 1234
y = 5678

Split at two decimal digits:

a = 12
b = 34

c = 56
d = 78

Compute:

z2 = a*c = 12*56 = 672
z0 = b*d = 34*78 = 2652
z1 = (a+b)*(c+d)
   = 46*134
   = 6164

Middle term:

z1 - z2 - z0
= 6164 - 672 - 2652
= 2840

Recombine with (B^m = 100):

z2 * 10000 + middle * 100 + z0
= 672 * 10000 + 2840 * 100 + 2652
= 7006652

Check:

1234 * 5678 = 7006652

Implementation Sketch

The following version uses decimal splitting for clarity. Production big-integer libraries usually split by machine words or fixed-size limbs.

func Karatsuba(x, y int64) int64 {
    if x < 10 || y < 10 {
        return x * y
    }

    n := max(numDigits(x), numDigits(y))
    m := n / 2

    pow := pow10(m)

    a := x / pow
    b := x % pow
    c := y / pow
    d := y % pow

    z2 := Karatsuba(a, c)
    z0 := Karatsuba(b, d)
    z1 := Karatsuba(a+b, c+d)

    middle := z1 - z2 - z0

    return z2*pow*pow + middle*pow + z0
}

Helper functions:

func numDigits(x int64) int {
    if x == 0 {
        return 1
    }

    count := 0
    for x > 0 {
        count++
        x /= 10
    }

    return count
}

func pow10(n int) int64 {
    result := int64(1)
    for i := 0; i < n; i++ {
        result *= 10
    }
    return result
}

func max(a, b int) int {
    if a > b {
        return a
    }
    return b
}

This implementation is pedagogical. It is correct for nonnegative numbers within int64 range, but it is not suitable for arbitrary-precision arithmetic.

Recurrence Analysis

Karatsuba performs:

Three recursive multiplications on half-size inputs
Linear work for splitting, addition, subtraction, and recombination

The recurrence is:

$$ T(n)=3T(n/2)+O(n) $$

Compute the critical term:

$$ n^{\log_2 3} $$

Since:

$$ \log_2 3 \approx 1.585 $$

we get:

$$ T(n)=O(n^{1.585}) $$

This is asymptotically faster than:

$$ O(n^2) $$

The saving comes entirely from reducing the number of recursive multiplications.

Why Three Beats Four

The difference between three and four recursive calls compounds over the recursion tree.

At depth (k):

direct divide and conquer has (4^k) subproblems
Karatsuba has (3^k) subproblems

At the leaf level, where (k=\log_2 n):

$$ 4^{\log_2 n}=n^2 $$

but:

$$ 3^{\log_2 n}=n^{\log_2 3} $$

This is why the algorithm improves the exponent.

Base Case Tuning

Karatsuba is not always faster for small numbers.

It performs extra additions, subtractions, and memory movement. For small inputs, ordinary grade-school multiplication is usually faster.

A practical implementation uses a cutoff:

if n <= threshold {
    return gradeSchoolMultiply(x, y)
}

The threshold depends on:

limb size
CPU cache behavior
allocation cost
compiler optimizations
representation format

High-quality big-integer libraries tune this cutoff empirically.

Handling Uneven Lengths

Real inputs rarely have exactly equal digit lengths.

Possible strategies:

Pad the shorter number with leading zeros.
Split using the larger input length.
Fall back to grade-school multiplication for very uneven inputs.

For example:

x = 123456789012
y = 345

Karatsuba is usually a poor choice here because most recursive work involves zeros or tiny subproblems.

Good implementations switch algorithms based on shape, not just size.

Signed Numbers

Karatsuba is easiest to implement for nonnegative integers.

For signed integers:

Record the sign of the result.
Multiply absolute values.
Reapply the sign.

negative := (x < 0) != (y < 0)
result := Karatsuba(abs(x), abs(y))

if negative {
    result = -result
}

The core algorithm should avoid mixing sign logic into the recursive multiplication itself.

Arbitrary-Precision Representation

Production big integers are not stored as decimal strings.

They are stored as arrays of limbs:

[low limb, ..., high limb]

Each limb stores a chunk of the number, often using base (2^{32}) or (2^{64}).

Karatsuba then splits the limb array rather than splitting decimal digits.

Conceptually:

x = high * base^m + low

remains unchanged.

Only the representation changes.

Common Mistakes

Computing Four Products

If you compute:

ac, ad, bc, bd

you are no longer using Karatsuba.

The algorithm depends on replacing (ad+bc) with:

(a+b)(c+d) - ac - bd

Recombining with the Wrong Shift

The term (z_2) must be shifted by (2m) digits or limbs.

The middle term must be shifted by (m).

Swapping these produces plausible but incorrect results.

Ignoring Carry Management

When numbers are stored as limb arrays, addition and subtraction must normalize carries and borrows.

Karatsuba's formula is simple; the representation bookkeeping is where many bugs appear.

Using It Too Early

Karatsuba has a higher constant factor than grade-school multiplication.

A correct implementation can still be slower if the cutoff is too low.

Overflow in Pedagogical Code

The int64 implementation shown above can overflow quickly.

Use it to understand the algorithm, not as a big-integer library.

Discussion

Karatsuba multiplication is an important divide-and-conquer lesson because it shows that splitting a problem is not enough. The improvement comes from algebraic reuse. By computing ((a+b)(c+d)), the algorithm extracts the two cross terms together and avoids one recursive multiplication.

This kind of transformation appears throughout algorithm design. Expensive operations are reduced by introducing a combined expression, then subtracting the parts already known. The result is not merely cleaner algebra; it changes the shape of the recursion tree.

Karatsuba also illustrates a practical boundary between asymptotic analysis and engineering. The (O(n^{1.585})) bound makes it superior for large inputs, but real implementations must choose cutoffs, manage memory, normalize limbs, and handle uneven input sizes. The algorithm is valuable both as a theoretical breakthrough and as a concrete technique used inside arbitrary-precision arithmetic systems.