Radix Sort

Radix sort sorts keys by decomposing them into smaller parts such as digits, bytes, or characters. Instead of comparing whole keys directly, it groups keys by one position at a time.

It is most useful when keys have fixed width or bounded length, such as integers, fixed length strings, byte arrays, or identifiers.

Problem

Given an array $A$ of $n$ keys, sort the keys by their full value using their digit representation.

For base $b$ , each key can be written as:

x = d_{m-1}d_{m-2}\cdots d_1d_0

where each digit satisfies:

0 \le d_i < b

Main Variants

Radix sort has two common directions.

variant	direction	typical use
LSD radix sort	least significant digit to most significant digit	fixed width integers and strings
MSD radix sort	most significant digit to least significant digit	strings, variable length keys, lexicographic sorting

LSD radix sort relies on a stable subroutine, usually stable counting sort. MSD radix sort recursively partitions by leading digits.

LSD Algorithm

radix_sort_lsd(A, digits, base):
    for p from 0 to digits - 1:
        stable_counting_sort_by_digit(A, p, base)

    return A

Here, position $0$ means the least significant digit.

Example

Sort:

[329, 457, 657, 839, 436, 720, 355]

Using decimal digits.

After sorting by ones digit:

[720, 355, 436, 457, 657, 329, 839]

After sorting by tens digit:

[720, 329, 436, 839, 355, 457, 657]

After sorting by hundreds digit:

[329, 355, 436, 457, 657, 720, 839]

The final array is sorted.

Correctness

For LSD radix sort, the stable pass on digit $p$ sorts the array by the lowest $p + 1$ digits. This holds by induction. The base case follows from sorting by the least significant digit. The induction step works because the stable sort orders by the new digit while preserving the previous order among equal new digits.

After all digit positions have been processed, the array is sorted by every digit, so it is sorted by the full key.

Complexity

Let:

$n$ be the number of keys
$m$ be the number of digit positions
$b$ be the base

Each stable counting pass costs:

O(n + b)

Total time:

O(m(n + b))

Space:

O(n + b)

For fixed width machine integers with constant $m$ and practical base $b$ , this is linear in $n$ .

When to Use

Use radix sort when:

keys are integers, bytes, strings, or fixed width records
digit extraction is cheap
the alphabet or digit range is bounded
stable sorting by individual positions is available

Avoid radix sort when keys are large objects with expensive digit extraction, highly variable encodings, or no natural bounded digit representation.

Implementation

def radix_sort(a):
    if not a:
        return a

    max_value = max(a)
    exp = 1
    base = 10

    while max_value // exp > 0:
        a = counting_sort_by_digit(a, exp, base)
        exp *= base

    return a

def counting_sort_by_digit(a, exp, base):
    count = [0] * base

    for x in a:
        digit = (x // exp) % base
        count[digit] += 1

    for i in range(1, base):
        count[i] += count[i - 1]

    out = [0] * len(a)

    for i in range(len(a) - 1, -1, -1):
        x = a[i]
        digit = (x // exp) % base
        count[digit] -= 1
        out[count[digit]] = x

    return out