6.8 Radix Sort

Radix sort sorts keys one digit at a time. It avoids direct comparisons between full keys by repeatedly applying a stable sort to smaller key components. For integers, those components are usually digits in base 10, base 256, or another fixed radix.

Problem

You have an array of nonnegative integers. You want to sort them in linear time relative to the number of elements and the number of digits, assuming each digit comes from a small bounded range.

Solution

Process digits from least significant to most significant. At each digit position, use a stable sorting method, usually counting sort.

radix_sort(A, base):
  max_value = maximum(A)
  exp = 1

  while max_value / exp > 0:
    stable_counting_sort_by_digit(A, exp, base)
    exp = exp * base

The digit at position exp is:

digit(x, exp, base) = floor(x / exp) mod base

For base 10, the first pass sorts by ones digit, the second by tens digit, the third by hundreds digit, and so on.

Example

Input:

[170, 45, 75, 90, 802, 24, 2, 66]

Sort by ones digit:

[170, 90, 802, 2, 24, 45, 75, 66]

Sort by tens digit:

[802, 2, 24, 45, 66, 170, 75, 90]

Sort by hundreds digit:

[2, 24, 45, 66, 75, 90, 170, 802]

The final array is sorted.

Why Stability Matters

Radix sort depends on stability. After sorting by a less significant digit, later passes must preserve the ordering already established among equal higher-digit groups.

Suppose two numbers have the same tens digit. Their order after the tens pass should still reflect the ones digit order created earlier. A stable digit sort preserves that information. An unstable digit sort can destroy it, making the final result incorrect.

For least-significant-digit radix sort, the invariant after processing digit positions 0 through d is:

The array is sorted by the lowest d + 1 digits.

The next stable pass groups elements by digit d + 1 while preserving the order from the lower digits inside each group. Therefore the array becomes sorted by the lowest d + 2 digits.

Correctness

Radix sort is correct by induction on digit position.

After the first pass, the array is sorted by the least significant digit. Assume that after pass d, the array is sorted by the lowest d + 1 digits. During pass d + 1, the stable digit sort orders elements by the next digit. For elements with equal next digit, stability preserves the order created by earlier passes. Therefore the array is sorted by the lowest d + 2 digits.

After all digit positions have been processed, the array is sorted by all digits. Therefore it is sorted by numeric value.

The permutation condition holds because each digit pass is a stable sorting pass, which moves elements without creating or deleting them.

Complexity

Let:

n = number of elements
b = base
d = number of digits in the largest key, measured in base b

Each counting-sort pass costs:

O(n + b)

There are d passes, so the total running time is:

O(d(n + b))

The auxiliary space is:

O(n + b)

For fixed-width integers and fixed base, d and b can be treated as constants, giving linear time in practice:

O(n)

This statement depends on the machine model. Sorting arbitrary-length integers requires accounting for the number of digits.

Choosing a Base

The base controls the tradeoff between the number of passes and the size of the counting array.

A small base uses little memory but requires more passes:

base 10: many passes, small count array

A larger base reduces the number of passes but increases memory use:

base 256: one byte per pass
base 65536: two bytes per pass

For fixed-width machine integers, base 256 is common because it aligns with bytes and keeps the counting array small.

Handling Negative Integers

The simple version assumes nonnegative keys. To sort signed integers, use one of these approaches:

separate negative and nonnegative values
shift all values by subtracting the minimum value
reinterpret signed integers with an order-preserving transform

The shift method works when the adjusted range does not overflow. For fixed-width signed integers, careful bit-level handling is safer.

Most-Significant-Digit Radix Sort

The version above is least-significant-digit radix sort. It processes digits from right to left.

Most-significant-digit radix sort processes digits from left to right. It partitions elements by the current digit, then recursively sorts each bucket by the next digit.

msd_radix_sort(A, digit):
  if length(A) <= 1 or digit < 0:
    return

  partition A into buckets by digit
  recursively sort each bucket by digit - 1

MSD radix sort can stop early when prefixes differ enough to determine order. It is useful for strings, where not every key has the same length. However, it is more complex because it must manage variable-length keys and recursive buckets.

Implementation Notes

Use radix sort when keys have a fixed, decomposable representation and stability or linear-time behavior matters. It works well for integers, byte strings, fixed-length identifiers, and records with bounded numeric keys.

For general-purpose object sorting, comparison sorting is usually simpler and more flexible. Radix sort needs a key extraction function, a fixed digit model, and enough memory for the auxiliary array.

A practical integer implementation usually has this shape:

radix_sort_u32(A):
  for shift in [0, 8, 16, 24]:
    stable_counting_sort_by_byte(A, shift)

Each pass extracts one byte:

byte = (x >> shift) & 255

This gives four passes for 32-bit unsigned integers.

Common Bugs

A common bug is using an unstable digit sort. The result may look partly sorted after each pass but fail on inputs where lower digits must break ties.

Another bug is stopping too early. The algorithm must process all digit positions needed by the maximum key.

A third bug is using decimal digits in performance-critical code. Decimal radix is easy to explain, but byte-based radix is usually faster on machines.

A fourth bug is ignoring negative values. Standard digit extraction does not preserve numeric order for signed integers unless the representation is handled carefully.

Takeaway

Radix sort gains speed by sorting fixed-size pieces of a key rather than comparing whole keys. Its correctness depends on stable digit sorting. It is strongest when keys are bounded, decomposable, and stored in a machine-friendly representation.