Radix sort that processes fixed-width keys one byte at a time using 256 counting buckets.
Bytewise radix sort sorts fixed-width keys by processing one 8-bit byte at a time. Since each byte has 256 possible values, every pass uses a counting array of size 256.
It is a practical form of radix sort for machine integers, binary records, and fixed-size keys because byte extraction is cheap and bucket arrays fit easily in cache.
Problem
Given an array of fixed-width unsigned integers, sort the array in increasing order.
For a key with bytes, write it as:
where each byte satisfies:
Idea
Use stable counting sort on each byte position.
For LSD bytewise radix sort, process bytes from least significant to most significant:
Stability preserves the ordering established by earlier byte passes.
Algorithm
bytewise_radix_sort(A, bytes):
for p from 0 to bytes - 1:
stable_counting_sort_by_byte(A, p)
return AThe byte at position is extracted by:
Example
Sort 16-bit integers:
Byte representation:
| value | high byte | low byte |
|---|---|---|
| 513 | 2 | 1 |
| 256 | 1 | 0 |
| 258 | 1 | 2 |
| 1 | 0 | 1 |
After sorting by low byte:
After sorting by high byte:
Correctness
After byte pass , the array is sorted by the suffix of bytes:
The base case holds after the first stable counting pass. For the induction step, sorting by byte places keys in order by the new byte while preserving order by all lower bytes among equal byte values.
After all byte positions have been processed, the array is sorted by the complete key.
Complexity
Let:
- be the number of keys
- be the number of bytes per key
Each pass costs:
Total time:
For fixed-width integers, is constant, so the time is linear in .
Space:
Properties
| property | value |
|---|---|
| stable | yes |
| in place | no |
| comparison based | no |
| radix | 256 |
| typical use | fixed-width integer keys |
When to Use
Use bytewise radix sort when sorting large arrays of fixed-width unsigned integers or binary keys. It is usually faster than comparison sorting when keys are cheap to inspect and the auxiliary output buffer is acceptable.
Avoid it for small arrays, variable-length strings, or signed values unless the signed ordering is handled explicitly.
Implementation
def bytewise_radix_sort(a, byte_count=None):
if not a:
return a
if byte_count is None:
byte_count = max(1, (max(a).bit_length() + 7) // 8)
for p in range(byte_count):
count = [0] * 256
for x in a:
b = (x >> (8 * p)) & 255
count[b] += 1
for i in range(1, 256):
count[i] += count[i - 1]
out = [0] * len(a)
for i in range(len(a) - 1, -1, -1):
x = a[i]
b = (x >> (8 * p)) & 255
count[b] -= 1
out[count[b]] = x
a = out
return a