# Most Significant Byte Sort

# Most Significant Byte Sort

Most Significant Byte sort is an MSD radix sort that processes fixed-width keys one byte at a time, starting from the most significant byte. It partitions the array into byte buckets and recursively sorts each bucket using the next byte.

It is effective for large datasets and for keys where higher-order bytes quickly distinguish values.

## Problem

Given an array $A$ of $n$ fixed-width keys (such as 32-bit or 64-bit integers), sort the array in increasing order.

Each key can be written as:

$$
x = d_{m-1} d_{m-2} \cdots d_1 d_0
$$

where each $d_i$ is a byte:

$$
0 \le d_i < 256
$$

## Idea

At byte position $p$:

1. count how many elements have each byte value
2. compute starting offsets for each byte
3. distribute elements into buckets
4. recursively sort each bucket with byte $p - 1$

Unlike LSD radix sort, this method divides the problem early based on high-order information.

## Algorithm

```text id="k3q7xp"
msb_sort(A, lo, hi, byte):
    if hi - lo <= 1 or byte < 0:
        return

    count = [0] * 256

    for i from lo to hi - 1:
        b = byte_at(A[i], byte)
        count[b] += 1

    start = [0] * 256
    start[0] = lo

    for i from 1 to 255:
        start[i] = start[i - 1] + count[i - 1]

    aux = copy of A[lo:hi]

    for x in aux:
        b = byte_at(x, byte)
        A[start[b]] = x
        start[b] += 1

    for i from 0 to 255:
        begin = start[i] - count[i]
        end = start[i]
        if end - begin > 1:
            msb_sort(A, begin, end, byte - 1)
```

## Example

Sort:

$$
[102, 15, 230, 78, 54, 12, 89]
$$

Using byte representation, the first pass groups by the most significant byte. Buckets are then recursively sorted by the next byte.

Final result:

$$
[12, 15, 54, 78, 89, 102, 230]
$$

## Correctness

At each level, all elements are partitioned into buckets based on the current byte. All elements in a lower byte bucket are smaller than those in a higher byte bucket.

Recursive sorting orders elements within each bucket using remaining bytes. Since all bytes are processed in decreasing significance, the final ordering matches the full key order.

## Complexity

Let:

* $n$ be number of elements
* $m$ be number of bytes

Time:

$$
O(n \cdot m)
$$

in typical cases

Worst case:

$$
O(n \cdot m)
$$

when many keys share long prefixes

Space:

$$
O(n + 256)
$$

due to auxiliary array and count arrays

## Properties

| property         | value |
| ---------------- | ----- |
| stable           | yes   |
| in place         | no    |
| comparison based | no    |
| direction        | MSD   |
| radix            | 256   |

## When to Use

Use MSB sort when:

* sorting large arrays of fixed-width integers
* high-order bytes distinguish elements early
* recursive partitioning reduces work

Avoid when:

* keys are short and LSD is simpler
* stability is not required and in-place methods are preferred
* recursion overhead is undesirable

## Implementation (Python)

```python id="d9k2pa"
def msb_sort(a):
    if not a:
        return a

    max_bits = max(a).bit_length()
    bytes_count = (max_bits + 7) // 8

    def byte_at(x, b):
        return (x >> (8 * b)) & 255

    def sort(lo, hi, b):
        if hi - lo <= 1 or b < 0:
            return

        count = [0] * 256

        for i in range(lo, hi):
            count[byte_at(a[i], b)] += 1

        start = [0] * 256
        start[0] = lo

        for i in range(1, 256):
            start[i] = start[i - 1] + count[i - 1]

        aux = a[lo:hi]

        for x in aux:
            idx = byte_at(x, b)
            a[start[idx]] = x
            start[idx] += 1

        for i in range(256):
            begin = start[i] - count[i]
            end = start[i]
            if end - begin > 1:
                sort(begin, end, b - 1)

    sort(0, len(a), bytes_count - 1)
    return a
```

