# Multikey Quicksort

# Multikey Quicksort

Multikey quicksort is a string sorting algorithm that combines quicksort partitioning with radix-style character inspection. Instead of comparing whole strings each time, it partitions strings by the character at a current depth.

It is especially effective for strings with shared prefixes because it avoids repeatedly comparing characters that are already known to be equal.

## Problem

Given an array $A$ of strings, sort the strings in lexicographic order.

For a string $s$, let:

$$
s[d]
$$

be the character at position $d$. If $d$ reaches the end of the string, use a sentinel value smaller than every real character.

## Idea

At depth $d$, choose a pivot character from one string:

$$
v = s[d]
$$

Partition the array into three parts:

| region | condition                            |
| ------ | ------------------------------------ |
| left   | character at $d$ is less than $v$    |
| middle | character at $d$ equals $v$          |
| right  | character at $d$ is greater than $v$ |

Then:

* sort the left region at the same depth
* sort the middle region at depth $d + 1$
* sort the right region at the same depth

## Algorithm

```text id="x2p8qm"
multikey_quicksort(A, lo, hi, d):
    if hi - lo <= 1:
        return

    pivot = char_at(A[lo], d)

    lt = lo
    i = lo + 1
    gt = hi - 1

    while i <= gt:
        c = char_at(A[i], d)

        if c < pivot:
            swap A[lt], A[i]
            lt += 1
            i += 1
        else if c > pivot:
            swap A[i], A[gt]
            gt -= 1
        else:
            i += 1

    multikey_quicksort(A, lo, lt, d)

    if pivot is not sentinel:
        multikey_quicksort(A, lt, gt + 1, d + 1)

    multikey_quicksort(A, gt + 1, hi, d)
```

## Example

Sort:

$$
["car", "cat", "dog", "cart"]
$$

At depth 0, pivot character is `c`.

| region           | strings        |
| ---------------- | -------------- |
| equal to `c`     | car, cat, cart |
| greater than `c` | dog            |

The middle region is recursively sorted at depth 1. Later, at depth 2, `car` and `cart` share the same prefix. The sentinel places `car` before `cart`.

Final result:

$$
["car", "cart", "cat", "dog"]
$$

## Correctness

At each recursive call, the partition step separates strings by their character at depth $d$. Strings in the left region are lexicographically smaller than strings in the middle and right regions at the first differing character. Strings in the right region are lexicographically larger.

Strings in the middle region share the same character at depth $d$, so their order depends on later characters. Recursing on depth $d + 1$ correctly sorts that region. By induction over subarray size and depth, the full array becomes lexicographically sorted.

## Complexity

Let:

* $n$ be the number of strings
* $L$ be the average string length

Expected time:

$$
O(nL)
$$

Worst case:

$$
O(n^2 + nL)
$$

with poor pivot choices.

Auxiliary space:

$$
O(\log n + L)
$$

expected recursion stack, depending on partition balance.

## Properties

| property         | value         |
| ---------------- | ------------- |
| stable           | no            |
| in place         | yes           |
| comparison based | hybrid        |
| direction        | left to right |
| best for         | string keys   |

## When to Use

Use multikey quicksort when sorting strings with common prefixes and when in-place operation is useful. It is a strong general-purpose string sort because it combines radix awareness with quicksort-style partitioning.

Avoid it when stability is required or when adversarial inputs can cause poor pivot behavior unless randomized or median-based pivots are used.

## Implementation

```python id="q7m2za"
def multikey_quicksort(a):
    def char_at(s, d):
        if d == len(s):
            return -1
        return ord(s[d])

    def sort(lo, hi, d):
        if hi - lo <= 1:
            return

        pivot = char_at(a[lo], d)
        lt = lo
        i = lo + 1
        gt = hi - 1

        while i <= gt:
            c = char_at(a[i], d)

            if c < pivot:
                a[lt], a[i] = a[i], a[lt]
                lt += 1
                i += 1
            elif c > pivot:
                a[i], a[gt] = a[gt], a[i]
                gt -= 1
            else:
                i += 1

        sort(lo, lt, d)

        if pivot >= 0:
            sort(lt, gt + 1, d + 1)

        sort(gt + 1, hi, d)

    sort(0, len(a), 0)
    return a
```

