# Three Way String Quicksort

# Three Way String Quicksort

Three way string quicksort is a refinement of multikey quicksort that uses three-way partitioning to efficiently handle repeated characters. It splits the array into less than, equal to, and greater than partitions based on the character at a given position.

It is one of the most practical general-purpose string sorting algorithms.

## Problem

Given an array $A$ of strings, sort them in lexicographic order.

For string $s$, define:

$$
s[d]
$$

as the character at position $d$. If $d$ exceeds the string length, use a sentinel value smaller than all characters.

## Idea

At depth $d$:

* pick pivot character $v = A[lo][d]$
* partition into three regions:

| region | condition       |
| ------ | --------------- |
| left   | character < $v$ |
| middle | character = $v$ |
| right  | character > $v$ |

Then recursively:

* sort left region at same depth
* sort middle region at depth $d + 1$
* sort right region at same depth

This reduces redundant comparisons when many strings share prefixes.

## Algorithm

```text id="t9p3qx"
three_way_string_quicksort(A, lo, hi, d):
    if hi - lo <= 1:
        return

    pivot = char_at(A[lo], d)

    lt = lo
    i = lo + 1
    gt = hi - 1

    while i <= gt:
        c = char_at(A[i], d)

        if c < pivot:
            swap A[lt], A[i]
            lt += 1
            i += 1
        else if c > pivot:
            swap A[i], A[gt]
            gt -= 1
        else:
            i += 1

    three_way_string_quicksort(A, lo, lt, d)

    if pivot is not sentinel:
        three_way_string_quicksort(A, lt, gt + 1, d + 1)

    three_way_string_quicksort(A, gt + 1, hi, d)
```

## Example

Sort:

$$
["she", "sells", "sea", "shells", "by", "the"]
$$

At depth 0:

| region | strings                 |
| ------ | ----------------------- |
| b      | by                      |
| s      | she, sells, sea, shells |
| t      | the                     |

The `s` group is recursively sorted at depth 1. Shared prefixes like `sh` and `se` are handled efficiently.

Final result:

$$
["by", "sea", "sells", "she", "shells", "the"]
$$

## Correctness

At each recursion level, partitioning ensures that all strings in the left region are lexicographically smaller than those in the middle and right regions at the current character position.

Strings in the middle region share the same character, so their order depends on the next position. Recursive sorting at depth $d + 1$ resolves this. By induction on subarray size and depth, the algorithm produces a lexicographically sorted array.

## Complexity

Let:

* $n$ be the number of strings
* $L$ be the average string length

Expected time:

$$
O(nL)
$$

Worst case:

$$
O(n^2)
$$

for poor pivot choices.

Space:

$$
O(\log n + L)
$$

due to recursion.

## Properties

| property               | value  |
| ---------------------- | ------ |
| stable                 | no     |
| in place               | yes    |
| comparison based       | hybrid |
| efficient for prefixes | yes    |

## When to Use

Use three way string quicksort when:

* sorting strings with many shared prefixes
* memory usage must remain low
* performance should be close to linear

Avoid when:

* stability is required
* worst case guarantees are critical
* input is adversarial without randomization

## Implementation

```python id="m2q7vz"
def three_way_string_quicksort(a):
    def char_at(s, d):
        if d == len(s):
            return -1
        return ord(s[d])

    def sort(lo, hi, d):
        if hi - lo <= 1:
            return

        pivot = char_at(a[lo], d)
        lt, i, gt = lo, lo + 1, hi - 1

        while i <= gt:
            c = char_at(a[i], d)

            if c < pivot:
                a[lt], a[i] = a[i], a[lt]
                lt += 1
                i += 1
            elif c > pivot:
                a[i], a[gt] = a[gt], a[i]
                gt -= 1
            else:
                i += 1

        sort(lo, lt, d)

        if pivot >= 0:
            sort(lt, gt + 1, d + 1)

        sort(gt + 1, hi, d)

    sort(0, len(a), 0)
    return a
```

