Skip to content

Three Way String Quicksort

String sorting algorithm that uses three-way partitioning on characters to efficiently handle repeated prefixes.

Three way string quicksort is a refinement of multikey quicksort that uses three-way partitioning to efficiently handle repeated characters. It splits the array into less than, equal to, and greater than partitions based on the character at a given position.

It is one of the most practical general-purpose string sorting algorithms.

Problem

Given an array AA of strings, sort them in lexicographic order.

For string ss, define:

s[d] s[d]

as the character at position dd. If dd exceeds the string length, use a sentinel value smaller than all characters.

Idea

At depth dd:

  • pick pivot character v=A[lo][d]v = A[lo][d]
  • partition into three regions:
regioncondition
leftcharacter < vv
middlecharacter = vv
rightcharacter > vv

Then recursively:

  • sort left region at same depth
  • sort middle region at depth d+1d + 1
  • sort right region at same depth

This reduces redundant comparisons when many strings share prefixes.

Algorithm

three_way_string_quicksort(A, lo, hi, d):
    if hi - lo <= 1:
        return

    pivot = char_at(A[lo], d)

    lt = lo
    i = lo + 1
    gt = hi - 1

    while i <= gt:
        c = char_at(A[i], d)

        if c < pivot:
            swap A[lt], A[i]
            lt += 1
            i += 1
        else if c > pivot:
            swap A[i], A[gt]
            gt -= 1
        else:
            i += 1

    three_way_string_quicksort(A, lo, lt, d)

    if pivot is not sentinel:
        three_way_string_quicksort(A, lt, gt + 1, d + 1)

    three_way_string_quicksort(A, gt + 1, hi, d)

Example

Sort:

["she","sells","sea","shells","by","the"] ["she", "sells", "sea", "shells", "by", "the"]

At depth 0:

regionstrings
bby
sshe, sells, sea, shells
tthe

The s group is recursively sorted at depth 1. Shared prefixes like sh and se are handled efficiently.

Final result:

["by","sea","sells","she","shells","the"] ["by", "sea", "sells", "she", "shells", "the"]

Correctness

At each recursion level, partitioning ensures that all strings in the left region are lexicographically smaller than those in the middle and right regions at the current character position.

Strings in the middle region share the same character, so their order depends on the next position. Recursive sorting at depth d+1d + 1 resolves this. By induction on subarray size and depth, the algorithm produces a lexicographically sorted array.

Complexity

Let:

  • nn be the number of strings
  • LL be the average string length

Expected time:

O(nL) O(nL)

Worst case:

O(n2) O(n^2)

for poor pivot choices.

Space:

O(logn+L) O(\log n + L)

due to recursion.

Properties

propertyvalue
stableno
in placeyes
comparison basedhybrid
efficient for prefixesyes

When to Use

Use three way string quicksort when:

  • sorting strings with many shared prefixes
  • memory usage must remain low
  • performance should be close to linear

Avoid when:

  • stability is required
  • worst case guarantees are critical
  • input is adversarial without randomization

Implementation

def three_way_string_quicksort(a):
    def char_at(s, d):
        if d == len(s):
            return -1
        return ord(s[d])

    def sort(lo, hi, d):
        if hi - lo <= 1:
            return

        pivot = char_at(a[lo], d)
        lt, i, gt = lo, lo + 1, hi - 1

        while i <= gt:
            c = char_at(a[i], d)

            if c < pivot:
                a[lt], a[i] = a[i], a[lt]
                lt += 1
                i += 1
            elif c > pivot:
                a[i], a[gt] = a[gt], a[i]
                gt -= 1
            else:
                i += 1

        sort(lo, lt, d)

        if pivot >= 0:
            sort(lt, gt + 1, d + 1)

        sort(gt + 1, hi, d)

    sort(0, len(a), 0)
    return a