# Adaptive Samplesort

# Adaptive Samplesort

Adaptive samplesort is a samplesort variant that chooses splitters from the input and adjusts partitioning to the observed distribution. It is useful when the input has skew, duplicates, or uneven key ranges.

You use it when ordinary quicksort or fixed bucket sorting may create badly imbalanced partitions.

## Problem

Given an array ( A ) of length ( n ), sort it by dividing values into buckets whose boundaries are chosen from a sample of the input.

The goal is to make buckets roughly balanced, even when the key distribution is not uniform.

## Idea

Take a sample from the input, sort the sample, and choose splitters from it. The splitters define buckets. Each element is assigned to a bucket, and each bucket is sorted recursively or with another local method.

Adaptive versions refine this by:

* oversampling when skew is likely
* handling many equal keys specially
* changing bucket count by input size
* falling back when partitions become imbalanced

## Algorithm

```id="k8p2z5"
adaptive_samplesort(A):
    if length(A) <= small_limit:
        insertion_sort(A)
        return A

    sample = choose_sample(A)
    sort(sample)

    splitters = choose_splitters(sample)
    buckets = create_empty_buckets(length(splitters) + 1)

    for x in A:
        b = find_bucket(splitters, x)
        buckets[b].append(x)

    for each bucket in buckets:
        if bucket is too_large_or_skewed:
            adaptive_samplesort(bucket)
        else:
            local_sort(bucket)

    return concatenate(buckets)
```

## Example

Input:

[
A = [42, 3, 9, 42, 7, 100, 42, 5, 1, 88]
]

A sample might produce splitters:

[
[7, 42]
]

Buckets:

| bucket | condition        | values                |
| ------ | ---------------- | --------------------- |
| 1      | ( x < 7 )        | [3, 5, 1]             |
| 2      | ( 7 \le x < 42 ) | [9, 7]                |
| 3      | ( x \ge 42 )     | [42, 100, 42, 42, 88] |

Each bucket is sorted, then concatenated.

## Correctness

Every splitter defines an ordered boundary. All elements in an earlier bucket are less than or equal to all elements in later buckets, assuming bucket assignment uses consistent boundary rules.

If each bucket is sorted correctly, concatenating the buckets produces the full sorted order.

## Complexity

| case                                | time                                                |
| ----------------------------------- | --------------------------------------------------- |
| balanced buckets                    | ( O(n \log n) )                                     |
| good distribution with many buckets | close to ( O(n) ) partition work plus local sorting |
| worst case without fallback         | can degrade                                         |
| with fallback                       | ( O(n \log n) )                                     |

Space depends on bucket storage:

[
O(n)
]

## Handling Duplicates

Many duplicate keys can overload one bucket. Adaptive samplesort often treats splitter-equal values as a separate equal bucket.

```id="q5m9t2"
if x < splitter:
    put in left bucket
else if x == splitter:
    put in equal bucket
else:
    put in right bucket
```

This avoids repeatedly sorting identical values.

## When to Use

Use adaptive samplesort when:

* keys are unevenly distributed
* duplicate keys are common
* parallel partitioning is useful
* cache-friendly bucket processing matters

It is common in high-performance sorting systems where partition balance is more important than simple implementation.

