# Integer Sample Sort

# Integer Sample Sort

Integer sample sort is a specialization of sample sort for integer keys. It selects sample elements to estimate the distribution, chooses splitters, and partitions the array into buckets. Each bucket is then sorted recursively or with a suitable subroutine.

Because keys are integers, classification can use arithmetic or bit operations, and buckets can be balanced more precisely than in general comparison sorting.

## Problem

Given an array $A$ of $n$ integers, sort the array efficiently using sampling to guide partitioning.

## Idea

1. Select a random or stratified sample from $A$
2. Sort the sample
3. Choose splitters from the sample
4. Partition elements into buckets using splitters
5. Recursively sort each bucket

For integers, the partition step can be optimized with branchless comparisons or radix-like classification.

## Splitters

If $k$ buckets are used, choose $k - 1$ splitters:

$$
s_1 < s_2 < \cdots < s_{k-1}
$$

Buckets are defined as:

$$
B_0 = {x < s_1},\quad
B_i = {s_i \le x < s_{i+1}},\quad
B_{k-1} = {x \ge s_{k-1}}
$$

## Algorithm

```text id="x9m2qa"
integer_sample_sort(A):
    if len(A) <= threshold:
        insertion_sort(A)
        return A

    sample = pick_sample(A)
    sort(sample)

    splitters = choose_splitters(sample)

    buckets = [empty list for _ in range(k)]

    for x in A:
        i = bucket_index(x, splitters)
        buckets[i].append(x)

    result = []
    for b in buckets:
        result.extend(integer_sample_sort(b))

    return result
```

## Example

Sort:

$$
A = [91, 12, 43, 27, 65, 8, 50, 34]
$$

Sample:

$$
[12, 43, 65]
$$

Splitters:

$$
[30, 60]
$$

Buckets:

| bucket | range           | values     |
| ------ | --------------- | ---------- |
| 0      | $x < 30$        | 12, 27, 8  |
| 1      | $30 \le x < 60$ | 43, 50, 34 |
| 2      | $x \ge 60$      | 91, 65     |

Recursively sort each bucket and concatenate:

$$
[8, 12, 27, 34, 43, 50, 65, 91]
$$

## Correctness

Splitters partition the domain into ordered intervals. Every element belongs to exactly one interval. All elements in earlier buckets are less than those in later buckets. Recursive sorting ensures order within each bucket, so concatenation yields a sorted array.

## Complexity

Let:

* $n$ be the number of elements
* $k$ be the number of buckets

Expected time:

$$
O(n \log n)
$$

With well balanced buckets, depth is $O(\log_k n)$.

Worst case:

$$
O(n^2)
$$

if partitioning is highly unbalanced.

Space:

$$
O(n)
$$

for bucket storage.

## Properties

| property          | value |
| ----------------- | ----- |
| stable            | no    |
| in place          | no    |
| comparison based  | yes   |
| adaptive          | yes   |
| parallel friendly | yes   |

## When to Use

Use integer sample sort when:

* sorting large integer arrays
* parallelization is desired
* distribution is unknown but sampling can estimate it
* radix sort is not ideal due to key structure or constraints

Avoid when:

* small input sizes
* strict worst case guarantees are required
* memory usage must be minimal

## Notes

* Often combined with block partitioning for performance
* Can switch to radix or insertion sort for small buckets
* Splitter quality strongly affects performance

