# Bucket Sort

# Bucket Sort

Bucket sort partitions elements into a set of buckets based on their value, then sorts each bucket separately and concatenates the results.

It is effective when input values are uniformly distributed over a known range.

## Problem

Given an array $A$ of $n$ elements drawn from a range $[a, b)$, sort the elements.

## Idea

1. Divide the range into $k$ equal sized intervals
2. Place each element into its corresponding bucket
3. Sort each bucket
4. Concatenate buckets in order

Each bucket holds elements that fall within a subrange.

## Algorithm

```text id="k9p3dz"
bucket_sort(A, k):
    buckets = [empty list for _ in range(k)]

    for x in A:
        i = bucket_index(x, k)
        buckets[i].append(x)

    for i from 0 to k - 1:
        sort(buckets[i])

    return concatenate(buckets)
```

A common mapping for normalized values $x \in [0,1)$ is:

$$
i = \lfloor k \cdot x \rfloor
$$

## Example

Sort:

$$
A = [0.78, 0.17, 0.39, 0.26, 0.72, 0.94, 0.21, 0.12, 0.23, 0.68]
$$

Using $k = 5$ buckets:

| bucket | range      | elements               |
| ------ | ---------- | ---------------------- |
| 0      | [0.0, 0.2) | 0.17, 0.12             |
| 1      | [0.2, 0.4) | 0.39, 0.26, 0.21, 0.23 |
| 2      | [0.4, 0.6) | —                      |
| 3      | [0.6, 0.8) | 0.78, 0.72, 0.68       |
| 4      | [0.8, 1.0) | 0.94                   |

Sort each bucket:

$$
[0.12, 0.17],\ [0.21, 0.23, 0.26, 0.39],\ [],\ [0.68, 0.72, 0.78],\ [0.94]
$$

Concatenate:

$$
[0.12, 0.17, 0.21, 0.23, 0.26, 0.39, 0.68, 0.72, 0.78, 0.94]
$$

## Correctness

Elements in earlier buckets are smaller than elements in later buckets by construction. Sorting each bucket ensures internal order. Concatenation yields a globally sorted array.

## Complexity

Let:

* $n$ be number of elements
* $k$ be number of buckets

Expected time:

$$
O(n + k)
$$

if elements distribute evenly.

Worst case:

$$
O(n^2)
$$

when all elements fall into one bucket and the inner sort dominates.

Space:

$$
O(n + k)
$$

## Properties

| property         | value                  |
| ---------------- | ---------------------- |
| stable           | depends on inner sort  |
| in place         | no                     |
| comparison based | partly                 |
| adaptive         | yes, with distribution |

## When to Use

Use bucket sort when:

* input is uniformly distributed
* range is known
* approximate distribution is predictable
* expected linear time is desired

Avoid when data is highly skewed or clustered.

## Implementation (Python)

```python id="v8r1ka"
def bucket_sort(a, k=10):
    buckets = [[] for _ in range(k)]

    for x in a:
        i = int(k * x)
        if i == k:
            i = k - 1
        buckets[i].append(x)

    result = []
    for b in buckets:
        result.extend(sorted(b))

    return result
```

