Distribute elements into buckets based on value ranges and sort each bucket independently.
Bucket sort partitions elements into a set of buckets based on their value, then sorts each bucket separately and concatenates the results.
It is effective when input values are uniformly distributed over a known range.
Problem
Given an array of elements drawn from a range , sort the elements.
Idea
- Divide the range into equal sized intervals
- Place each element into its corresponding bucket
- Sort each bucket
- Concatenate buckets in order
Each bucket holds elements that fall within a subrange.
Algorithm
bucket_sort(A, k):
buckets = [empty list for _ in range(k)]
for x in A:
i = bucket_index(x, k)
buckets[i].append(x)
for i from 0 to k - 1:
sort(buckets[i])
return concatenate(buckets)A common mapping for normalized values is:
Example
Sort:
Using buckets:
| bucket | range | elements |
|---|---|---|
| 0 | [0.0, 0.2) | 0.17, 0.12 |
| 1 | [0.2, 0.4) | 0.39, 0.26, 0.21, 0.23 |
| 2 | [0.4, 0.6) | — |
| 3 | [0.6, 0.8) | 0.78, 0.72, 0.68 |
| 4 | [0.8, 1.0) | 0.94 |
Sort each bucket:
Concatenate:
Correctness
Elements in earlier buckets are smaller than elements in later buckets by construction. Sorting each bucket ensures internal order. Concatenation yields a globally sorted array.
Complexity
Let:
- be number of elements
- be number of buckets
Expected time:
if elements distribute evenly.
Worst case:
when all elements fall into one bucket and the inner sort dominates.
Space:
Properties
| property | value |
|---|---|
| stable | depends on inner sort |
| in place | no |
| comparison based | partly |
| adaptive | yes, with distribution |
When to Use
Use bucket sort when:
- input is uniformly distributed
- range is known
- approximate distribution is predictable
- expected linear time is desired
Avoid when data is highly skewed or clustered.
Implementation (Python)
def bucket_sort(a, k=10):
buckets = [[] for _ in range(k)]
for x in a:
i = int(k * x)
if i == k:
i = k - 1
buckets[i].append(x)
result = []
for b in buckets:
result.extend(sorted(b))
return result