Skip to content

Bucket Sort

Distribute elements into buckets based on value ranges and sort each bucket independently.

Bucket sort partitions elements into a set of buckets based on their value, then sorts each bucket separately and concatenates the results.

It is effective when input values are uniformly distributed over a known range.

Problem

Given an array AA of nn elements drawn from a range [a,b)[a, b), sort the elements.

Idea

  1. Divide the range into kk equal sized intervals
  2. Place each element into its corresponding bucket
  3. Sort each bucket
  4. Concatenate buckets in order

Each bucket holds elements that fall within a subrange.

Algorithm

bucket_sort(A, k):
    buckets = [empty list for _ in range(k)]

    for x in A:
        i = bucket_index(x, k)
        buckets[i].append(x)

    for i from 0 to k - 1:
        sort(buckets[i])

    return concatenate(buckets)

A common mapping for normalized values x[0,1)x \in [0,1) is:

i=kx i = \lfloor k \cdot x \rfloor

Example

Sort:

A=[0.78,0.17,0.39,0.26,0.72,0.94,0.21,0.12,0.23,0.68] A = [0.78, 0.17, 0.39, 0.26, 0.72, 0.94, 0.21, 0.12, 0.23, 0.68]

Using k=5k = 5 buckets:

bucketrangeelements
0[0.0, 0.2)0.17, 0.12
1[0.2, 0.4)0.39, 0.26, 0.21, 0.23
2[0.4, 0.6)
3[0.6, 0.8)0.78, 0.72, 0.68
4[0.8, 1.0)0.94

Sort each bucket:

[0.12,0.17], [0.21,0.23,0.26,0.39], [], [0.68,0.72,0.78], [0.94] [0.12, 0.17],\ [0.21, 0.23, 0.26, 0.39],\ [],\ [0.68, 0.72, 0.78],\ [0.94]

Concatenate:

[0.12,0.17,0.21,0.23,0.26,0.39,0.68,0.72,0.78,0.94] [0.12, 0.17, 0.21, 0.23, 0.26, 0.39, 0.68, 0.72, 0.78, 0.94]

Correctness

Elements in earlier buckets are smaller than elements in later buckets by construction. Sorting each bucket ensures internal order. Concatenation yields a globally sorted array.

Complexity

Let:

  • nn be number of elements
  • kk be number of buckets

Expected time:

O(n+k) O(n + k)

if elements distribute evenly.

Worst case:

O(n2) O(n^2)

when all elements fall into one bucket and the inner sort dominates.

Space:

O(n+k) O(n + k)

Properties

propertyvalue
stabledepends on inner sort
in placeno
comparison basedpartly
adaptiveyes, with distribution

When to Use

Use bucket sort when:

  • input is uniformly distributed
  • range is known
  • approximate distribution is predictable
  • expected linear time is desired

Avoid when data is highly skewed or clustered.

Implementation (Python)

def bucket_sort(a, k=10):
    buckets = [[] for _ in range(k)]

    for x in a:
        i = int(k * x)
        if i == k:
            i = k - 1
        buckets[i].append(x)

    result = []
    for b in buckets:
        result.extend(sorted(b))

    return result