Skip to content

Histogram Sort

Sort elements by building a frequency histogram and reconstructing the output from cumulative counts.

Histogram sort constructs a frequency histogram over the key domain and uses it to reconstruct the sorted sequence. It generalizes counting sort to cases where the domain may be sparse or where counts are derived from a histogram structure.

It appears in data processing pipelines where histograms are already computed or where frequency aggregation is natural.

Problem

Given an array AA of nn elements with keys drawn from a domain DD, produce a sorted array.

Idea

  1. Build a histogram mapping each key to its frequency
  2. Convert the histogram into cumulative counts
  3. Reconstruct the sorted array using counts

Unlike simple counting sort, histogram sort often uses a map or compressed structure when the key domain is large or sparse.

Algorithm

histogram_sort(A):
    H = empty map

    for x in A:
        H[x] += 1

    keys = sorted list of keys in H

    result = []

    for key in keys:
        for i from 1 to H[key]:
            append key to result

    return result

Example

Let:

A=[4,2,2,8,3,3,1] A = [4, 2, 2, 8, 3, 3, 1]

Histogram:

keycount
11
22
32
41
81

Sorted output:

[1,2,2,3,3,4,8] [1, 2, 2, 3, 3, 4, 8]

Correctness

The histogram records exact multiplicities of each key. Iterating over keys in sorted order and emitting each key according to its frequency produces a sorted sequence with the correct number of occurrences.

Complexity

Let:

  • nn be number of elements
  • uu be number of distinct keys

Time:

O(n+ulogu) O(n + u \log u)

The logu\log u factor arises from sorting the distinct keys.

Space:

O(u) O(u)

Comparison with Counting Sort

aspectcounting sorthistogram sort
domaindensesparse
storagearraymap
key iterationdirect indexsorted keys
complexityO(n+k)O(n + k)O(n+ulogu)O(n + u \log u)

Histogram sort is preferred when the domain size kk is large but the number of distinct values uu is small.

Properties

propertyvalue
stableno
in placeno
comparison basedpartially
domain handlingsparse friendly

When to Use

Use histogram sort when:

  • key domain is large or sparse
  • number of distinct values is small
  • frequency counting is already required
  • memory for dense arrays is impractical

Avoid when stability is required or when domain is dense and small, where counting sort is more efficient.

Implementation (Python)

from collections import Counter

def histogram_sort(a):
    H = Counter(a)
    result = []

    for key in sorted(H.keys()):
        result.extend([key] * H[key])

    return result