Counting sort variant that stores counts only for keys that appear, avoiding dense arrays over large key ranges.
Sparse key counting sort handles large key ranges by storing counts only for keys that appear in the input. It replaces the dense count array of counting sort with a map from key to frequency.
This is useful when the key universe is huge but the number of distinct keys is small.
Problem
Given an array of keys drawn from a large domain, sort when only distinct keys appear and or .
Idea
Use a dictionary to count frequencies:
Then sort the distinct keys and emit each key according to its count.
Algorithm
sparse_key_counting_sort(A):
H = empty map
for x in A:
H[x] += 1
keys = sorted keys of H
i = 0
for key in keys:
repeat H[key] times:
A[i] = key
i += 1
return AExample
Let:
Frequency map:
| key | count |
|---|---|
| 5 | 2 |
| 42 | 1 |
| 1000000 | 2 |
Sorted keys:
Output:
Correctness
The frequency map records the exact multiplicity of each key. Sorting the distinct keys gives the correct key order. Emitting each key exactly its recorded number of times produces a sorted array containing exactly the original multiset.
Complexity
Let:
- be the number of elements
- be the number of distinct keys
Counting time:
Sorting distinct keys:
Total time:
Space:
Comparison with Dense Counting Sort
| method | best domain | time | space |
|---|---|---|---|
| dense counting sort | small contiguous range | ||
| sparse key counting sort | large sparse range |
Sparse key counting sort is better when is large and is small.
Properties
| property | value |
|---|---|
| stable | no |
| in place | mostly, excluding map |
| comparison based | only among distinct keys |
| handles sparse keys | yes |
When to Use
Use sparse key counting sort when:
- the key range is large
- few distinct keys appear
- counting frequencies is natural
- dense counting arrays would waste memory
Avoid it when all or most keys are distinct, since sorting distinct keys gives the same asymptotic cost as ordinary comparison sort.
Implementation
from collections import Counter
def sparse_key_counting_sort(a):
count = Counter(a)
i = 0
for key in sorted(count):
for _ in range(count[key]):
a[i] = key
i += 1
return a