Skip to content

Coordinate Compression Sort

Sort values by mapping them to a dense rank space and ordering by compressed indices.

Coordinate compression sort maps arbitrary values to a dense integer range based on their sorted order, then sorts using these compact indices. It is useful when values are large, sparse, or not directly suitable for counting or radix methods.

Problem

Given an array AA of values, possibly large integers or non-integer comparable items, sort the array efficiently by reducing the value domain.

Idea

Replace each value with its rank in the sorted order of distinct values.

Let:

V=sorted unique values of A V = \text{sorted unique values of } A

Define mapping:

rank(x)=index of x in V \text{rank}(x) = \text{index of } x \text{ in } V

Then:

  1. map each value to its rank
  2. sort by ranks
  3. optionally map back to original values

Algorithm

coordinate_compression_sort(A):
    V = sorted unique values of A

    create map from value to index

    B = array of size n

    for i from 0 to n-1:
        B[i] = map[A[i]]

    sort B

    for i from 0 to n-1:
        A[i] = V[B[i]]

    return A

Example

Let:

A=[1000,5000,1000,2000] A = [1000, 5000, 1000, 2000]

Unique sorted values:

V=[1000,2000,5000] V = [1000, 2000, 5000]

Mapping:

valuerank
10000
20001
50002

Mapped array:

[0,2,0,1] [0, 2, 0, 1]

Sorted:

[0,0,1,2] [0, 0, 1, 2]

Mapped back:

[1000,1000,2000,5000] [1000, 1000, 2000, 5000]

Correctness

The mapping preserves order:

x1<x2    rank(x1)<rank(x2) x_1 < x_2 \iff \text{rank}(x_1) < \text{rank}(x_2)

Thus sorting compressed values yields the correct order of original values.

Complexity

Let:

  • nn be number of elements
  • uu be number of distinct values

Time:

  • building VV: O(nlogn)O(n \log n)
  • mapping: O(n)O(n)
  • sorting compressed array: O(nlogn)O(n \log n)

Total:

O(nlogn) O(n \log n)

If combined with counting or radix sort after compression:

O(n+u) O(n + u)

Space:

O(n+u) O(n + u)

Properties

propertyvalue
stabledepends on underlying sort
in placeno
comparison basedpartly
handles large valuesyes

When to Use

Use coordinate compression when:

  • values are large or sparse
  • you want to apply counting or radix sort afterward
  • relative ordering matters but exact values are not needed during sorting

Avoid when:

  • values are already small integers
  • direct sorting is simpler

Implementation

def coordinate_compression_sort(a):
    if not a:
        return a

    vals = sorted(set(a))
    rank = {v: i for i, v in enumerate(vals)}

    comp = [rank[x] for x in a]
    comp.sort()

    return [vals[i] for i in comp]