15.14 Counting Inversions
You need to measure how far an array is from being sorted.
15.14 Counting Inversions
Problem
You need to measure how far an array is from being sorted.
Given an array:
[2, 4, 1, 3, 5]
an inversion is a pair of indexes:
$$ i < j $$
such that:
$$ a[i] > a[j] $$
In this array, the inversions are:
(2, 1)
(4, 1)
(4, 3)
So the inversion count is:
3
A sorted array has zero inversions.
A reverse-sorted array has the maximum number of inversions:
$$ \frac{n(n-1)}{2} $$
A direct solution compares every pair and runs in:
$$ O(n^2) $$
Can we count inversions faster?
Solution
Use a merge-sort style divide-and-conquer algorithm.
The key observation is that inversions fall into three categories:
- Inversions entirely inside the left half
- Inversions entirely inside the right half
- Cross inversions, where the larger element is in the left half and the smaller element is in the right half
The first two categories are counted recursively.
The third category is counted during the merge step.
This gives the same recurrence as merge sort:
$$ T(n)=2T(n/2)+O(n) $$
so the running time is:
$$ O(n\log n) $$
Brute Force Baseline
The direct method is simple:
func CountInversionsBrute(nums []int) int64 {
var count int64
for i := 0; i < len(nums); i++ {
for j := i + 1; j < len(nums); j++ {
if nums[i] > nums[j] {
count++
}
}
}
return count
}
This is useful for testing, but not for large inputs.
With (n = 1,000,000), the number of pairs is roughly:
$$ 5 \times 10^{11} $$
That is too much work for ordinary use.
Divide Step
Split the array:
[2, 4, 1, 3, 5]
into:
left = [2, 4]
right = [1, 3, 5]
Recursively count inversions in each half.
Then count inversions crossing the boundary.
The crucial condition for a cross inversion is:
left element > right element
Because both halves are sorted before merging, we can count many inversions at once.
Counting During Merge
Suppose the recursive calls have sorted the halves:
left = [2, 4]
right = [1, 3, 5]
Now merge them.
Compare:
2 and 1
Since:
2 > 1
the right element 1 forms an inversion with every remaining element in the left half:
2
4
That contributes:
2 inversions
Then compare:
2 and 3
Since:
2 <= 3
there is no inversion. Take 2.
Then compare:
4 and 3
Since:
4 > 3
the right element 3 forms an inversion with every remaining element in the left half:
4
That contributes:
1 inversion
Total cross inversions:
3
Why This Works
During merge, both halves are sorted.
If:
left[i] > right[j]
then every element from:
left[i]
left[i+1]
...
left[end]
is also greater than right[j].
So instead of counting one inversion, we count:
len(left) - i
inversions immediately.
This is the entire algorithmic advantage.
Implementation
func CountInversions(nums []int) int64 {
if len(nums) <= 1 {
return 0
}
work := append([]int(nil), nums...)
buf := make([]int, len(nums))
return countSort(work, buf, 0, len(nums))
}
func countSort(a, buf []int, lo, hi int) int64 {
if hi-lo <= 1 {
return 0
}
mid := lo + (hi-lo)/2
count := countSort(a, buf, lo, mid)
count += countSort(a, buf, mid, hi)
count += mergeAndCount(a, buf, lo, mid, hi)
return count
}
func mergeAndCount(a, buf []int, lo, mid, hi int) int64 {
i := lo
j := mid
k := lo
var count int64
for i < mid && j < hi {
if a[i] <= a[j] {
buf[k] = a[i]
i++
} else {
buf[k] = a[j]
j++
count += int64(mid - i)
}
k++
}
for i < mid {
buf[k] = a[i]
i++
k++
}
for j < hi {
buf[k] = a[j]
j++
k++
}
copy(a[lo:hi], buf[lo:hi])
return count
}
This implementation preserves the input by copying it first.
If mutation is acceptable, remove the copy and sort the input in place.
Example Trace
Input:
[2, 4, 1, 3, 5]
Recursive splits:
[2, 4] [1, 3, 5]
[2] [4] [1] [3, 5]
Left half:
[2] and [4]
No inversion.
Right half:
[3] and [5]
No inversion.
Merge:
[1] and [3, 5]
No inversion.
Final merge:
[2, 4] and [1, 3, 5]
Cross inversions:
2 > 1 contributes 2
4 > 3 contributes 1
Total:
3
Handling Duplicates
Use:
if a[i] <= a[j]
not:
if a[i] < a[j]
An inversion requires:
a[i] > a[j]
Equal values are not inversions.
Choosing the left element first when equal also keeps the merge stable.
Example:
[2, 2, 1]
Inversions:
(2, 1)
(2, 1)
Count:
2
The two equal 2 values do not form an inversion with each other.
Complexity Analysis
The algorithm follows merge sort.
Each level performs one full merge pass:
$$ O(n) $$
The number of levels is:
$$ O(\log n) $$
Total time:
$$ O(n\log n) $$
Extra space:
$$ O(n) $$
The inversion count itself may be as large as:
$$ \frac{n(n-1)}{2} $$
so use a 64-bit integer.
Maximum Inversion Count
For a reverse-sorted array:
[5, 4, 3, 2, 1]
every pair is inverted.
The number of inversions is:
$$ \binom{n}{2} = \frac{n(n-1)}{2} $$
For (n=100{,}000), this is:
4,999,950,000
which does not fit in a 32-bit signed integer.
Use:
int64
for the result.
Applications
Inversion counting appears in several domains.
Ranking Distance
If two rankings contain the same items, the number of inversions measures disagreement.
Example:
ranking A: [A, B, C, D]
ranking B: [B, A, D, C]
After mapping one ranking into the order of the other, inversion counting gives a Kendall tau style distance.
Sorting Difficulty
The inversion count measures how much work insertion sort must perform.
Insertion sort runs quickly when the number of inversions is small.
Data Quality
Unexpectedly high inversion counts can reveal timestamps, sequence numbers, or version vectors that are out of order.
Competitive Programming
Many problems reduce to counting pairs:
i < j and a[i] > a[j]
or variants such as:
a[i] > 2*a[j]
The merge-sort pattern often applies.
Variant: Reverse Pairs
A common variant counts pairs:
$$ i < j $$
and:
$$ a[i] > 2a[j] $$
The merge step must count these before merging.
Pseudo-code:
j = mid
for i in left half:
while j < hi and a[i] > 2*a[j]:
j++
count += j - mid
Then perform the normal merge.
The same divide-and-conquer structure remains.
Common Mistakes
Counting Equal Values
Equal values are not inversions.
Use <= during merge to avoid counting duplicates incorrectly.
Using 32-Bit Counts
The count can exceed int32.
Use int64.
Forgetting to Merge
The recursive count is only correct if each half is sorted before returning.
Counting and sorting are coupled.
Mutating Unexpectedly
The algorithm sorts the working array.
Copy the input if callers expect the original order.
Counting Only Adjacent Swaps
An inversion is any out-of-order pair, not only neighboring elements.
Adjacent swaps are related, but the definition is global.
Discussion
Counting inversions is a model divide-and-conquer problem because the recursive split alone is not enough. The value comes from using sorted subproblems to count many cross-boundary pairs in one operation. A single comparison during merge may account for thousands or millions of inversions.
This pattern is broader than inversion counting. Many pair-counting problems can be solved by sorting one dimension and counting across the divide in another. Once the subarrays are ordered, an apparent quadratic comparison problem often becomes a linear merge scan.
The practical lesson is to look for batch counting opportunities. If a condition holds for one element and an ordered suffix, count the whole suffix at once. That is the same structural advantage that makes merge sort useful, repurposed here to compute a global statistic rather than produce only a sorted array.