# 6.20 Inversion Counting

# 6.20 Inversion Counting

An inversion is a pair of positions that appears in the opposite order from the final sorted order. Inversion counting measures how far an array is from being sorted. It is useful for ranking distance, sorting analysis, permutation comparison, and adaptive algorithm design.

## Problem

You have an array `A[0..n-1]`. You want to count the number of pairs `(i, j)` such that:

```text id="so4efu"
i < j and A[i] > A[j]
```

A sorted array has zero inversions. A reverse sorted array has the maximum number of inversions:

```text id="5xxml9"
n(n - 1) / 2
```

## Example

For:

```text id="xuucq6"
A = [2, 4, 1, 3]
```

The inversions are:

```text id="lopqyj"
(2, 1)
(4, 1)
(4, 3)
```

So the inversion count is:

```text id="ezkrj5"
3
```

## Brute Force Baseline

The direct method checks every pair.

```text id="d9k092"
count_inversions_bruteforce(A):
  count = 0

  for i = 0 to n - 1:
    for j = i + 1 to n - 1:
      if A[i] > A[j]:
        count = count + 1

  return count
```

This is easy to verify but slow.

```text id="trgg0s"
time:  O(n^2)
space: O(1)
```

It is useful as a test oracle for small arrays.

## Merge Sort Method

Inversions can be counted during merge sort. The key observation is that when merging two sorted halves, if an element from the right half is smaller than an element from the left half, then it forms inversions with all remaining elements in the left half.

```text id="k64ijz"
count_inversions(A):
  buffer = new array of length n
  return sort_count(A, buffer, 0, n)
```

```text id="xaq13p"
sort_count(A, buffer, lo, hi):
  if hi - lo <= 1:
    return 0

  mid = lo + (hi - lo) / 2

  left_count = sort_count(A, buffer, lo, mid)
  right_count = sort_count(A, buffer, mid, hi)
  split_count = merge_count(A, buffer, lo, mid, hi)

  return left_count + right_count + split_count
```

```text id="jf1bwy"
merge_count(A, buffer, lo, mid, hi):
  i = lo
  j = mid
  k = lo
  count = 0

  while i < mid and j < hi:
    if A[i] <= A[j]:
      buffer[k] = A[i]
      i = i + 1
    else:
      buffer[k] = A[j]
      count = count + (mid - i)
      j = j + 1

    k = k + 1

  copy remaining A[i..mid-1] into buffer
  copy remaining A[j..hi-1] into buffer
  copy buffer[lo..hi-1] back into A[lo..hi-1]

  return count
```

## Why the Merge Count Works

During merge, both halves are already sorted. Suppose:

```text id="mhsmcs"
A[i] > A[j]
```

where `i` is in the left half and `j` is in the right half. Since the left half is sorted, every element from `A[i]` through `A[mid-1]` is also greater than `A[j]`.

Therefore `A[j]` creates:

```text id="0qpeso"
mid - i
```

split inversions at once.

This avoids checking each pair separately.

## Correctness

The recursive calls count inversions entirely inside the left half and entirely inside the right half.

The merge step counts split inversions, where the first element lies in the left half and the second lies in the right half.

Every inversion belongs to exactly one of these three categories:

```text id="jblaea"
inside the left half
inside the right half
across the split
```

The categories are disjoint, so their counts can be added.

The merge step also sorts the range, which is required by the parent call. Thus the algorithm simultaneously maintains sorted ranges and accumulates inversion counts.

## Complexity

The recurrence is the same as merge sort:

```text id="f6fqy7"
T(n) = 2T(n/2) + O(n)
```

So the running time is:

```text id="xw5v96"
O(n log n)
```

The auxiliary space is:

```text id="g892f8"
O(n)
```

for the merge buffer.

## Fenwick Tree Method

Another approach uses coordinate compression and a Fenwick tree.

Scan from right to left. For each value, count how many smaller values have already appeared to its right.

```text id="rlpum3"
count_inversions_fenwick(A):
  C = coordinate_compress(A)
  tree = Fenwick(size = number of distinct values)
  count = 0

  for i = n - 1 down to 0:
    count = count + tree.sum(C[i] - 1)
    tree.add(C[i], 1)

  return count
```

Here `tree.sum(C[i] - 1)` returns the number of elements to the right that are smaller than `A[i]`.

## Handling Duplicates

Duplicates are not inversions when the definition uses `A[i] > A[j]`.

This is why the merge method uses:

```text id="6c70se"
if A[i] <= A[j]
```

When values are equal, the left element is emitted first and no inversion is counted.

In the Fenwick method, query only ranks strictly smaller than `C[i]`:

```text id="geoq02"
tree.sum(C[i] - 1)
```

Querying `tree.sum(C[i])` would incorrectly count equal values as inversions.

## Integer Size

The inversion count can be large. The maximum count is:

```text id="osf4ne"
n(n - 1) / 2
```

For `n = 100000`, this is:

```text id="8zv5vv"
4999950000
```

This exceeds a 32-bit signed integer. Use a 64-bit integer or larger.

## Implementation Notes

Use the merge sort method when you already need a sorted array or want a simple deterministic algorithm.

Use the Fenwick tree method when you need online-style counting over compressed values, or when the operation is part of a larger frequency-query algorithm.

Keep a brute force implementation for tests. Generate small random arrays and compare the optimized method against brute force.

## Common Bugs

A common bug is counting equal values as inversions. Use strict comparison.

Another bug is forgetting that merge-based counting mutates the array. Copy the input first if the original order must be preserved.

A third bug is storing the count in a 32-bit integer.

A fourth bug is using uncompressed large values as Fenwick indices.

## Takeaway

Inversion counting is a sorting-adjacent problem. Merge sort counts inversions by detecting cross-order violations during merging, while Fenwick trees count smaller elements to the right using compressed ranks.

