# Cache Oblivious Merge Sort

# Cache Oblivious Merge Sort

Cache oblivious merge sort is a merge sort variant designed for memory hierarchies. It improves locality without knowing the cache size, block size, or number of cache levels.

The algorithm uses recursive structure to make subproblems eventually fit into cache automatically.

## Problem

Given an array $A$ of length $n$, reorder it such that:

$$
A[0] \le A[1] \le \dots \le A[n-1]
$$

while reducing cache misses across unknown memory hierarchies.

## Idea

A cache aware algorithm uses parameters such as block size or cache size. A cache oblivious algorithm avoids such parameters.

Cache oblivious merge sort relies on divide and conquer:

* small recursive subproblems eventually fit in cache
* recursive merging improves locality
* no machine specific threshold is required for correctness

The key challenge is merging. A cache oblivious merge divides the merge problem recursively instead of scanning large regions in one flat pass.

## Algorithm

```text id="m7k2xp"
cache_oblivious_merge_sort(A, lo, hi):
    if lo >= hi:
        return

    mid = (lo + hi) // 2

    cache_oblivious_merge_sort(A, lo, mid)
    cache_oblivious_merge_sort(A, mid + 1, hi)

    cache_oblivious_merge(A, lo, mid, hi)
```

A recursive merge splits one sorted range and binary searches into the other.

```text id="q9v3tn"
cache_oblivious_merge(A, lo, mid, hi):
    if range is small:
        ordinary_merge(A, lo, mid, hi)
        return

    choose middle element from larger half
    find its rank in the other half by binary search

    recursively merge lower parts
    recursively merge upper parts
```

## Example

Let:

$$
A = [1, 4, 7, 10, 2, 3, 8, 9]
$$

The two halves are sorted:

$$
[1,4,7,10] \quad [2,3,8,9]
$$

A recursive merge may choose $7$ from the left half and locate its position in the right half:

$$
[2,3] < 7 < [8,9]
$$

This splits the merge into smaller independent merge problems.

Final result:

$$
[1,2,3,4,7,8,9,10]
$$

## Correctness

The recursive sort step produces two sorted halves. The recursive merge partitions the two halves by rank: all elements in the lower subproblem are less than or equal to all elements in the upper subproblem. Recursively merging each part and concatenating them yields a sorted range. By induction on range size, the full array is sorted.

## Complexity

| metric            | value                    |
| ----------------- | ------------------------ |
| time              | $O(n \log n)$            |
| extra space       | usually $O(n)$           |
| cache behavior    | asymptotically efficient |
| tuning parameters | none                     |

In the cache oblivious model, good implementations can achieve near optimal cache complexity for sorting.

## Properties

| property                  | value                  |
| ------------------------- | ---------------------- |
| stable                    | yes, with stable merge |
| in-place                  | usually no             |
| cache aware               | no                     |
| cache oblivious           | yes                    |
| implementation difficulty | high                   |

## Notes

Cache oblivious merge sort matters when the same algorithm must perform well across several cache levels without manual tuning.

It is more complex than ordinary merge sort. In practical code, tuned cache aware sorts or library sorts often perform better due to lower constants.

## Implementation

This implementation keeps the interface simple and uses an ordinary merge at the leaves. It illustrates the sort structure, not a fully optimized cache oblivious merge.

```python id="x4k8qp"
def cache_oblivious_merge_sort(a):
    def sort(lo, hi):
        if lo >= hi:
            return

        mid = (lo + hi) // 2
        sort(lo, mid)
        sort(mid + 1, hi)
        merge(lo, mid, hi)

    def merge(lo, mid, hi):
        left = a[lo:mid + 1]
        right = a[mid + 1:hi + 1]

        i = j = 0
        k = lo

        while i < len(left) and j < len(right):
            if left[i] <= right[j]:
                a[k] = left[i]
                i += 1
            else:
                a[k] = right[j]
                j += 1
            k += 1

        while i < len(left):
            a[k] = left[i]
            i += 1
            k += 1

        while j < len(right):
            a[k] = right[j]
            j += 1
            k += 1

    sort(0, len(a) - 1)
    return a
```

```go id="p8v2qx"
func CacheObliviousMergeSort(a []int) {
	var sort func(int, int)
	sort = func(lo, hi int) {
		if lo >= hi {
			return
		}

		mid := (lo + hi) / 2
		sort(lo, mid)
		sort(mid+1, hi)
		cacheObliviousMerge(a, lo, mid, hi)
	}

	sort(0, len(a)-1)
}

func cacheObliviousMerge(a []int, lo, mid, hi int) {
	left := append([]int(nil), a[lo:mid+1]...)
	right := append([]int(nil), a[mid+1:hi+1]...)

	i, j, k := 0, 0, lo

	for i < len(left) && j < len(right) {
		if left[i] <= right[j] {
			a[k] = left[i]
			i++
		} else {
			a[k] = right[j]
			j++
		}
		k++
	}

	for i < len(left) {
		a[k] = left[i]
		i++
		k++
	}

	for j < len(right) {
		a[k] = right[j]
		j++
		k++
	}
}
```

