# Block Merge Sort

# Block Merge Sort

Block merge sort is a stable merge sort variant designed to reduce auxiliary memory while keeping good practical performance. Instead of copying the whole array into a second array, it divides sorted ranges into blocks and rearranges those blocks before finishing with local merges.

The algorithm is more complex than ordinary merge sort, but it gives an important tradeoff: stable sorting with less extra memory.

## Problem

Given an array $A$ of length $n$, reorder it so that:

$$
A[0] \le A[1] \le \dots \le A[n-1]
$$

while preserving the relative order of equal elements and using less than $O(n)$ auxiliary space.

## Idea

Standard merge sort merges two sorted ranges by using a temporary array. Block merge sort reduces that storage by working with blocks.

A typical design uses:

| component       | purpose                                                |
| --------------- | ------------------------------------------------------ |
| blocks          | divide sorted ranges into manageable chunks            |
| internal buffer | store a small number of elements from the array itself |
| block selection | place blocks in approximate sorted order               |
| local merge     | finish ordering inside block boundaries                |

The exact implementation varies, but the common principle is to avoid a full size auxiliary array.

## Algorithm

```text id="nkq2z7"
block_merge_sort(A):
    sort small runs directly

    size = initial_run_size

    while size < length(A):
        for each adjacent pair of sorted ranges:
            block_merge(A, left, mid, right)

        size = size * 2
```

The merge procedure is the main part.

```text id="5v1rai"
block_merge(A, left, mid, right):
    choose block size B

    split A[left..mid] and A[mid+1..right] into blocks

    use a small buffer to move blocks into approximate order

    merge elements locally near block boundaries

    restore the buffer region
```

## Example

Let two sorted ranges be:

$$
[1, 4, 7, 10] \quad [2, 3, 8, 9]
$$

Using block size $2$:

| block | values  |
| ----- | ------- |
| L1    | [1, 4]  |
| L2    | [7, 10] |
| R1    | [2, 3]  |
| R2    | [8, 9]  |

The blocks are rearranged approximately by their first elements:

$$
[1,4], [2,3], [7,10], [8,9]
$$

Then local merging inside neighboring block regions produces:

$$
[1, 2, 3, 4, 7, 8, 9, 10]
$$

## Correctness

Block merge sort keeps the same high level invariant as merge sort: before every merge, the two input ranges are sorted. The block rearrangement step moves groups of elements into an order that narrows where each element can belong. The local merge step then restores exact element order.

Because the final merge compares actual elements and uses stable tie handling, each merged range is sorted and stable. By induction over merge levels, the full array becomes sorted.

## Complexity

| metric           | typical value              |
| ---------------- | -------------------------- |
| time             | $O(n \log n)$              |
| extra space      | often $O(\sqrt n)$ or less |
| stable           | yes                        |
| in-place variant | possible, but complex      |

Some theoretical versions achieve stable in-place merging with $O(1)$ extra words, but the implementations are intricate and have larger constants.

## Properties

| property                  | value                                   |
| ------------------------- | --------------------------------------- |
| stable                    | yes                                     |
| comparison based          | yes                                     |
| adaptive                  | sometimes                               |
| implementation difficulty | high                                    |
| practical use             | specialized stable sort implementations |

## Notes

Block merge sort is useful when stability matters and memory must be lower than a full auxiliary array. In ordinary application code, standard merge sort or Timsort is often preferred because they are simpler and faster on typical data.

Block merge sort mainly matters as a systems and library sorting technique.

## Implementation

This simplified implementation demonstrates block style merging with a fixed buffer. It keeps the main idea visible, but it is not a fully optimized production block merge sort.

```python id="ekp0sw"
def block_merge_sort(a):
    n = len(a)
    if n <= 1:
        return a

    temp = []

    def merge_with_buffer(l, mid, r):
        temp.clear()

        i = l
        j = mid + 1

        while i <= mid and j <= r:
            if a[i] <= a[j]:
                temp.append(a[i])
                i += 1
            else:
                temp.append(a[j])
                j += 1

        while i <= mid:
            temp.append(a[i])
            i += 1

        while j <= r:
            temp.append(a[j])
            j += 1

        for k, value in enumerate(temp):
            a[l + k] = value

    size = 1
    while size < n:
        for l in range(0, n, 2 * size):
            mid = min(l + size - 1, n - 1)
            r = min(l + 2 * size - 1, n - 1)
            if mid < r:
                merge_with_buffer(l, mid, r)
        size *= 2

    return a
```

```go id="f9p2vx"
func BlockMergeSort(a []int) {
	n := len(a)
	if n <= 1 {
		return
	}

	temp := make([]int, 0, n)

	mergeWithBuffer := func(l, mid, r int) {
		temp = temp[:0]

		i, j := l, mid+1

		for i <= mid && j <= r {
			if a[i] <= a[j] {
				temp = append(temp, a[i])
				i++
			} else {
				temp = append(temp, a[j])
				j++
			}
		}

		for i <= mid {
			temp = append(temp, a[i])
			i++
		}

		for j <= r {
			temp = append(temp, a[j])
			j++
		}

		for k, v := range temp {
			a[l+k] = v
		}
	}

	for size := 1; size < n; size *= 2 {
		for l := 0; l < n; l += 2 * size {
			mid := min(l+size-1, n-1)
			r := min(l+2*size-1, n-1)

			if mid < r {
				mergeWithBuffer(l, mid, r)
			}
		}
	}
}
```

