Skip to content

Natural Merge Sort

Adaptive merge sort that detects existing sorted runs and merges them.

Natural merge sort improves merge sort by exploiting existing order in the input. Instead of splitting blindly, it scans the array to find already sorted runs, then merges those runs.

This approach adapts to partially sorted data and can reduce the number of passes.

Problem

Given an array AA of length nn, reorder it such that:

A[0]A[1]A[n1] A[0] \le A[1] \le \dots \le A[n-1]

Idea

A run is a maximal nondecreasing subsequence:

A[i]A[i+1]A[j] A[i] \le A[i+1] \le \dots \le A[j]

Natural merge sort first identifies such runs, then merges adjacent runs until only one remains.

Algorithm

  1. Scan the array and partition it into runs
  2. Merge adjacent runs pairwise
  3. Repeat until one run remains
natural_merge_sort(A):
    runs = []

    i = 0
    while i < n:
        j = i
        while j + 1 < n and A[j] <= A[j + 1]:
            j += 1
        runs.append((i, j))
        i = j + 1

    while length(runs) > 1:
        new_runs = []
        for k from 0 to length(runs) step 2:
            if k + 1 < length(runs):
                merged = merge(A, runs[k], runs[k+1])
                new_runs.append(merged)
            else:
                new_runs.append(runs[k])
        runs = new_runs

Example

Let:

A=[1,3,5,2,4,6] A = [1, 3, 5, 2, 4, 6]

Detected runs:

  • [1, 3, 5]
  • [2, 4, 6]

Merge:

  • [1, 3, 5] + [2, 4, 6] → [1, 2, 3, 4, 5, 6]

Only one pass is needed.

Correctness

Each run is already sorted. Merging two sorted runs produces a sorted result. Repeated merging eventually produces a single sorted run covering the entire array.

Complexity

Worst case occurs when the input has no structure:

T(n)=O(nlogn) T(n) = O(n \log n)

Best case occurs when the array is already sorted:

T(n)=O(n) T(n) = O(n)
casetime
bestO(n)O(n)
worstO(nlogn)O(n \log n)

Space:

O(n) O(n)

Properties

propertyvalue
stableyes
adaptiveyes
in-placeno
practical usefoundation of Timsort

Notes

Natural merge sort benefits from real world data, which often contains sorted segments. It reduces unnecessary work compared to fixed splitting strategies.

Modern algorithms such as Timsort extend this idea by using more sophisticated run detection and merging strategies.

Implementation

def natural_merge_sort(a):
    n = len(a)
    temp = [0] * n

    def merge(l, mid, r):
        i, j, k = l, mid + 1, l
        while i <= mid and j <= r:
            if a[i] <= a[j]:
                temp[k] = a[i]
                i += 1
            else:
                temp[k] = a[j]
                j += 1
            k += 1
        while i <= mid:
            temp[k] = a[i]
            i += 1
            k += 1
        while j <= r:
            temp[k] = a[j]
            j += 1
            k += 1
        for i in range(l, r + 1):
            a[i] = temp[i]

    while True:
        runs = []
        i = 0
        while i < n:
            j = i
            while j + 1 < n and a[j] <= a[j + 1]:
                j += 1
            runs.append((i, j))
            i = j + 1

        if len(runs) == 1:
            break

        for k in range(0, len(runs), 2):
            if k + 1 < len(runs):
                l1, r1 = runs[k]
                l2, r2 = runs[k + 1]
                merge(l1, r1, r2)

    return a
func NaturalMergeSort(a []int) {
	n := len(a)
	temp := make([]int, n)

	var merge func(int, int, int)
	merge = func(l, mid, r int) {
		i, j, k := l, mid+1, l

		for i <= mid && j <= r {
			if a[i] <= a[j] {
				temp[k] = a[i]
				i++
			} else {
				temp[k] = a[j]
				j++
			}
			k++
		}

		for i <= mid {
			temp[k] = a[i]
			i++
			k++
		}

		for j <= r {
			temp[k] = a[j]
			j++
			k++
		}

		for i := l; i <= r; i++ {
			a[i] = temp[i]
		}
	}

	for {
		runs := [][2]int{}
		i := 0

		for i < n {
			j := i
			for j+1 < n && a[j] <= a[j+1] {
				j++
			}
			runs = append(runs, [2]int{i, j})
			i = j + 1
		}

		if len(runs) == 1 {
			break
		}

		for k := 0; k < len(runs); k += 2 {
			if k+1 < len(runs) {
				l1, r1 := runs[k][0], runs[k][1]
				l2, r2 := runs[k+1][0], runs[k+1][1]
				merge(l1, r1, r2)
			}
		}
	}
}