6.4 Merge Sort

Divide the input into halves, recursively sort each half, then merge them — combining local order into global order in O(n log n) time.

Merge sort is the standard example of divide and conquer sorting. It divides the input into two smaller arrays, sorts each half, and then merges the two sorted halves into one sorted output. The central operation is not the division step. It is the merge step, because merging is where the algorithm combines local order into global order.

Problem

You have an array A[0..n-1]. You want a sorting algorithm with guaranteed O(n log n) time, stable behavior, and predictable performance regardless of input order.

Solution

Split the array into two halves, recursively sort both halves, and merge the sorted halves.

merge_sort(A):
  if length(A) <= 1:
    return A

  mid = floor(length(A) / 2)
  left = merge_sort(A[0..mid-1])
  right = merge_sort(A[mid..length(A)-1])

  return merge(left, right)

The merge procedure scans both sorted inputs from left to right and repeatedly takes the smaller front element.

merge(left, right):
  result = empty array
  i = 0
  j = 0

  while i < length(left) and j < length(right):
    if left[i] <= right[j]:
      append left[i] to result
      i = i + 1
    else:
      append right[j] to result
      j = j + 1

  append all remaining elements of left[i..] to result
  append all remaining elements of right[j..] to result

  return result

The comparison uses left[i] <= right[j], not left[i] < right[j]. This is the detail that makes the usual implementation stable. When two keys are equal, the element from the left half is emitted first, so equal elements keep the order they had in the original input.
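The pseudocode above translates almost line for line into Python. This is a minimal sketch of the top-down algorithm; the function names `merge` and `merge_sort` mirror the pseudocode.

```python
def merge(left, right):
    """Merge two sorted lists into one sorted list."""
    result = []
    i = j = 0
    while i < len(left) and j < len(right):
        # <= keeps the left element on ties, which preserves stability
        if left[i] <= right[j]:
            result.append(left[i])
            i += 1
        else:
            result.append(right[j])
            j += 1
    # At most one of these extends result with leftover elements
    result.extend(left[i:])
    result.extend(right[j:])
    return result


def merge_sort(a):
    """Return a sorted copy of a using top-down merge sort."""
    if len(a) <= 1:
        return list(a)
    mid = len(a) // 2
    return merge(merge_sort(a[:mid]), merge_sort(a[mid:]))
```

For example, `merge_sort([5, 2, 4, 1, 3])` returns `[1, 2, 3, 4, 5]`. Note that this version copies slices at every call for readability; the Implementation Notes below discuss a buffer-reusing variant.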

Example

Start with:

[5, 2, 4, 1, 3]

Split into:

[5, 2] and [4, 1, 3]

Sort each half:

[2, 5] and [1, 3, 4]

Now merge. Compare the front elements 2 and 1. Emit 1:

[1]

Compare 2 and 3. Emit 2:

[1, 2]

Compare 5 and 3. Emit 3:

[1, 2, 3]

Compare 5 and 4. Emit 4:

[1, 2, 3, 4]

Append the remaining 5:

[1, 2, 3, 4, 5]

The merge step never searches backward. It uses the fact that each input half is already sorted.

Merge Invariant

During merge(left, right), after each append to result, the following invariant holds:

The array result is sorted and contains exactly the elements left[0..i-1] and right[0..j-1]; these are the i + j smallest elements of the combined input, in sorted order.

More concretely, the unmerged candidates are always left[i] and right[j]. Since both input arrays are sorted, no later element in left can be smaller than left[i], and no later element in right can be smaller than right[j]. Therefore the smaller of these two front elements is the smallest remaining element overall.

This invariant proves the correctness of the merge procedure. At termination, one input has no remaining elements. The rest of the other input can be appended directly because it is already sorted and every previously emitted element is less than or equal to its first remaining element.
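The invariant can be checked mechanically. The sketch below is a merge variant with assertions after every append; `merge_checked` is a name introduced here for illustration, not part of the standard algorithm.

```python
def merge_checked(left, right):
    """Merge two sorted lists, asserting the invariant after each append."""
    result = []
    i = j = 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            result.append(left[i])
            i += 1
        else:
            result.append(right[j])
            j += 1
        # Invariant: result is exactly left[0..i-1] and right[0..j-1],
        # in sorted order ...
        assert result == sorted(left[:i] + right[:j])
        # ... and every emitted element is <= every remaining element
        assert all(x <= y for x in result for y in left[i:] + right[j:])
    result.extend(left[i:])
    result.extend(right[j:])
    return result
```

Running `merge_checked([2, 5], [1, 3, 4])` reproduces the worked example above without tripping any assertion.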

Sorting Invariant

Merge sort relies on a recursive correctness statement:

For every array of length at most n, merge_sort returns a sorted permutation of that array.

The base case is length 0 or 1. Such an array is already sorted, and it is trivially a permutation of itself.

For the recursive case, assume the statement holds for arrays shorter than n. The algorithm splits the input into two smaller arrays. By the induction hypothesis, each recursive call returns a sorted permutation of its half. The merge procedure then combines those two sorted permutations into a sorted permutation of the whole input.

Therefore merge sort is correct for arrays of length n.

Complexity

At each recursion level, every element participates in one merge. The total work per level is linear:

O(n)

The number of levels is logarithmic because each split roughly halves the array:

O(log n)

So the total running time is:

O(n log n)

This bound holds in the best case, average case, and worst case. Merge sort does not become quadratic on already sorted, reverse sorted, or adversarial inputs.

The usual array implementation needs extra memory for merging:

O(n)

There are in-place merge sort variants, but they are more complex and often slower in practice. The ordinary version is preferred when stability and predictable performance matter more than constant auxiliary memory.

Stability

Merge sort is stable if the merge step chooses from the left side when keys are equal.

Consider records sorted by key:

[(2, A), (1, B), (2, C)]

During splitting, (2, A) either lands in an earlier half than (2, C) or precedes it within the same half. During merging, equal keys from the left side are emitted first. The final result is:

[(1, B), (2, A), (2, C)]

The relative order of (2, A) and (2, C) is preserved.

If the merge step used right[j] first on equal keys, the algorithm could reverse equal elements across halves and lose stability.
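The stability argument can be demonstrated directly on the record example. The sketch below sorts pairs by their first component; `merge_sort_by_key` and its `key` parameter are illustrative names, not standard API.

```python
def merge_sort_by_key(records, key):
    """Stable merge sort of records, comparing key(record) values."""
    if len(records) <= 1:
        return list(records)
    mid = len(records) // 2
    left = merge_sort_by_key(records[:mid], key)
    right = merge_sort_by_key(records[mid:], key)
    result = []
    i = j = 0
    while i < len(left) and j < len(right):
        # <= : the left record wins ties, so equal keys keep input order
        if key(left[i]) <= key(right[j]):
            result.append(left[i])
            i += 1
        else:
            result.append(right[j])
            j += 1
    result.extend(left[i:])
    result.extend(right[j:])
    return result


records = [(2, "A"), (1, "B"), (2, "C")]
merge_sort_by_key(records, key=lambda r: r[0])
# -> [(1, "B"), (2, "A"), (2, "C")]
```

Changing `<=` to `<` in the comparison would still produce sorted keys, but (2, "C") could be emitted before (2, "A") once they meet in a merge.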

Bottom-Up Merge Sort

Merge sort can also be written iteratively. Instead of recursive splitting, sort runs of length 1, then merge adjacent runs into length 2, then length 4, then length 8, and so on.

bottom_up_merge_sort(A):
  n = length(A)
  width = 1
  buffer = new array of length n

  while width < n:
    for lo = 0 to n - 1 step 2 * width:
      mid = min(lo + width, n)
      hi = min(lo + 2 * width, n)
      merge A[lo..mid-1] and A[mid..hi-1] into buffer[lo..hi-1]

    copy buffer back into A
    width = 2 * width

The bottom-up version has the same asymptotic complexity as the recursive version. It avoids recursion overhead and gives direct control over memory buffers. It is often a good form for production code because the data movement pattern is explicit.
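A runnable sketch of the bottom-up version, following the pseudocode above with one reusable buffer (the in-segment merge is written out inline here rather than calling a separate helper):

```python
def bottom_up_merge_sort(a):
    """Sort list a in place with iterative merge sort; returns a."""
    n = len(a)
    buf = [None] * n
    width = 1
    while width < n:
        # Merge adjacent runs a[lo:mid) and a[mid:hi) into buf[lo:hi)
        for lo in range(0, n, 2 * width):
            mid = min(lo + width, n)
            hi = min(lo + 2 * width, n)
            i, j, k = lo, mid, lo
            while i < mid and j < hi:
                if a[i] <= a[j]:
                    buf[k] = a[i]
                    i += 1
                else:
                    buf[k] = a[j]
                    j += 1
                k += 1
            # Exactly one side may have leftovers
            buf[k:hi] = a[i:mid] if i < mid else a[j:hi]
        a[:] = buf  # copy the merged pass back into A
        width *= 2
    return a
```

The explicit `buf` and the pass-by-pass copy make the data movement visible: each iteration of the outer loop touches every element exactly once, which is the linear work per level from the complexity argument.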

Implementation Notes

Merge sort is a strong default when the following conditions matter:

stable output
guaranteed O(n log n) time
predictable behavior on adversarial input
sequential access pattern

It is especially useful for linked lists, because merging linked lists can be done by pointer rewiring rather than by copying array elements. For arrays, the extra memory cost is the main tradeoff.

When implementing merge sort for arrays, allocate the auxiliary buffer once and reuse it. Allocating a new temporary array at every recursive call adds unnecessary overhead.

A practical recursive implementation often has this shape:

merge_sort_range(A, buffer, lo, hi):
  if hi - lo <= 1:
    return

  mid = lo + floor((hi - lo) / 2)

  merge_sort_range(A, buffer, lo, mid)
  merge_sort_range(A, buffer, mid, hi)

  merge_ranges(A, buffer, lo, mid, hi)

The half-open interval convention [lo, hi) avoids many boundary errors. The length of the range is always hi - lo, and the midpoint belongs to the right half.
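A Python sketch of this shape, with the buffer allocated once in a small wrapper (`merge_sort_inplace` is an illustrative name for the entry point):

```python
def merge_sort_range(a, buf, lo, hi):
    """Sort a[lo:hi) in place, using buf[lo:hi) as scratch space."""
    if hi - lo <= 1:
        return
    mid = lo + (hi - lo) // 2
    merge_sort_range(a, buf, lo, mid)
    merge_sort_range(a, buf, mid, hi)
    # Merge a[lo:mid) and a[mid:hi) into buf[lo:hi), then copy back
    i, j, k = lo, mid, lo
    while i < mid and j < hi:
        if a[i] <= a[j]:
            buf[k] = a[i]
            i += 1
        else:
            buf[k] = a[j]
            j += 1
        k += 1
    while i < mid:
        buf[k] = a[i]
        i += 1
        k += 1
    while j < hi:
        buf[k] = a[j]
        j += 1
        k += 1
    a[lo:hi] = buf[lo:hi]


def merge_sort_inplace(a):
    """Entry point: allocate the auxiliary buffer once and reuse it."""
    merge_sort_range(a, [None] * len(a), 0, len(a))
    return a
```

Because every call receives the same `buf`, no recursive call allocates memory, and the half-open `[lo, hi)` convention keeps the length and midpoint arithmetic uniform at every level.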

Common Bugs

A common bug is using closed intervals inconsistently. If one helper expects [lo, hi] and another expects [lo, hi), the implementation will either skip an element or read past the end.

Another common bug is copying merged data into the buffer but forgetting to copy it back into the source array. Some implementations alternate the roles of source and destination arrays at each recursion level, but that design must be applied consistently.

A stability bug appears when the merge comparison uses < instead of <=. The output may still be sorted, but equal records can change order.

A performance bug appears when each recursive call allocates its own temporary array. The algorithm remains O(n log n) in comparison count, but memory allocation overhead can dominate real runtime.