Skip to content

Merge Path Search

Find diagonal partition points that split a sorted merge into independent balanced ranges.

Merge path search finds partition points in two sorted arrays so that a merge can be split into independent chunks. It is mainly used for parallel merge, GPU merge, and distributed sorted data processing.

The method views merging as a path through a grid. Each step consumes one element from either the first array or the second array. A diagonal in this grid represents a fixed number of merged output elements.

Problem

Given two sorted arrays AA and BB, find indices ii and jj such that:

i+j=k i + j = k

and the first kk elements of the merged output come from:

A[0:i]andB[0:j] A[0:i] \quad \text{and} \quad B[0:j]

The pair (i,j)(i, j) is the merge partition for output position kk.

Idea

For a fixed output rank kk, we need a split:

j=ki j = k - i

The split is valid when:

A[i1]B[j] A[i - 1] \le B[j]

and

B[j1]A[i] B[j - 1] \le A[i]

Boundary cases treat missing values as negative or positive infinity.

Because increasing ii decreases jj, the validity condition is monotone. We can binary search for the correct ii.

Algorithm

merge_path_search(A, B, k):
    low = max(0, k - length(B))
    high = min(k, length(A))

    while low <= high:
        i = (low + high) // 2
        j = k - i

        if i > 0 and j < length(B) and A[i - 1] > B[j]:
            high = i - 1
        elif j > 0 and i < length(A) and B[j - 1] > A[i]:
            low = i + 1
        else:
            return (i, j)

Example

Let:

A=[2,6,9,13] A = [2, 6, 9, 13] B=[1,4,8,10,15] B = [1, 4, 8, 10, 15]

Find the partition for:

k=5 k = 5

The first five merged elements are:

[1,2,4,6,8] [1, 2, 4, 6, 8]

So the split is:

i=2,j=3 i = 2,\quad j = 3

because:

A[0:2]=[2,6] A[0:2] = [2, 6]

and:

B[0:3]=[1,4,8] B[0:3] = [1, 4, 8]

Check:

conditionvalue
A[i1]B[j]A[i - 1] \le B[j]6106 \le 10
B[j1]A[i]B[j - 1] \le A[i]898 \le 9

Both hold, so (2,3)(2, 3) is valid.

Correctness

The algorithm searches over possible values of ii. For each candidate, jj is fixed by j=kij = k - i.

If A[i1]>B[j]A[i - 1] > B[j], then too many elements were taken from AA, so high moves left. If B[j1]>A[i]B[j - 1] > A[i], then too few elements were taken from $A, so low` moves right.

When neither condition holds, all elements before the partition are less than or equal to all elements after the partition. Therefore the split gives exactly the first kk merged elements.

Complexity

operationtime
one partition search$O(\log \min(A,B))$
pp partitions$O(p \log \min(A,B))$

Space:

O(1) O(1)

For parallel merge, each worker gets an output range [k1,k2)[k_1, k_2) and computes two merge path searches to find its input ranges.

When to Use

Use merge path search when:

  • two sorted arrays must be merged
  • merge work should be split evenly
  • parallel workers need independent ranges
  • GPU threads need balanced merge partitions
  • output positions matter directly

It is a core primitive for parallel merge sort and high throughput sorted joins.

Implementation

def merge_path_search(a, b, k):
    low = max(0, k - len(b))
    high = min(k, len(a))

    while low <= high:
        i = (low + high) // 2
        j = k - i

        if i > 0 and j < len(b) and a[i - 1] > b[j]:
            high = i - 1
        elif j > 0 and i < len(a) and b[j - 1] > a[i]:
            low = i + 1
        else:
            return i, j

    return low, k - low
func MergePathSearch(a []int, b []int, k int) (int, int) {
    low := 0
    if k-len(b) > low {
        low = k - len(b)
    }

    high := k
    if len(a) < high {
        high = len(a)
    }

    for low <= high {
        i := (low + high) / 2
        j := k - i

        if i > 0 && j < len(b) && a[i-1] > b[j] {
            high = i - 1
        } else if j > 0 && i < len(a) && b[j-1] > a[i] {
            low = i + 1
        } else {
            return i, j
        }
    }

    return low, k - low
}