Find diagonal partition points that split a sorted merge into independent balanced ranges.
Merge path search finds partition points in two sorted arrays so that a merge can be split into independent chunks. It is mainly used for parallel merge, GPU merge, and distributed sorted data processing.
The method views merging as a path through a grid. Each step consumes one element from either the first array or the second array. A diagonal in this grid represents a fixed number of merged output elements.
Problem
Given two sorted arrays and , find indices and such that:
and the first elements of the merged output come from:
The pair is the merge partition for output position .
Idea
For a fixed output rank , we need a split:
The split is valid when:
and
Boundary cases treat missing values as negative or positive infinity.
Because increasing decreases , the validity condition is monotone. We can binary search for the correct .
Algorithm
merge_path_search(A, B, k):
low = max(0, k - length(B))
high = min(k, length(A))
while low <= high:
i = (low + high) // 2
j = k - i
if i > 0 and j < length(B) and A[i - 1] > B[j]:
high = i - 1
elif j > 0 and i < length(A) and B[j - 1] > A[i]:
low = i + 1
else:
return (i, j)Example
Let:
Find the partition for:
The first five merged elements are:
So the split is:
because:
and:
Check:
| condition | value |
|---|---|
Both hold, so is valid.
Correctness
The algorithm searches over possible values of . For each candidate, is fixed by .
If , then too many elements were taken from , so high moves left. If , then too few elements were taken from $A, so low` moves right.
When neither condition holds, all elements before the partition are less than or equal to all elements after the partition. Therefore the split gives exactly the first merged elements.
Complexity
| operation | time | ||||
|---|---|---|---|---|---|
| one partition search | $O(\log \min( | A | , | B | ))$ |
| partitions | $O(p \log \min( | A | , | B | ))$ |
Space:
For parallel merge, each worker gets an output range and computes two merge path searches to find its input ranges.
When to Use
Use merge path search when:
- two sorted arrays must be merged
- merge work should be split evenly
- parallel workers need independent ranges
- GPU threads need balanced merge partitions
- output positions matter directly
It is a core primitive for parallel merge sort and high throughput sorted joins.
Implementation
def merge_path_search(a, b, k):
low = max(0, k - len(b))
high = min(k, len(a))
while low <= high:
i = (low + high) // 2
j = k - i
if i > 0 and j < len(b) and a[i - 1] > b[j]:
high = i - 1
elif j > 0 and i < len(a) and b[j - 1] > a[i]:
low = i + 1
else:
return i, j
return low, k - lowfunc MergePathSearch(a []int, b []int, k int) (int, int) {
low := 0
if k-len(b) > low {
low = k - len(b)
}
high := k
if len(a) < high {
high = len(a)
}
for low <= high {
i := (low + high) / 2
j := k - i
if i > 0 && j < len(b) && a[i-1] > b[j] {
high = i - 1
} else if j > 0 && i < len(a) && b[j-1] > a[i] {
low = i + 1
} else {
return i, j
}
}
return low, k - low
}