# Merge Path Search

# Merge Path Search

Merge path search finds partition points in two sorted arrays so that a merge can be split into independent chunks. It is mainly used for parallel merge, GPU merge, and distributed sorted data processing.

The method views merging as a path through a grid. Each step consumes one element from either the first array or the second array. A diagonal in this grid represents a fixed number of merged output elements.

## Problem

Given two sorted arrays $A$ and $B$, find indices $i$ and $j$ such that:

$$
i + j = k
$$

and the first $k$ elements of the merged output come from:

$$
A[0:i] \quad \text{and} \quad B[0:j]
$$

The pair $(i, j)$ is the merge partition for output position $k$.

## Idea

For a fixed output rank $k$, we need a split:

$$
j = k - i
$$

The split is valid when:

$$
A[i - 1] \le B[j]
$$

and

$$
B[j - 1] \le A[i]
$$

Boundary cases treat missing values as negative or positive infinity.

Because increasing $i$ decreases $j$, the validity condition is monotone. We can binary search for the correct $i$.

## Algorithm

```text id="mps01"
merge_path_search(A, B, k):
    low = max(0, k - length(B))
    high = min(k, length(A))

    while low <= high:
        i = (low + high) // 2
        j = k - i

        if i > 0 and j < length(B) and A[i - 1] > B[j]:
            high = i - 1
        elif j > 0 and i < length(A) and B[j - 1] > A[i]:
            low = i + 1
        else:
            return (i, j)
```

## Example

Let:

$$
A = [2, 6, 9, 13]
$$

$$
B = [1, 4, 8, 10, 15]
$$

Find the partition for:

$$
k = 5
$$

The first five merged elements are:

$$
[1, 2, 4, 6, 8]
$$

So the split is:

$$
i = 2,\quad j = 3
$$

because:

$$
A[0:2] = [2, 6]
$$

and:

$$
B[0:3] = [1, 4, 8]
$$

Check:

| condition           | value      |
| ------------------- | ---------- |
| $A[i - 1] \le B[j]$ | $6 \le 10$ |
| $B[j - 1] \le A[i]$ | $8 \le 9$  |

Both hold, so $(2, 3)$ is valid.

## Correctness

The algorithm searches over possible values of $i$. For each candidate, $j$ is fixed by $j = k - i$.

If $A[i - 1] > B[j]$, then too many elements were taken from $A$, so `high` moves left. If $B[j - 1] > A[i]$, then too few elements were taken from $A`, so `low` moves right.

When neither condition holds, all elements before the partition are less than or equal to all elements after the partition. Therefore the split gives exactly the first $k$ merged elements.

## Complexity

| operation            | time            |   |   |   |     |
| -------------------- | --------------- | - | - | - | --- |
| one partition search | $O(\log \min(   | A | , | B | ))$ |
| $p$ partitions       | $O(p \log \min( | A | , | B | ))$ |

Space:

$$
O(1)
$$

For parallel merge, each worker gets an output range $[k_1, k_2)$ and computes two merge path searches to find its input ranges.

## When to Use

Use merge path search when:

* two sorted arrays must be merged
* merge work should be split evenly
* parallel workers need independent ranges
* GPU threads need balanced merge partitions
* output positions matter directly

It is a core primitive for parallel merge sort and high throughput sorted joins.

## Implementation

```python id="mps02"
def merge_path_search(a, b, k):
    low = max(0, k - len(b))
    high = min(k, len(a))

    while low <= high:
        i = (low + high) // 2
        j = k - i

        if i > 0 and j < len(b) and a[i - 1] > b[j]:
            high = i - 1
        elif j > 0 and i < len(a) and b[j - 1] > a[i]:
            low = i + 1
        else:
            return i, j

    return low, k - low
```

```go id="mps03"
func MergePathSearch(a []int, b []int, k int) (int, int) {
    low := 0
    if k-len(b) > low {
        low = k - len(b)
    }

    high := k
    if len(a) < high {
        high = len(a)
    }

    for low <= high {
        i := (low + high) / 2
        j := k - i

        if i > 0 && j < len(b) && a[i-1] > b[j] {
            high = i - 1
        } else if j > 0 && i < len(a) && b[j-1] > a[i] {
            low = i + 1
        } else {
            return i, j
        }
    }

    return low, k - low
}
```

