# SIMD Binary Search

# SIMD Binary Search

SIMD binary search accelerates binary search by evaluating multiple candidate comparisons in parallel. Instead of checking a single midpoint per step, the algorithm probes several positions using vector instructions and narrows the search range based on the combined results.

The structure still follows binary search, but each iteration performs more comparisons per memory access.

## Problem

Given a sorted array $A$ of length $n$ and a target value $x$, find an index $i$ such that

$$
A[i] = x
$$

or return the lower bound index if exact match is not required.

## Idea

At each step, select multiple probe positions:

$$
p_0, p_1, \ldots, p_{w-1}
$$

Load:

$$
A[p_0], A[p_1], \ldots, A[p_{w-1}]
$$

Compare all values against $x$ using SIMD. The result is a mask that indicates which probes satisfy:

$$
A[p_k] < x
$$

This mask determines the new search interval. The algorithm effectively advances by multiple binary decisions in a single iteration.

## Algorithm (conceptual)

```text id="sb01"
simd_binary_search(A, x):
    left = 0
    right = length(A)

    while right - left > threshold:
        probes = choose_multiple_midpoints(left, right)

        values = vector_load(A, probes)
        mask = vector_compare_less(values, x)

        k = highest_index_where(mask is true)

        if k == -1:
            right = probes[0]
        else:
            left = probes[k] + 1

    return scalar_binary_search(A, x, left, right)
```

The final scalar phase resolves the remaining small range.

## Example

Let

$$
A = [5, 10, 15, 20, 25, 30, 35, 40]
$$

and

$$
x = 26
$$

Suppose we probe:

$$
[2, 4, 6, 7] \rightarrow [15, 25, 35, 40]
$$

Compare with $x = 26$:

| index | value | value < x |
| ----- | ----: | --------- |
| 2     |    15 | yes       |
| 4     |    25 | yes       |
| 6     |    35 | no        |
| 7     |    40 | no        |

The last true position is index $4$, so update:

$$
left = 5
$$

Continue until the interval is small.

## Correctness

Each iteration partitions the search space using multiple comparisons that preserve ordering constraints. The largest index satisfying $A[i] < x$ determines the lower boundary of the remaining range.

The invariant is the same as in binary search: all elements before `left` are strictly less than $x$, and all elements at or beyond `right` are greater than or equal to $x$.

The final scalar search ensures exact resolution within the remaining interval.

## Complexity

Let $w$ be the SIMD width.

| measure              | cost                 |
| -------------------- | -------------------- |
| iterations           | $O(\log n / \log w)$ |
| comparisons per step | $w$                  |
| total comparisons    | $O(\log n)$          |

In practice, the number of steps decreases because each iteration eliminates more of the search space.

Space:

$$
O(1)
$$

## When to Use

Use SIMD binary search when:

* arrays are large and sorted
* vector instructions are available
* memory bandwidth and latency dominate cost
* comparisons are simple and uniform
* high throughput search is required

It is common in databases, search engines, and columnar processing systems.

## Implementation

Portable SIMD binary search is difficult in high level languages. The following shows the structure using block probes.

```python id="sb02"
def simd_binary_search_shape(a, x):
    left, right = 0, len(a)

    while right - left > 4:
        step = (right - left) // 5
        probes = [left + step, left + 2*step, left + 3*step, left + 4*step]

        k = -1
        for i, p in enumerate(probes):
            if a[p] < x:
                k = i

        if k == -1:
            right = probes[0]
        else:
            left = probes[k] + 1

    while left < right:
        mid = (left + right) // 2
        if a[mid] < x:
            left = mid + 1
        else:
            right = mid

    return left
```

```go id="sb03"
func SIMDBinarySearchShape(a []int, x int) int {
    left, right := 0, len(a)

    for right-left > 4 {
        step := (right - left) / 5

        probes := []int{
            left + step,
            left + 2*step,
            left + 3*step,
            left + 4*step,
        }

        k := -1
        for i, p := range probes {
            if a[p] < x {
                k = i
            }
        }

        if k == -1 {
            right = probes[0]
        } else {
            left = probes[k] + 1
        }
    }

    for left < right {
        mid := (left + right) / 2
        if a[mid] < x {
            left = mid + 1
        } else {
            right = mid
        }
    }

    return left
}
```

