SIMD Binary Search

SIMD binary search accelerates binary search by evaluating multiple candidate comparisons in parallel. Instead of checking a single midpoint per step, the algorithm probes several positions using vector instructions and narrows the search range based on the combined results.

The structure still follows binary search, but each iteration performs more comparisons per memory access.

Problem

Given a sorted array $A$ of length $n$ and a target value $x$ , find an index $i$ such that

A[i] = x

or return the lower bound index if exact match is not required.

Idea

At each step, select multiple probe positions:

p_0, p_1, \ldots, p_{w-1}

Load:

A[p_0], A[p_1], \ldots, A[p_{w-1}]

Compare all values against $x$ using SIMD. The result is a mask that indicates which probes satisfy:

A[p_k] < x

This mask determines the new search interval. The algorithm effectively advances by multiple binary decisions in a single iteration.

Algorithm (conceptual)

simd_binary_search(A, x):
    left = 0
    right = length(A)

    while right - left > threshold:
        probes = choose_multiple_midpoints(left, right)

        values = vector_load(A, probes)
        mask = vector_compare_less(values, x)

        k = highest_index_where(mask is true)

        if k == -1:
            right = probes[0]
        else:
            left = probes[k] + 1

    return scalar_binary_search(A, x, left, right)

The final scalar phase resolves the remaining small range.

Example

Let

A = [5, 10, 15, 20, 25, 30, 35, 40]

and

x = 26

Suppose we probe:

[2, 4, 6, 7] \rightarrow [15, 25, 35, 40]

Compare with $x = 26$ :

index	value	value < x
2	15	yes
4	25	yes
6	35	no
7	40	no

The last true position is index $4$ , so update:

left = 5

Continue until the interval is small.

Correctness

Each iteration partitions the search space using multiple comparisons that preserve ordering constraints. The largest index satisfying $A[i] < x$ determines the lower boundary of the remaining range.

The invariant is the same as in binary search: all elements before left are strictly less than $x$ , and all elements at or beyond right are greater than or equal to $x$ .

The final scalar search ensures exact resolution within the remaining interval.

Complexity

Let $w$ be the SIMD width.

measure	cost
iterations	$O(\log n / \log w)$
comparisons per step	$w$
total comparisons	$O(\log n)$

In practice, the number of steps decreases because each iteration eliminates more of the search space.

Space:

O(1)

When to Use

Use SIMD binary search when:

arrays are large and sorted
vector instructions are available
memory bandwidth and latency dominate cost
comparisons are simple and uniform
high throughput search is required

It is common in databases, search engines, and columnar processing systems.

Implementation

Portable SIMD binary search is difficult in high level languages. The following shows the structure using block probes.

def simd_binary_search_shape(a, x):
    left, right = 0, len(a)

    while right - left > 4:
        step = (right - left) // 5
        probes = [left + step, left + 2*step, left + 3*step, left + 4*step]

        k = -1
        for i, p in enumerate(probes):
            if a[p] < x:
                k = i

        if k == -1:
            right = probes[0]
        else:
            left = probes[k] + 1

    while left < right:
        mid = (left + right) // 2
        if a[mid] < x:
            left = mid + 1
        else:
            right = mid

    return left

func SIMDBinarySearchShape(a []int, x int) int {
    left, right := 0, len(a)

    for right-left > 4 {
        step := (right - left) / 5

        probes := []int{
            left + step,
            left + 2*step,
            left + 3*step,
            left + 4*step,
        }

        k := -1
        for i, p := range probes {
            if a[p] < x {
                k = i
            }
        }

        if k == -1 {
            right = probes[0]
        } else {
            left = probes[k] + 1
        }
    }

    for left < right {
        mid := (left + right) / 2
        if a[mid] < x {
            left = mid + 1
        } else {
            right = mid
        }
    }

    return left
}