Skip to content

SIMD Binary Search

Use vector instructions to evaluate multiple candidate positions in a binary search step.

SIMD binary search accelerates binary search by evaluating multiple candidate comparisons in parallel. Instead of checking a single midpoint per step, the algorithm probes several positions using vector instructions and narrows the search range based on the combined results.

The structure still follows binary search, but each iteration performs more comparisons per memory access.

Problem

Given a sorted array AA of length nn and a target value xx, find an index ii such that

A[i]=x A[i] = x

or return the lower bound index if exact match is not required.

Idea

At each step, select multiple probe positions:

p0,p1,,pw1 p_0, p_1, \ldots, p_{w-1}

Load:

A[p0],A[p1],,A[pw1] A[p_0], A[p_1], \ldots, A[p_{w-1}]

Compare all values against xx using SIMD. The result is a mask that indicates which probes satisfy:

A[pk]<x A[p_k] < x

This mask determines the new search interval. The algorithm effectively advances by multiple binary decisions in a single iteration.

Algorithm (conceptual)

simd_binary_search(A, x):
    left = 0
    right = length(A)

    while right - left > threshold:
        probes = choose_multiple_midpoints(left, right)

        values = vector_load(A, probes)
        mask = vector_compare_less(values, x)

        k = highest_index_where(mask is true)

        if k == -1:
            right = probes[0]
        else:
            left = probes[k] + 1

    return scalar_binary_search(A, x, left, right)

The final scalar phase resolves the remaining small range.

Example

Let

A=[5,10,15,20,25,30,35,40] A = [5, 10, 15, 20, 25, 30, 35, 40]

and

x=26 x = 26

Suppose we probe:

[2,4,6,7][15,25,35,40] [2, 4, 6, 7] \rightarrow [15, 25, 35, 40]

Compare with x=26x = 26:

indexvaluevalue < x
215yes
425yes
635no
740no

The last true position is index 44, so update:

left=5 left = 5

Continue until the interval is small.

Correctness

Each iteration partitions the search space using multiple comparisons that preserve ordering constraints. The largest index satisfying A[i]<xA[i] < x determines the lower boundary of the remaining range.

The invariant is the same as in binary search: all elements before left are strictly less than xx, and all elements at or beyond right are greater than or equal to xx.

The final scalar search ensures exact resolution within the remaining interval.

Complexity

Let ww be the SIMD width.

measurecost
iterationsO(logn/logw)O(\log n / \log w)
comparisons per stepww
total comparisonsO(logn)O(\log n)

In practice, the number of steps decreases because each iteration eliminates more of the search space.

Space:

O(1) O(1)

When to Use

Use SIMD binary search when:

  • arrays are large and sorted
  • vector instructions are available
  • memory bandwidth and latency dominate cost
  • comparisons are simple and uniform
  • high throughput search is required

It is common in databases, search engines, and columnar processing systems.

Implementation

Portable SIMD binary search is difficult in high level languages. The following shows the structure using block probes.

def simd_binary_search_shape(a, x):
    left, right = 0, len(a)

    while right - left > 4:
        step = (right - left) // 5
        probes = [left + step, left + 2*step, left + 3*step, left + 4*step]

        k = -1
        for i, p in enumerate(probes):
            if a[p] < x:
                k = i

        if k == -1:
            right = probes[0]
        else:
            left = probes[k] + 1

    while left < right:
        mid = (left + right) // 2
        if a[mid] < x:
            left = mid + 1
        else:
            right = mid

    return left
func SIMDBinarySearchShape(a []int, x int) int {
    left, right := 0, len(a)

    for right-left > 4 {
        step := (right - left) / 5

        probes := []int{
            left + step,
            left + 2*step,
            left + 3*step,
            left + 4*step,
        }

        k := -1
        for i, p := range probes {
            if a[p] < x {
                k = i
            }
        }

        if k == -1 {
            right = probes[0]
        } else {
            left = probes[k] + 1
        }
    }

    for left < right {
        mid := (left + right) / 2
        if a[mid] < x {
            left = mid + 1
        } else {
            right = mid
        }
    }

    return left
}