Use vector instructions to evaluate multiple candidate positions in a binary search step.
SIMD binary search accelerates binary search by evaluating multiple candidate comparisons in parallel. Instead of checking a single midpoint per step, the algorithm probes several positions using vector instructions and narrows the search range based on the combined results.
The structure still follows binary search, but each iteration performs more comparisons per memory access.
Problem
Given a sorted array of length and a target value , find an index such that
or return the lower bound index if exact match is not required.
Idea
At each step, select multiple probe positions:
Load:
Compare all values against using SIMD. The result is a mask that indicates which probes satisfy:
This mask determines the new search interval. The algorithm effectively advances by multiple binary decisions in a single iteration.
Algorithm (conceptual)
simd_binary_search(A, x):
left = 0
right = length(A)
while right - left > threshold:
probes = choose_multiple_midpoints(left, right)
values = vector_load(A, probes)
mask = vector_compare_less(values, x)
k = highest_index_where(mask is true)
if k == -1:
right = probes[0]
else:
left = probes[k] + 1
return scalar_binary_search(A, x, left, right)The final scalar phase resolves the remaining small range.
Example
Let
and
Suppose we probe:
Compare with :
| index | value | value < x |
|---|---|---|
| 2 | 15 | yes |
| 4 | 25 | yes |
| 6 | 35 | no |
| 7 | 40 | no |
The last true position is index , so update:
Continue until the interval is small.
Correctness
Each iteration partitions the search space using multiple comparisons that preserve ordering constraints. The largest index satisfying determines the lower boundary of the remaining range.
The invariant is the same as in binary search: all elements before left are strictly less than , and all elements at or beyond right are greater than or equal to .
The final scalar search ensures exact resolution within the remaining interval.
Complexity
Let be the SIMD width.
| measure | cost |
|---|---|
| iterations | |
| comparisons per step | |
| total comparisons |
In practice, the number of steps decreases because each iteration eliminates more of the search space.
Space:
When to Use
Use SIMD binary search when:
- arrays are large and sorted
- vector instructions are available
- memory bandwidth and latency dominate cost
- comparisons are simple and uniform
- high throughput search is required
It is common in databases, search engines, and columnar processing systems.
Implementation
Portable SIMD binary search is difficult in high level languages. The following shows the structure using block probes.
def simd_binary_search_shape(a, x):
left, right = 0, len(a)
while right - left > 4:
step = (right - left) // 5
probes = [left + step, left + 2*step, left + 3*step, left + 4*step]
k = -1
for i, p in enumerate(probes):
if a[p] < x:
k = i
if k == -1:
right = probes[0]
else:
left = probes[k] + 1
while left < right:
mid = (left + right) // 2
if a[mid] < x:
left = mid + 1
else:
right = mid
return leftfunc SIMDBinarySearchShape(a []int, x int) int {
left, right := 0, len(a)
for right-left > 4 {
step := (right - left) / 5
probes := []int{
left + step,
left + 2*step,
left + 3*step,
left + 4*step,
}
k := -1
for i, p := range probes {
if a[p] < x {
k = i
}
}
if k == -1 {
right = probes[0]
} else {
left = probes[k] + 1
}
}
for left < right {
mid := (left + right) / 2
if a[mid] < x {
left = mid + 1
} else {
right = mid
}
}
return left
}