# SIMD Linear Search

# SIMD Linear Search

SIMD linear search is a linear search variant that compares several values at the same time using vector instructions. SIMD means single instruction, multiple data. Instead of comparing one array element per loop iteration, the algorithm loads a block of elements into a vector register and compares all lanes against the target in one operation.

The logical algorithm remains linear search. The speedup comes from processing multiple elements per CPU instruction.

## Problem

Given an array $A$ of length $n$ and a target value $x$, find an index $i$ such that

$$
A[i] = x
$$

If no such index exists, return $-1$.

## Idea

Use a vector width $w$. Each iteration loads $w$ elements:

$$
A[i], A[i+1], \ldots, A[i+w-1]
$$

Compare all $w$ elements against $x$ in parallel. The comparison produces a bit mask. If the mask is nonzero, at least one lane matched the target. The first set bit gives the first matching position inside the vector.

## Algorithm

```text id="sl01"
simd_linear_search(A, x):
    i = 0
    w = vector_width()

    while i + w <= length(A):
        values = vector_load(A, i)
        mask = vector_compare_equal(values, x)

        if mask != 0:
            lane = first_set_bit(mask)
            return i + lane

        i = i + w

    while i < length(A):
        if A[i] == x:
            return i
        i = i + 1

    return -1
```

The final scalar loop handles the remaining elements when the array length is not a multiple of the vector width.

## Example

Suppose the vector width is $4$:

$$
A = [7, 9, 12, 15, 18, 21, 24, 30]
$$

and

$$
x = 21
$$

The first vector compares:

$$
[7, 9, 12, 15]
$$

No match. The second vector compares:

$$
[18, 21, 24, 30]
$$

The mask has lane $1$ set, so the answer is:

$$
4 + 1 = 5
$$

## Correctness

The algorithm compares every array element with the target. The SIMD loop compares elements in contiguous groups of width $w$. If a group contains a match, the mask identifies at least one matching lane, and selecting the first set bit returns the earliest match in that group.

If no SIMD group contains the target, the scalar tail checks every remaining element. Therefore the algorithm returns a valid matching index if one exists, and returns $-1$ only after all elements have been checked.

## Complexity

| case  | scalar comparisons |     vector iterations | time                    |
| ----- | -----------------: | --------------------: | ----------------------- |
| best  |     $1$ lane group |                   $1$ | $O(1)$                  |
| worst |  $n$ lanes checked | $\lceil n / w \rceil$ | $O(n / w)$ vector steps |

The asymptotic model usually still describes the algorithm as:

$$
O(n)
$$

because $w$ is a fixed hardware constant.

Space usage is:

$$
O(1)
$$

## When to Use

Use SIMD linear search when:

* the data is unsorted
* many elements must be scanned
* values are stored contiguously
* vector instructions are available
* the comparison operation is simple

It is especially effective for byte arrays, integer arrays, string scanning, filtering, and parsing workloads.

## Implementation

Python does not expose portable low level SIMD directly, but NumPy uses vectorized native loops internally.

```python id="sl02"
import numpy as np

def simd_linear_search_numpy(a, x):
    arr = np.asarray(a)
    matches = np.flatnonzero(arr == x)
    if len(matches) == 0:
        return -1
    return int(matches[0])
```

A Go implementation can show the block scanning structure. Actual SIMD requires architecture specific intrinsics or assembly.

```go id="sl03"
func SIMDLinearSearchShape(a []int, x int) int {
    const width = 4
    n := len(a)
    i := 0

    for i+width <= n {
        if a[i] == x {
            return i
        }
        if a[i+1] == x {
            return i + 1
        }
        if a[i+2] == x {
            return i + 2
        }
        if a[i+3] == x {
            return i + 3
        }
        i += width
    }

    for i < n {
        if a[i] == x {
            return i
        }
        i++
    }

    return -1
}
```

