Branchless Lower Bound

Branchless lower bound computes the first position where a value is greater than or equal to a target, without using unpredictable branches. It follows the same logical structure as binary search but replaces control flow with arithmetic updates or conditional moves.

The result is the same as standard lower bound. The benefit is improved performance on modern CPUs when branch misprediction is expensive.

Problem

Given a sorted array $A$ of length $n$ and a target $x$ , find the smallest index $i$ such that

A[i] \ge x

If all elements are smaller than $x$ , return $n$ .

Idea

Maintain a base index that always points to the largest known position where the value is less than $x$ .

At each step, test whether a candidate position also satisfies the condition. Update the base using a conditional assignment that can compile to a branchless instruction.

The search proceeds by testing powers of two, from large to small.

Algorithm

branchless_lower_bound(A, x):
    n = length(A)
    base = -1

    step = 1
    while step * 2 <= n:
        step = step * 2

    while step > 0:
        next = base + step

        cond = (next < n) and (A[next] < x)

        if cond:
            base = next

        step = step // 2

    return base + 1

In optimized implementations, the if cond update becomes:

base = cond ? next : base

which compiles to a conditional move.

Example

Let

A = [5, 10, 15, 20, 25, 30, 35, 40]

and

x = 22

Trace:

step	candidate	value	value < x	base
4	3	20	yes	3
2	5	30	no	3
1	4	25	no	3

Return:

3 + 1 = 4

So index $4$ is the first position where $A[i] \ge x$ .

Correctness

The invariant maintains that base is the largest index such that $A[\text{base}] < x$ .

Each step tests whether a larger candidate also satisfies the condition. If so, the invariant extends forward. Otherwise, the invariant remains unchanged.

When all step sizes have been tested, no larger valid index exists. Therefore the next position, base + 1, is the smallest index such that $A[i] \ge x$ .

Complexity

case	time
all cases	$O(\log n)$

Space:

O(1)

The number of comparisons matches binary search. The difference lies in execution behavior, not asymptotic complexity.

When to Use

Use branchless lower bound when:

arrays are large
queries are frequent
branch misprediction affects performance
predictable execution matters more than code simplicity

For small inputs, a standard lower bound is often simpler and sufficient.

Implementation

def branchless_lower_bound(a, x):
    n = len(a)
    base = -1

    step = 1
    while step * 2 <= n:
        step *= 2

    while step > 0:
        nxt = base + step
        if nxt < n and a[nxt] < x:
            base = nxt
        step //= 2

    return base + 1

func BranchlessLowerBound(a []int, x int) int {
    n := len(a)
    base := -1

    step := 1
    for step*2 <= n {
        step *= 2
    }

    for step > 0 {
        next := base + step
        if next < n && a[next] < x {
            base = next
        }
        step /= 2
    }

    return base + 1
}