Answer many offline queries by searching their answers simultaneously with shared checks.
Parallel binary search answers many queries by running binary search for all of them at once. Instead of solving each query independently, it groups queries by their current midpoint and evaluates them in batches.
This technique applies when each query asks for the smallest or largest index where a monotone condition becomes true.
Problem
Given:
- a sequence of updates or states indexed by time or position
- a set of queries
Each query asks: find the smallest time `t` such that a condition `F(q, t)` holds, where `F(q, t)` is monotone in `t`.
Key Idea
Each query maintains its own search interval `[l, r]`. At each round:
- Compute the midpoint `mid = (l + r) // 2` for every unresolved query
- Group queries by their midpoint
- Process the updates in order, once for the whole batch
- Answer all queries bucketed at a midpoint when the replay reaches that time
- Narrow each query's interval to `[l, mid]` or `[mid + 1, r]`
This reduces repeated work across queries.
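The grouping step can be illustrated with a toy round (the query names and intervals below are made up for illustration). Two queries with the same midpoint land in the same bucket and will be checked against the same replayed state:

```python
from collections import defaultdict

# Hypothetical current intervals [lo, hi] for three queries.
intervals = {"q1": (0, 7), "q2": (0, 7), "q3": (4, 7)}

buckets = defaultdict(list)
for name, (lo, hi) in intervals.items():
    if lo < hi:  # only unresolved queries participate in a round
        buckets[(lo + hi) // 2].append(name)

print(dict(buckets))  # → {3: ['q1', 'q2'], 5: ['q3']}
```

During the replay, `q1` and `q2` are both answered when the update at time 3 has been applied; the work of reaching that state is shared between them.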
Algorithm
```
parallel_binary_search(queries, updates):
    initialize l[q] = 0, r[q] = max_time for each query q
    while there exists q with l[q] < r[q]:
        buckets = empty lists
        for each query q with l[q] < r[q]:
            mid = (l[q] + r[q]) // 2
            buckets[mid].append(q)
        reset data structure
        for t from 0 to max_time:
            apply update at time t
            for each query q in buckets[t]:
                if F(q) is true:
                    r[q] = t
                else:
                    l[q] = t + 1
    return answers l[q]
```
Example
Suppose we have:
- an array that changes over time
- queries asking when a value first exceeds a threshold
Each query, solved independently, would cost O(T log T): each of its O(log T) feasibility checks replays up to T updates.
Parallel binary search removes the repeated replays by processing the update sequence once per round and evaluating every active query at its midpoint along the way.
Correctness
Each query maintains the invariant:
- the answer lies in `[l, r]`
At each round, the midpoint `mid` splits the interval.
If `F` is true at time `mid`, the answer must lie in `[l, mid]`. Otherwise it lies in `[mid + 1, r]`.
Grouping queries by midpoint does not affect correctness because each query is still evaluated at the correct mid value.
Processing updates in order ensures that the data structure reflects the correct state for each time.
Complexity
Let:
- `Q` be the number of queries
- `T` be the number of update steps
- `C` be the cost of applying one update or evaluating one check

| cost | value |
|---|---|
| rounds | O(log T) |
| total time | O((T + Q) · C · log T) |
| space | O(T + Q) |
The key advantage is that updates are applied once per round, not once per query.
Comparison
| method | time |
|---|---|
| independent binary search | O(Q · T · C · log T) |
| parallel binary search | O((T + Q) · C · log T) |
Parallel binary search is beneficial when many queries share the same update sequence.
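To make the gap concrete, here is a back-of-the-envelope comparison with illustrative sizes (assumed, not from any specific problem) and unit-cost updates and checks:

```python
import math

# Illustrative sizes: 100,000 queries over 100,000 update steps.
Q = T = 10**5
rounds = math.ceil(math.log2(T))  # ~17 binary-search rounds

independent = Q * T * rounds      # each query replays up to T updates per round
parallel = (T + Q) * rounds       # updates replayed once per round, shared by all queries

print(independent // parallel)    # → 50000 (speedup factor)
```

The `log T` factor appears in both totals; the savings come entirely from replaying the update sequence once per round instead of once per query.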
When to Use
Use parallel binary search when:
- queries are offline
- the answer space is ordered
- feasibility depends on cumulative updates
- many queries share the same structure
Common applications:
- earliest time a condition becomes true
- connectivity over time
- prefix-based constraints
- dynamic threshold queries
Notes
- Requires resetting the data structure each round
- Works best when updates can be replayed efficiently
- Often combined with Fenwick trees, segment trees, or DSU
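As an illustration of the DSU combination, consider a hypothetical offline-connectivity setup (the edge list and queries below are made-up data): edges arrive one per time step, and each query asks for the earliest time two vertices become connected. Connectivity is monotone in time, so parallel binary search applies, with the DSU rebuilt at the start of each round:

```python
def earliest_connection_times(n, edges, queries):
    """edges[t] = (u, v) added at time t; each query is (a, b).
    Returns, per query, the earliest time a and b are connected,
    or len(edges) if they never connect."""
    T = len(edges)
    lo = [0] * len(queries)
    hi = [T] * len(queries)  # T acts as a "never connected" sentinel
    while any(lo[i] < hi[i] for i in range(len(queries))):
        buckets = [[] for _ in range(T)]
        for i in range(len(queries)):
            if lo[i] < hi[i]:
                buckets[(lo[i] + hi[i]) // 2].append(i)

        parent = list(range(n))  # reset the DSU each round

        def find(x):
            while parent[x] != x:
                parent[x] = parent[parent[x]]  # path halving
                x = parent[x]
            return x

        for t in range(T):
            u, v = edges[t]
            parent[find(u)] = find(v)  # apply update at time t
            for i in buckets[t]:
                a, b = queries[i]
                if find(a) == find(b):
                    hi[i] = t
                else:
                    lo[i] = t + 1
    return lo

edges = [(0, 1), (2, 3), (1, 2)]    # edges added at times 0, 1, 2
queries = [(0, 1), (0, 3), (3, 4)]  # vertices 3 and 4 never connect
print(earliest_connection_times(5, edges, queries))  # → [0, 2, 3]
```

Because the DSU here has no rollback, the reset-and-replay structure of parallel binary search is exactly what makes it usable for these time-travel queries.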
Implementation Sketch
```python
def parallel_binary_search(queries, max_time, apply_update, check, reset):
    l = [0] * len(queries)
    r = [max_time] * len(queries)
    while True:
        buckets = [[] for _ in range(max_time + 1)]
        active = False
        for i in range(len(queries)):
            if l[i] < r[i]:
                active = True
                mid = (l[i] + r[i]) // 2
                buckets[mid].append(i)
        if not active:
            break
        reset()  # rebuild the data structure from scratch each round
        for t in range(max_time + 1):
            apply_update(t)
            for i in buckets[t]:
                if check(queries[i]):
                    r[i] = t
                else:
                    l[i] = t + 1
    return l
```

```go
func ParallelBinarySearch(queries []Query, maxTime int,
	applyUpdate func(int),
	check func(Query) bool,
	reset func(),
) []int {
	n := len(queries)
	l := make([]int, n)
	r := make([]int, n)
	for i := 0; i < n; i++ {
		r[i] = maxTime
	}
	for {
		buckets := make([][]int, maxTime+1)
		active := false
		for i := 0; i < n; i++ {
			if l[i] < r[i] {
				active = true
				mid := (l[i] + r[i]) / 2
				buckets[mid] = append(buckets[mid], i)
			}
		}
		if !active {
			break
		}
		reset() // rebuild the data structure from scratch each round
		for t := 0; t <= maxTime; t++ {
			applyUpdate(t)
			for _, i := range buckets[t] {
				if check(queries[i]) {
					r[i] = t
				} else {
					l[i] = t + 1
				}
			}
		}
	}
	return l
}
```
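A usage sketch for the threshold example from earlier (the update and query data are made up; the driver is repeated so this block runs standalone, and it assumes every query's condition eventually holds within the time range):

```python
def parallel_binary_search(queries, max_time, apply_update, check, reset):
    """Returns, per query, the first time t (0..max_time) at which
    check(query) holds; check is assumed monotone in time."""
    l = [0] * len(queries)
    r = [max_time] * len(queries)
    while any(l[i] < r[i] for i in range(len(queries))):
        buckets = [[] for _ in range(max_time + 1)]
        for i in range(len(queries)):
            if l[i] < r[i]:
                buckets[(l[i] + r[i]) // 2].append(i)
        reset()
        for t in range(max_time + 1):
            apply_update(t)
            for i in buckets[t]:
                if check(queries[i]):
                    r[i] = t
                else:
                    l[i] = t + 1
    return l

# Hypothetical data: update t adds an amount to one array position.
updates = [(0, 5), (1, 3), (0, 4), (1, 6)]
values = [0, 0]
queries = [(0, 9), (1, 9)]  # (position, threshold) pairs

def reset():
    values[:] = [0, 0]

def apply_update(t):
    pos, amount = updates[t]
    values[pos] += amount

def check(q):
    pos, threshold = q
    return values[pos] >= threshold

print(parallel_binary_search(queries, len(updates) - 1,
                             apply_update, check, reset))  # → [2, 3]
```

Position 0 reaches 9 after the update at time 2 (5 + 4), and position 1 reaches 9 after the update at time 3 (3 + 6), matching the returned answers.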