Answer many offline queries by searching their answers simultaneously with shared checks.
Parallel binary search answers many queries by running binary search for all of them at once. Instead of solving each query independently, it groups queries by their current midpoint and evaluates them in batches.
This technique applies when each query asks for the smallest or largest index where a monotone condition becomes true.
Problem
Given:
- a sequence of updates or states indexed by time or position
- a set of queries
Each query asks: find the smallest time `t` such that a condition `F(q, t)` holds, where `F(q, t)` is monotone in `t`.
Key Idea
Each query maintains its own search interval `[l, r]`. At each round:
- Compute the midpoint `mid = (l + r) // 2` for every unresolved query
- Group queries by their midpoint
- Process the updates in order, once for the whole batch
- Answer all queries bucketed at a midpoint when the replay reaches that time
- Narrow each query's interval to `[l, mid]` or `[mid + 1, r]`
This reduces repeated work across queries.
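The grouping step can be illustrated with a toy round (the query names and intervals below are made up for illustration). Two queries with the same midpoint land in the same bucket and will be checked against the same replayed state:

```python
from collections import defaultdict

# Hypothetical current intervals [lo, hi] for three queries.
intervals = {"q1": (0, 7), "q2": (0, 7), "q3": (4, 7)}

buckets = defaultdict(list)
for name, (lo, hi) in intervals.items():
    if lo < hi:  # only unresolved queries participate in a round
        buckets[(lo + hi) // 2].append(name)

print(dict(buckets))  # → {3: ['q1', 'q2'], 5: ['q3']}
```

During the replay, `q1` and `q2` are both answered when the update at time 3 has been applied; the work of reaching that state is shared between them.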
Algorithm
```
parallel_binary_search(queries, updates):
    initialize l[q] = 0, r[q] = max_time for each query q
    while there exists q with l[q] < r[q]:
        buckets = empty lists
        for each query q with l[q] < r[q]:
            mid = (l[q] + r[q]) // 2
            buckets[mid].append(q)
        reset data structure
        for t from 0 to max_time:
            apply update at time t
            for each query q in buckets[t]:
                if F(q) is true:
                    r[q] = t
                else:
                    l[q] = t + 1
    return answers l[q]
```
Example
Suppose we have:
- an array that changes over time
- queries asking when a value first exceeds a threshold
Each query, solved independently, would cost O(T log T): each of its O(log T) feasibility checks replays up to T updates.
Parallel binary search removes the repeated replays by processing the update sequence once per round and evaluating every active query at its midpoint along the way.
Correctness
Each query maintains the invariant:
- the answer lies in `[l, r]`
At each round, the midpoint `mid` splits the interval.
If `F` is true at time `mid`, the answer must lie in `[l, mid]`. Otherwise it lies in `[mid + 1, r]`.
Grouping queries by midpoint does not affect correctness because each query is still evaluated at the correct mid value.
Processing updates in order ensures that the data structure reflects the correct state for each time.
Complexity
Let:
- `Q` be the number of queries
- `T` be the number of update steps
- `C` be the cost of applying one update or evaluating one check

| cost | value |
|---|---|
| rounds | O(log T) |
| total time | O((T + Q) · C · log T) |
| space | O(T + Q) |
The key advantage is that updates are applied once per round, not once per query.
Comparison
| method | time |
|---|---|
| independent binary search | O(Q · T · C · log T) |
| parallel binary search | O((T + Q) · C · log T) |
Parallel binary search is beneficial when many queries share the same update sequence.
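To make the gap concrete, here is a back-of-the-envelope comparison with illustrative sizes (assumed, not from any specific problem) and unit-cost updates and checks:

```python
import math

# Illustrative sizes: 100,000 queries over 100,000 update steps.
Q = T = 10**5
rounds = math.ceil(math.log2(T))  # ~17 binary-search rounds

independent = Q * T * rounds      # each query replays up to T updates per round
parallel = (T + Q) * rounds       # updates replayed once per round, shared by all queries

print(independent // parallel)    # → 50000 (speedup factor)
```

The `log T` factor appears in both totals; the savings come entirely from replaying the update sequence once per round instead of once per query.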
When to Use
Use parallel binary search when:
- queries are offline
- the answer space is ordered
- feasibility depends on cumulative updates
- many queries share the same structure
Common applications:
- earliest time a condition becomes true
- connectivity over time
- prefix-based constraints
- dynamic threshold queries
Notes
- Requires resetting the data structure each round
- Works best when updates can be replayed efficiently
- Often combined with Fenwick trees, segment trees, or DSU
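As an illustration of the DSU combination, consider a hypothetical offline-connectivity setup (the edge list and queries below are made-up data): edges arrive one per time step, and each query asks for the earliest time two vertices become connected. Connectivity is monotone in time, so parallel binary search applies, with the DSU rebuilt at the start of each round:

```python
def earliest_connection_times(n, edges, queries):
    """edges[t] = (u, v) added at time t; each query is (a, b).
    Returns, per query, the earliest time a and b are connected,
    or len(edges) if they never connect."""
    T = len(edges)
    lo = [0] * len(queries)
    hi = [T] * len(queries)  # T acts as a "never connected" sentinel
    while any(lo[i] < hi[i] for i in range(len(queries))):
        buckets = [[] for _ in range(T)]
        for i in range(len(queries)):
            if lo[i] < hi[i]:
                buckets[(lo[i] + hi[i]) // 2].append(i)

        parent = list(range(n))  # reset the DSU each round

        def find(x):
            while parent[x] != x:
                parent[x] = parent[parent[x]]  # path halving
                x = parent[x]
            return x

        for t in range(T):
            u, v = edges[t]
            parent[find(u)] = find(v)  # apply update at time t
            for i in buckets[t]:
                a, b = queries[i]
                if find(a) == find(b):
                    hi[i] = t
                else:
                    lo[i] = t + 1
    return lo

edges = [(0, 1), (2, 3), (1, 2)]    # edges added at times 0, 1, 2
queries = [(0, 1), (0, 3), (3, 4)]  # vertices 3 and 4 never connect
print(earliest_connection_times(5, edges, queries))  # → [0, 2, 3]
```

Because the DSU here has no rollback, the reset-and-replay structure of parallel binary search is exactly what makes it usable for these time-travel queries.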
Implementation Sketch
```python
def parallel_binary_search(queries, max_time, apply_update, check, reset):
    l = [0] * len(queries)
    r = [max_time] * len(queries)
    while True:
        buckets = [[] for _ in range(max_time + 1)]
        active = False
        for i in range(len(queries)):
            if l[i] < r[i]:
                active = True
                mid = (l[i] + r[i]) // 2
                buckets[mid].append(i)
        if not active:
            break
        reset()  # rebuild the data structure from scratch each round
        for t in range(max_time + 1):
            apply_update(t)
            for i in buckets[t]:
                if check(queries[i]):
                    r[i] = t
                else:
                    l[i] = t + 1
    return l
```

```go
func ParallelBinarySearch(queries []Query, maxTime int,
	applyUpdate func(int),
	check func(Query) bool,
	reset func(),
) []int {
	n := len(queries)
	l := make([]int, n)
	r := make([]int, n)
	for i := 0; i < n; i++ {
		r[i] = maxTime
	}
	for {
		buckets := make([][]int, maxTime+1)
		active := false
		for i := 0; i < n; i++ {
			if l[i] < r[i] {
				active = true
				mid := (l[i] + r[i]) / 2
				buckets[mid] = append(buckets[mid], i)
			}
		}
		if !active {
			break
		}
		reset() // rebuild the data structure from scratch each round
		for t := 0; t <= maxTime; t++ {
			applyUpdate(t)
			for _, i := range buckets[t] {
				if check(queries[i]) {
					r[i] = t
				} else {
					l[i] = t + 1
				}
			}
		}
	}
	return l
}
```
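A usage sketch for the threshold example from earlier (the update and query data are made up; the driver is repeated so this block runs standalone, and it assumes every query's condition eventually holds within the time range):

```python
def parallel_binary_search(queries, max_time, apply_update, check, reset):
    """Returns, per query, the first time t (0..max_time) at which
    check(query) holds; check is assumed monotone in time."""
    l = [0] * len(queries)
    r = [max_time] * len(queries)
    while any(l[i] < r[i] for i in range(len(queries))):
        buckets = [[] for _ in range(max_time + 1)]
        for i in range(len(queries)):
            if l[i] < r[i]:
                buckets[(l[i] + r[i]) // 2].append(i)
        reset()
        for t in range(max_time + 1):
            apply_update(t)
            for i in buckets[t]:
                if check(queries[i]):
                    r[i] = t
                else:
                    l[i] = t + 1
    return l

# Hypothetical data: update t adds an amount to one array position.
updates = [(0, 5), (1, 3), (0, 4), (1, 6)]
values = [0, 0]
queries = [(0, 9), (1, 9)]  # (position, threshold) pairs

def reset():
    values[:] = [0, 0]

def apply_update(t):
    pos, amount = updates[t]
    values[pos] += amount

def check(q):
    pos, threshold = q
    return values[pos] >= threshold

print(parallel_binary_search(queries, len(updates) - 1,
                             apply_update, check, reset))  # → [2, 3]
```

Position 0 reaches 9 after the update at time 2 (5 + 4), and position 1 reaches 9 after the update at time 3 (3 + 6), matching the returned answers.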