Answer many monotone queries by processing them together after all input is known.
Offline binary search answers many queries after all updates and queries are already known. It is closely related to parallel binary search. The main idea is to avoid solving each query separately by grouping many midpoint checks into shared passes over the data.
This method applies when each query asks for the earliest or smallest point at which a monotone condition becomes true.
Problem
Given:
- a sequence of updates
- a set of queries
- a monotone predicate
Find, for each query q, the smallest time t such that the predicate for q holds on the state after update t has been applied.
If no such time exists, return a sentinel such as T or -1, depending on the convention.
Key Idea
Each query keeps a search interval. During each round, all active queries are grouped by midpoint. The update sequence is replayed once, and every query whose midpoint is reached is checked against the current data structure state.
This is offline because the algorithm needs all queries before it starts. It can reorder and batch their checks.
Algorithm
offline_binary_search(queries, updates):
    Q = number of queries
    T = number of updates
    for each query q:
        low[q] = 0
        high[q] = T
    while some query has low[q] < high[q]:
        buckets = array of empty lists of length T
        for each query q:
            if low[q] < high[q]:
                mid = (low[q] + high[q]) // 2
                buckets[mid].append(q)
        reset data structure
        for t from 0 to T - 1:
            apply updates[t]
            for each query q in buckets[t]:
                if check(q):
                    high[q] = t
                else:
                    low[q] = t + 1
    return low
Example
Suppose edges are added to a graph over time. Each query asks when two vertices first become connected.
The predicate for a query (u, v) is: after update t, vertices u and v lie in the same connected component.
Once two vertices are connected, they remain connected if edges are only added. Therefore the predicate is monotone.
Offline binary search can answer many such queries by replaying the edge additions once per round, rather than once per query.
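This connectivity example can be sketched in Python with a small union-find structure that is reset by rebuilding it each round. The helper names (`DSU`, `first_connection_times`) are illustrative, not from any particular library; the answer for a query is the index of the first edge whose addition connects the pair, or `len(edges)` if they never connect.

```python
class DSU:
    """Minimal union-find with path halving; reset by constructing a new one."""
    def __init__(self, n):
        self.parent = list(range(n))

    def find(self, x):
        while self.parent[x] != x:
            self.parent[x] = self.parent[self.parent[x]]  # path halving
            x = self.parent[x]
        return x

    def union(self, a, b):
        ra, rb = self.find(a), self.find(b)
        if ra != rb:
            self.parent[ra] = rb


def first_connection_times(n, edges, queries):
    """For each (u, v) in queries, return the index of the edge whose
    addition first connects u and v, or len(edges) if they never connect."""
    t = len(edges)
    low = [0] * len(queries)
    high = [t] * len(queries)
    while True:
        buckets = [[] for _ in range(t)]
        active = False
        for qi in range(len(queries)):
            if low[qi] < high[qi]:
                active = True
                buckets[(low[qi] + high[qi]) // 2].append(qi)
        if not active:
            return low
        dsu = DSU(n)  # reset: rebuild the structure, then replay all edges once
        for time, (a, b) in enumerate(edges):
            dsu.union(a, b)
            for qi in buckets[time]:
                u, v = queries[qi]
                if dsu.find(u) == dsu.find(v):
                    high[qi] = time
                else:
                    low[qi] = time + 1
```

Each round replays the edge list once for all active queries, which is the whole point of batching by midpoint.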
Correctness
For each query, the algorithm maintains the invariant that its answer lies in the interval [low[q], high[q]].
At each round, the midpoint mid is checked under the exact state after update mid has been applied.
If check(q) is true, monotonicity implies the first true time lies at or before mid, so the algorithm sets:
high[q] = mid
If check(q) is false, monotonicity implies every time at or before mid is false, so the algorithm sets:
low[q] = mid + 1
The interval shrinks for every active query. When the interval has length zero, the remaining value is the earliest feasible time.
Complexity
Let:
- T be the number of updates
- Q be the number of queries
- U be the cost of applying one update
- C be the cost of checking one query
Each round replays all updates and checks each active query once.
| component | cost |
|---|---|
| rounds | O(log T) |
| update replay | O(T * U * log T) |
| query checks | O(Q * C * log T) |
| total | O((T * U + Q * C) * log T) |
Space: O(T + Q) for the midpoint buckets and the per-query interval bounds, in addition to the data structure itself.
Comparison
| method | when used | cost shape |
|---|---|---|
| independent binary search | each query checked separately | repeated update replay per query |
| offline binary search | queries known in advance | shared update replay per round |
| online data structure | queries arrive live | requires dynamic maintenance |
Offline binary search is useful when online handling is hard but the monotone structure is clear.
When to Use
Use offline binary search when:
- all queries are known before processing
- each answer is a boundary in an ordered domain
- the predicate is monotone
- updates can be replayed from scratch
- batching many midpoint checks saves work
Common examples include connectivity over time, threshold crossing, earliest prefix satisfying a constraint, and cumulative frequency queries.
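The "earliest prefix satisfying a constraint" case can be sketched directly: updates append non-negative values, and each query asks for the earliest time the running sum reaches its threshold. Non-negative values keep the predicate monotone. The function name and the running-sum state below are illustrative.

```python
def prefix_threshold_times(values, thresholds):
    """For each threshold, return the smallest t with
    values[0] + ... + values[t] >= threshold, or len(values) if none.
    Assumes all values are non-negative, so the predicate is monotone."""
    t = len(values)
    low = [0] * len(thresholds)
    high = [t] * len(thresholds)
    while True:
        buckets = [[] for _ in range(t)]
        active = False
        for qi in range(len(thresholds)):
            if low[qi] < high[qi]:
                active = True
                buckets[(low[qi] + high[qi]) // 2].append(qi)
        if not active:
            return low
        total = 0  # reset the "data structure" (here just a running sum)
        for time, v in enumerate(values):
            total += v
            for qi in buckets[time]:
                if total >= thresholds[qi]:
                    high[qi] = time
                else:
                    low[qi] = time + 1
```

For example, with values [3, 1, 4, 1, 5] the prefix sums are 3, 4, 8, 9, 14, so the threshold 4 is first reached at time 1 and the threshold 9 at time 3.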
Notes
- The algorithm requires a resettable data structure.
- If updates are reversible, divide and conquer on time may be better.
- If queries must be answered immediately, this offline method does not apply.
- The returned sentinel must be defined carefully when no feasible time exists.
Implementation Sketch
def offline_binary_search(queries, updates, reset, apply_update, check):
qn = len(queries)
t = len(updates)
low = [0] * qn
high = [t] * qn
while True:
active = False
buckets = [[] for _ in range(t)]
for qi in range(qn):
if low[qi] < high[qi]:
active = True
mid = (low[qi] + high[qi]) // 2
if mid < t:
buckets[mid].append(qi)
if not active:
break
reset()
for time, update in enumerate(updates):
apply_update(update)
for qi in buckets[time]:
if check(queries[qi]):
high[qi] = time
else:
low[qi] = time + 1
    return low

func OfflineBinarySearch[Q any, U any](
queries []Q,
updates []U,
reset func(),
applyUpdate func(U),
check func(Q) bool,
) []int {
qn := len(queries)
t := len(updates)
low := make([]int, qn)
high := make([]int, qn)
for i := range high {
high[i] = t
}
for {
active := false
buckets := make([][]int, t)
for qi := 0; qi < qn; qi++ {
if low[qi] < high[qi] {
active = true
mid := (low[qi] + high[qi]) / 2
if mid < t {
buckets[mid] = append(buckets[mid], qi)
}
}
}
if !active {
break
}
reset()
for time, update := range updates {
applyUpdate(update)
for _, qi := range buckets[time] {
if check(queries[qi]) {
high[qi] = time
} else {
low[qi] = time + 1
}
}
}
}
return low
}
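One way to wire the Python sketch to a concrete data structure is with closures over shared state. The snippet below repeats offline_binary_search so it runs standalone, then uses a mutable running total as the resettable structure; the threshold scenario and variable names are illustrative.

```python
def offline_binary_search(queries, updates, reset, apply_update, check):
    # Same sketch as above, repeated so this snippet runs standalone.
    qn, t = len(queries), len(updates)
    low, high = [0] * qn, [t] * qn
    while True:
        buckets = [[] for _ in range(t)]
        active = False
        for qi in range(qn):
            if low[qi] < high[qi]:
                active = True
                buckets[(low[qi] + high[qi]) // 2].append(qi)
        if not active:
            return low
        reset()
        for time, update in enumerate(updates):
            apply_update(update)
            for qi in buckets[time]:
                if check(queries[qi]):
                    high[qi] = time
                else:
                    low[qi] = time + 1


# The shared "data structure" is a running total kept in a one-element
# list so the lambdas below can reset and mutate it.
state = [0]
answers = offline_binary_search(
    queries=[1, 5, 9],                    # thresholds on the prefix sum
    updates=[2, 2, 2, 2],                 # values appended over time
    reset=lambda: state.__setitem__(0, 0),
    apply_update=lambda v: state.__setitem__(0, state[0] + v),
    check=lambda threshold: state[0] >= threshold,
)
print(answers)  # earliest time each threshold is reached; 4 means "never"
```

The prefix sums here are 2, 4, 6, 8, so the thresholds 1, 5, and 9 yield answers 0, 2, and 4 (the sentinel), respectively.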