Offline Binary Search

Offline binary search answers many queries after all updates and queries are already known. It is closely related to parallel binary search. The main idea is to avoid solving each query separately by grouping many midpoint checks into shared passes over the data.

This method applies when each query asks for the earliest or smallest point at which a monotone condition becomes true.

Problem

Given:

a sequence of updates $U_0, U_1, \dots, U_{T-1}$
a set of queries $q_1, q_2, \dots, q_Q$
a monotone predicate $F(q, t)$

Find, for each query $q$ , the smallest time $t$ such that:

F(q, t) = true

If no such time exists, return a sentinel such as $T$ or $-1$ depending on the convention.

Key Idea

Each query keeps a search interval. During each round, all active queries are grouped by midpoint. The update sequence is replayed once, and every query whose midpoint is reached is checked against the current data structure state.

This is offline because the algorithm needs all queries before it starts. It can reorder and batch their checks.

Algorithm

offline_binary_search(queries, updates):
    Q = number of queries
    T = number of updates

    for each query q:
        low[q] = 0
        high[q] = T

    while some query has low[q] < high[q]:
        buckets = array of empty lists of length T

        for each query q:
            if low[q] < high[q]:
                mid = (low[q] + high[q]) // 2
                buckets[mid].append(q)

        reset data structure

        for t from 0 to T - 1:
            apply updates[t]

            for each query q in buckets[t]:
                if check(q):
                    high[q] = t
                else:
                    low[q] = t + 1

    return low

Example

Suppose edges are added to a graph over time. Each query asks when two vertices first become connected.

The predicate is:

F(q, t) = \text{vertices of } q \text{ are connected after updates } 0..t

Once two vertices are connected, they remain connected if edges are only added. Therefore the predicate is monotone.

Offline binary search can answer many such queries by replaying the edge additions once per round, rather than once per query.

Correctness

For each query, the algorithm maintains the invariant that its answer lies in the interval:

[low[q], high[q]]

At each round, the midpoint $mid$ is checked under the exact state after update $mid$ has been applied.

If check(q) is true, monotonicity implies the first true time lies at or before $mid$ , so the algorithm sets:

high[q] = mid

If check(q) is false, monotonicity implies every time at or before $mid$ is false, so the algorithm sets:

low[q] = mid + 1

The interval shrinks for every active query. When the interval has length zero, the remaining value is the earliest feasible time.

Complexity

Let:

$T$ be the number of updates
$Q$ be the number of queries
$C_U$ be the cost of applying one update
$C_Q$ be the cost of checking one query

Each round replays all updates and checks each active query once.

component	cost
rounds	$O(\log T)$
update replay	$O(T C_U \log T)$
query checks	$O(Q C_Q \log T)$
total	$O((T C_U + Q C_Q)\log T)$

Space:

O(T + Q)

Comparison

method	when used	cost shape
independent binary search	each query checked separately	repeated update replay per query
offline binary search	queries known in advance	shared update replay per round
online data structure	queries arrive live	requires dynamic maintenance

Offline binary search is useful when online handling is hard, but the monotone structure is clear.

When to Use

Use offline binary search when:

all queries are known before processing
each answer is a boundary in an ordered domain
the predicate is monotone
updates can be replayed from scratch
batching many midpoint checks saves work

Common examples include connectivity over time, threshold crossing, earliest prefix satisfying a constraint, and cumulative frequency queries.

Notes

The algorithm requires a resettable data structure.
If updates are reversible, divide and conquer on time may be better.
If queries must be answered immediately, this offline method does not apply.
The returned sentinel must be defined carefully when no feasible time exists.

Implementation Sketch

def offline_binary_search(queries, updates, reset, apply_update, check):
    qn = len(queries)
    t = len(updates)

    low = [0] * qn
    high = [t] * qn

    while True:
        active = False
        buckets = [[] for _ in range(t)]

        for qi in range(qn):
            if low[qi] < high[qi]:
                active = True
                mid = (low[qi] + high[qi]) // 2
                if mid < t:
                    buckets[mid].append(qi)

        if not active:
            break

        reset()

        for time, update in enumerate(updates):
            apply_update(update)

            for qi in buckets[time]:
                if check(queries[qi]):
                    high[qi] = time
                else:
                    low[qi] = time + 1

    return low

func OfflineBinarySearch[Q any, U any](
    queries []Q,
    updates []U,
    reset func(),
    applyUpdate func(U),
    check func(Q) bool,
) []int {
    qn := len(queries)
    t := len(updates)

    low := make([]int, qn)
    high := make([]int, qn)
    for i := range high {
        high[i] = t
    }

    for {
        active := false
        buckets := make([][]int, t)

        for qi := 0; qi < qn; qi++ {
            if low[qi] < high[qi] {
                active = true
                mid := (low[qi] + high[qi]) / 2
                if mid < t {
                    buckets[mid] = append(buckets[mid], qi)
                }
            }
        }

        if !active {
            break
        }

        reset()

        for time, update := range updates {
            applyUpdate(update)

            for _, qi := range buckets[time] {
                if check(queries[qi]) {
                    high[qi] = time
                } else {
                    low[qi] = time + 1
                }
            }
        }
    }

    return low
}