# Parallel Binary Search with DSU

# Parallel Binary Search with DSU

Parallel binary search with DSU answers many offline connectivity queries over a sequence of edge insertions. It finds the earliest time when each pair of vertices becomes connected.

The technique combines two ideas:

* parallel binary search over answer times
* disjoint set union for fast connectivity checks

It applies when edges are only added. Once two vertices become connected, they remain connected, so the predicate is monotone.

## Problem

Given:

* vertices $0$ to $n - 1$
* edge insertions $E_0, E_1, \dots, E_{T-1}$
* queries $(u, v)$

For each query, find the smallest time $t$ such that $u$ and $v$ are connected after applying edges:

$$
E_0, E_1, \dots, E_t
$$

If they never become connected, return $-1$.

## Key Idea

Each query has an answer interval $[l, r]$ over time.

During each round:

1. Put each active query into a bucket by midpoint
2. Reset the DSU
3. Replay edge insertions from left to right
4. At each time $t$, answer all queries whose midpoint is $t$
5. Shrink each query interval

The DSU is rebuilt once per round, not once per query.

## Algorithm

```id="pbs-dsu-1"
parallel_binary_search_dsu(n, edges, queries):
    T = length(edges)
    Q = length(queries)

    low[q] = 0 for every query q
    high[q] = T for every query q

    while some query has low[q] < high[q]:
        buckets = array of empty lists of length T

        for q from 0 to Q - 1:
            if low[q] < high[q]:
                mid = (low[q] + high[q]) // 2
                if mid < T:
                    buckets[mid].append(q)

        dsu.reset(n)

        for t from 0 to T - 1:
            dsu.union(edges[t].u, edges[t].v)

            for q in buckets[t]:
                u, v = queries[q]
                if dsu.find(u) == dsu.find(v):
                    high[q] = t
                else:
                    low[q] = t + 1

    answer[q] = low[q] if low[q] < T else -1
    return answer
```

## Example

Edges are inserted over time:

| time | edge     |
| ---- | -------- |
| 0    | $(0, 1)$ |
| 1    | $(2, 3)$ |
| 2    | $(1, 2)$ |
| 3    | $(4, 5)$ |

Query:

$$
(0, 3)
$$

Connectivity changes:

| time | connected? |
| ---- | ---------- |
| 0    | no         |
| 1    | no         |
| 2    | yes        |
| 3    | yes        |

The earliest connected time is $2$.

## Correctness

For each query $(u, v)$, define:

$$
F(t) = \text{$u$ and $v$ are connected after edges } 0..t
$$

Because edges are only added, connectivity can change only from false to true. It never changes from true back to false.

Therefore $F(t)$ is monotone.

Each query maintains an interval containing its earliest true time. If the DSU says the endpoints are connected at midpoint $m$, then the earliest true time is at most $m$, so the algorithm moves the right boundary to $m$.

If they are not connected at $m$, monotonicity implies they are not connected at any earlier time, so the algorithm moves the left boundary to $m + 1$.

When the interval collapses, the remaining time is the earliest connection time, or $T$ if no such time exists.

## Complexity

Let:

* $n$ be the number of vertices
* $T$ be the number of edge insertions
* $Q$ be the number of queries
* $\alpha(n)$ be the inverse Ackermann factor from DSU

Each round replays all edges and checks all active queries.

| component    |                        cost |
| ------------ | --------------------------: |
| rounds       |                 $O(\log T)$ |
| DSU unions   |     $O(T \alpha(n) \log T)$ |
| query checks |     $O(Q \alpha(n) \log T)$ |
| total        | $O((T + Q)\alpha(n)\log T)$ |
| space        |              $O(n + T + Q)$ |

## DSU Operations

```id="dsu-pseudo"
make_set(v):
    parent[v] = v
    size[v] = 1

find(v):
    while parent[v] != v:
        parent[v] = parent[parent[v]]
        v = parent[v]
    return v

union(a, b):
    ra = find(a)
    rb = find(b)
    if ra == rb:
        return

    if size[ra] < size[rb]:
        swap(ra, rb)

    parent[rb] = ra
    size[ra] += size[rb]
```

## When to Use

Use this method when:

* all queries are known beforehand
* updates are edge insertions only
* each query asks for earliest connectivity
* the graph grows monotonically
* rebuilding DSU per round is acceptable

## Notes

* This method does not directly support edge deletions.
* For deletions, use rollback DSU with divide and conquer on time.
* If only final connectivity matters, ordinary DSU is enough.
* If queries arrive online, this offline method is not suitable.

## Implementation

```id="py-pbs-dsu"
class DSU:
    def __init__(self, n):
        self.parent = list(range(n))
        self.size = [1] * n

    def find(self, x):
        while self.parent[x] != x:
            self.parent[x] = self.parent[self.parent[x]]
            x = self.parent[x]
        return x

    def union(self, a, b):
        ra = self.find(a)
        rb = self.find(b)
        if ra == rb:
            return
        if self.size[ra] < self.size[rb]:
            ra, rb = rb, ra
        self.parent[rb] = ra
        self.size[ra] += self.size[rb]

def parallel_binary_search_dsu(n, edges, queries):
    t = len(edges)
    q = len(queries)

    low = [0] * q
    high = [t] * q

    while True:
        active = False
        buckets = [[] for _ in range(t)]

        for i in range(q):
            if low[i] < high[i]:
                active = True
                mid = (low[i] + high[i]) // 2
                if mid < t:
                    buckets[mid].append(i)

        if not active:
            break

        dsu = DSU(n)

        for time, (a, b) in enumerate(edges):
            dsu.union(a, b)

            for qi in buckets[time]:
                u, v = queries[qi]
                if dsu.find(u) == dsu.find(v):
                    high[qi] = time
                else:
                    low[qi] = time + 1

    return [low[i] if low[i] < t else -1 for i in range(q)]
```

```id="go-pbs-dsu"
type DSU struct {
    parent []int
    size   []int
}

func NewDSU(n int) *DSU {
    parent := make([]int, n)
    size := make([]int, n)
    for i := 0; i < n; i++ {
        parent[i] = i
        size[i] = 1
    }
    return &DSU{parent: parent, size: size}
}

func (d *DSU) Find(x int) int {
    for d.parent[x] != x {
        d.parent[x] = d.parent[d.parent[x]]
        x = d.parent[x]
    }
    return x
}

func (d *DSU) Union(a, b int) {
    ra := d.Find(a)
    rb := d.Find(b)
    if ra == rb {
        return
    }
    if d.size[ra] < d.size[rb] {
        ra, rb = rb, ra
    }
    d.parent[rb] = ra
    d.size[ra] += d.size[rb]
}

type Edge struct {
    U int
    V int
}

type Query struct {
    U int
    V int
}

func ParallelBinarySearchDSU(n int, edges []Edge, queries []Query) []int {
    t := len(edges)
    q := len(queries)

    low := make([]int, q)
    high := make([]int, q)
    for i := range high {
        high[i] = t
    }

    for {
        active := false
        buckets := make([][]int, t)

        for i := 0; i < q; i++ {
            if low[i] < high[i] {
                active = true
                mid := (low[i] + high[i]) / 2
                if mid < t {
                    buckets[mid] = append(buckets[mid], i)
                }
            }
        }

        if !active {
            break
        }

        dsu := NewDSU(n)

        for time, e := range edges {
            dsu.Union(e.U, e.V)

            for _, qi := range buckets[time] {
                query := queries[qi]
                if dsu.Find(query.U) == dsu.Find(query.V) {
                    high[qi] = time
                } else {
                    low[qi] = time + 1
                }
            }
        }
    }

    ans := make([]int, q)
    for i := 0; i < q; i++ {
        if low[i] < t {
            ans[i] = low[i]
        } else {
            ans[i] = -1
        }
    }

    return ans
}
```

