Answer offline connectivity queries by combining parallel binary search with a disjoint set union.
Parallel binary search with DSU answers many offline connectivity queries over a sequence of edge insertions. It finds the earliest time when each pair of vertices becomes connected.
The technique combines two ideas:
- parallel binary search over answer times
- disjoint set union for fast connectivity checks
It applies when edges are only added. Once two vertices become connected, they remain connected, so the predicate is monotone.
Problem
Given:
- vertices to
- edge insertions
- queries
For each query, find the smallest time such that and are connected after applying edges:
If they never become connected, return .
Key Idea
Each query has an answer interval over time.
During each round:
- Put each active query into a bucket by midpoint
- Reset the DSU
- Replay edge insertions from left to right
- At each time , answer all queries whose midpoint is
- Shrink each query interval
The DSU is rebuilt once per round, not once per query.
Algorithm
parallel_binary_search_dsu(n, edges, queries):
T = length(edges)
Q = length(queries)
low[q] = 0 for every query q
high[q] = T for every query q
while some query has low[q] < high[q]:
buckets = array of empty lists of length T
for q from 0 to Q - 1:
if low[q] < high[q]:
mid = (low[q] + high[q]) // 2
if mid < T:
buckets[mid].append(q)
dsu.reset(n)
for t from 0 to T - 1:
dsu.union(edges[t].u, edges[t].v)
for q in buckets[t]:
u, v = queries[q]
if dsu.find(u) == dsu.find(v):
high[q] = t
else:
low[q] = t + 1
answer[q] = low[q] if low[q] < T else -1
return answerExample
Edges are inserted over time:
| time | edge |
|---|---|
| 0 | |
| 1 | |
| 2 | |
| 3 |
Query:
Connectivity changes:
| time | connected? |
|---|---|
| 0 | no |
| 1 | no |
| 2 | yes |
| 3 | yes |
The earliest connected time is .
Correctness
For each query , define:
Because edges are only added, connectivity can change only from false to true. It never changes from true back to false.
Therefore is monotone.
Each query maintains an interval containing its earliest true time. If the DSU says the endpoints are connected at midpoint , then the earliest true time is at most , so the algorithm moves the right boundary to .
If they are not connected at , monotonicity implies they are not connected at any earlier time, so the algorithm moves the left boundary to .
When the interval collapses, the remaining time is the earliest connection time, or if no such time exists.
Complexity
Let:
- be the number of vertices
- be the number of edge insertions
- be the number of queries
- be the inverse Ackermann factor from DSU
Each round replays all edges and checks all active queries.
| component | cost |
|---|---|
| rounds | |
| DSU unions | |
| query checks | |
| total | |
| space |
DSU Operations
make_set(v):
parent[v] = v
size[v] = 1
find(v):
while parent[v] != v:
parent[v] = parent[parent[v]]
v = parent[v]
return v
union(a, b):
ra = find(a)
rb = find(b)
if ra == rb:
return
if size[ra] < size[rb]:
swap(ra, rb)
parent[rb] = ra
size[ra] += size[rb]When to Use
Use this method when:
- all queries are known beforehand
- updates are edge insertions only
- each query asks for earliest connectivity
- the graph grows monotonically
- rebuilding DSU per round is acceptable
Notes
- This method does not directly support edge deletions.
- For deletions, use rollback DSU with divide and conquer on time.
- If only final connectivity matters, ordinary DSU is enough.
- If queries arrive online, this offline method is not suitable.
Implementation
class DSU:
def __init__(self, n):
self.parent = list(range(n))
self.size = [1] * n
def find(self, x):
while self.parent[x] != x:
self.parent[x] = self.parent[self.parent[x]]
x = self.parent[x]
return x
def union(self, a, b):
ra = self.find(a)
rb = self.find(b)
if ra == rb:
return
if self.size[ra] < self.size[rb]:
ra, rb = rb, ra
self.parent[rb] = ra
self.size[ra] += self.size[rb]
def parallel_binary_search_dsu(n, edges, queries):
t = len(edges)
q = len(queries)
low = [0] * q
high = [t] * q
while True:
active = False
buckets = [[] for _ in range(t)]
for i in range(q):
if low[i] < high[i]:
active = True
mid = (low[i] + high[i]) // 2
if mid < t:
buckets[mid].append(i)
if not active:
break
dsu = DSU(n)
for time, (a, b) in enumerate(edges):
dsu.union(a, b)
for qi in buckets[time]:
u, v = queries[qi]
if dsu.find(u) == dsu.find(v):
high[qi] = time
else:
low[qi] = time + 1
return [low[i] if low[i] < t else -1 for i in range(q)]type DSU struct {
parent []int
size []int
}
func NewDSU(n int) *DSU {
parent := make([]int, n)
size := make([]int, n)
for i := 0; i < n; i++ {
parent[i] = i
size[i] = 1
}
return &DSU{parent: parent, size: size}
}
func (d *DSU) Find(x int) int {
for d.parent[x] != x {
d.parent[x] = d.parent[d.parent[x]]
x = d.parent[x]
}
return x
}
func (d *DSU) Union(a, b int) {
ra := d.Find(a)
rb := d.Find(b)
if ra == rb {
return
}
if d.size[ra] < d.size[rb] {
ra, rb = rb, ra
}
d.parent[rb] = ra
d.size[ra] += d.size[rb]
}
type Edge struct {
U int
V int
}
type Query struct {
U int
V int
}
func ParallelBinarySearchDSU(n int, edges []Edge, queries []Query) []int {
t := len(edges)
q := len(queries)
low := make([]int, q)
high := make([]int, q)
for i := range high {
high[i] = t
}
for {
active := false
buckets := make([][]int, t)
for i := 0; i < q; i++ {
if low[i] < high[i] {
active = true
mid := (low[i] + high[i]) / 2
if mid < t {
buckets[mid] = append(buckets[mid], i)
}
}
}
if !active {
break
}
dsu := NewDSU(n)
for time, e := range edges {
dsu.Union(e.U, e.V)
for _, qi := range buckets[time] {
query := queries[qi]
if dsu.Find(query.U) == dsu.Find(query.V) {
high[qi] = time
} else {
low[qi] = time + 1
}
}
}
}
ans := make([]int, q)
for i := 0; i < q; i++ {
if low[i] < t {
ans[i] = low[i]
} else {
ans[i] = -1
}
}
return ans
}