11.10 Offline Connectivity
Suppose you are given a graph and thousands or millions of connectivity queries: ```text Are vertices u and v connected?
11.10 Offline Connectivity
Problem
Suppose you are given a graph and thousands or millions of connectivity queries:
Are vertices u and v connected?
A straightforward approach runs DFS or BFS for every query.
Example:
Query 1: connected(A,D)?
Query 2: connected(B,F)?
Query 3: connected(C,E)?
...
If the graph contains:
$$ V = 100,000 $$
vertices and:
$$ Q = 1,000,000 $$
queries, repeatedly traversing the graph becomes extremely expensive.
In many situations, all queries are known before processing begins. Such problems are called offline problems.
When queries are available in advance, union-find can answer them much more efficiently.
Solution
The key observation is simple:
Connectivity only depends on connected components.
If we build the connected components once, every query reduces to comparing two roots.
Instead of performing graph traversal for every query:
DFS
DFS
DFS
DFS
...
we build components once:
Union-Find
and answer queries using:
find(u)
find(v)
Static Connectivity
Consider:
A ----- B ----- C
D ----- E
Queries:
connected(A,C)
connected(A,D)
connected(D,E)
Build components:
{A,B,C}
{D,E}
Using DSU:
union(A,B)
union(B,C)
union(D,E)
Now answer queries.
Query 1
find(A) == find(C)
Result:
true
Query 2
find(A) == find(D)
Result:
false
Query 3
find(D) == find(E)
Result:
true
No graph traversal is required.
Building Components
Given:
Vertices
Edges
Process every edge:
union(u,v)
Example:
A-B
B-C
D-E
Execution:
union(A,B)
union(B,C)
union(D,E)
Resulting components:
{A,B,C}
{D,E}
Every future connectivity query becomes:
find(u) == find(v)
Complexity Improvement
Naïve Approach
For every query:
DFS/BFS
Cost:
$$ O(V + E) $$
For:
$$ Q $$
queries:
$$ O(Q(V+E)) $$
DSU Approach
Build components:
$$ O(E\alpha(V)) $$
Answer each query:
$$ O(\alpha(V)) $$
Total:
$$ O((E+Q)\alpha(V)) $$
The improvement can be enormous.
Example
Graph:
1 -- 2 -- 3
4 -- 5
6
Build:
union(1,2)
union(2,3)
union(4,5)
Components:
{1,2,3}
{4,5}
{6}
Queries:
| Query | Answer |
|---|---|
| 1,3 | Yes |
| 1,4 | No |
| 4,5 | Yes |
| 3,6 | No |
| 6,6 | Yes |
Each answer requires only root comparison.
Offline Edge Addition
A common problem provides:
Add edge
Connectivity query
Add edge
Connectivity query
...
Example:
add(A,B)
add(B,C)
query(A,C)
add(D,E)
query(A,E)
DSU handles this naturally.
Step 1
union(A,B)
Step 2
union(B,C)
Query
find(A)==find(C)
Result:
true
Step 3
union(D,E)
Query
find(A)==find(E)
Result:
false
This is sometimes called incremental connectivity.
Connectivity by Edge Weight
Suppose edges have weights:
| Edge | Weight |
|---|---|
| A-B | 2 |
| B-C | 4 |
| C-D | 7 |
| D-E | 10 |
Queries:
Are A and D connected using edges ≤ 7?
Offline processing becomes very powerful.
Sort Edges
2
4
7
10
Sort Queries by Threshold
≤ 4
≤ 7
≤ 10
Process edges incrementally.
When threshold reaches:
7
Union:
(A,B)
(B,C)
(C,D)
Now:
find(A)==find(D)
is true.
This technique appears frequently in competitive programming and graph analytics.
Kruskal-Like Offline Processing
Many offline problems follow the same pattern:
- Sort events.
- Process in order.
- Use union-find to merge components.
- Answer queries using root comparisons.
This resembles Kruskal's algorithm.
In fact, many offline connectivity algorithms can be viewed as variations of Kruskal.
Example: Earliest Connection Time
Suppose roads open over time.
| Time | Road |
|---|---|
| 1 | A-B |
| 3 | B-C |
| 7 | C-D |
| 9 | D-E |
Question:
When do A and D become connected?
Process chronologically:
Time 1
A-B
Not connected.
Time 3
A-B-C
Not connected.
Time 7
A-B-C-D
Connected.
Answer:
7
DSU makes this simulation extremely efficient.
Batch Query Processing
Suppose:
Graph: 1,000,000 edges
Queries: 10,000,000
Repeated DFS becomes impossible.
DSU performs:
Build once
Answer millions of queries
efficiently.
This is why large-scale graph systems often rely heavily on union-find.
Limitations
Union-find can answer:
Connected?
Same component?
very efficiently.
However it cannot answer:
Shortest path?
Distance?
Number of edges on path?
Example:
A -- B -- C
and:
A -------- C
produce the same connectivity result.
DSU ignores graph structure beyond component membership.
Dynamic Edge Deletion
Consider:
add(A,B)
add(B,C)
remove(B,C)
Standard union-find struggles here.
Why?
Because union operations merge components permanently.
Removing edges may require splitting components, which DSU cannot efficiently support.
This limitation motivates dynamic connectivity structures discussed later in the chapter.
Common Mistakes
Rebuilding Components for Every Query
Wrong:
DFS
DFS
DFS
Correct:
Build DSU once
Answer queries
Forgetting Path Compression
Large query volumes require optimized DSU.
Using DSU for Shortest Paths
Connectivity and shortest paths are different problems.
Ignoring Offline Opportunities
Many expensive online problems become simple after sorting events and processing offline.
Real-World Applications
Social Networks
Determining whether users belong to the same connected community.
Communication Systems
Connectivity between network nodes.
Geographic Information Systems
Regional connectivity analysis.
Infrastructure Planning
Road and utility network reachability.
Competitive Programming
Threshold connectivity and dynamic query problems.
Large Graph Analytics
Component computation at massive scale.
Complexity
Build Components
$$ O(E\alpha(V)) $$
Connectivity Query
$$ O(\alpha(V)) $$
Total
$$ O((E+Q)\alpha(V)) $$
For practical purposes:
Nearly linear
in the size of the input.
Recipe
When all connectivity queries are known in advance:
- Build a union-find structure.
- Process every edge using
union(). - Compress paths during finds.
- Answer each query using root comparison.
- Sort events when thresholds or timestamps are involved.
- Exploit offline processing whenever possible.
Offline connectivity is one of the most important applications of union-find. It transforms large batches of graph queries from repeated graph traversals into simple root comparisons. The next section extends these ideas further by studying dynamic connectivity, where the graph itself changes over time and components must be maintained efficiently as edges are added and removed.