11.10 Offline Connectivity

Suppose you are given a graph and thousands or millions of connectivity queries: ```text Are vertices u and v connected?

11.10 Offline Connectivity

Problem

Suppose you are given a graph and thousands or millions of connectivity queries:

Are vertices u and v connected?

A straightforward approach runs DFS or BFS for every query.

Example:

Query 1: connected(A,D)?
Query 2: connected(B,F)?
Query 3: connected(C,E)?
...

If the graph contains:

$$ V = 100,000 $$

vertices and:

$$ Q = 1,000,000 $$

queries, repeatedly traversing the graph becomes extremely expensive.

In many situations, all queries are known before processing begins. Such problems are called offline problems.

When queries are available in advance, union-find can answer them much more efficiently.

Solution

The key observation is simple:

Connectivity only depends on connected components.

If we build the connected components once, every query reduces to comparing two roots.

Instead of performing graph traversal for every query:

DFS
DFS
DFS
DFS
...

we build components once:

Union-Find

and answer queries using:

find(u)
find(v)

Static Connectivity

Consider:

A ----- B ----- C

D ----- E

Queries:

connected(A,C)
connected(A,D)
connected(D,E)

Build components:

{A,B,C}
{D,E}

Using DSU:

union(A,B)
union(B,C)

union(D,E)

Now answer queries.

Query 1

find(A) == find(C)

Result:

true

Query 2

find(A) == find(D)

Result:

false

Query 3

find(D) == find(E)

Result:

true

No graph traversal is required.

Building Components

Given:

Vertices
Edges

Process every edge:

union(u,v)

Example:

A-B
B-C
D-E

Execution:

union(A,B)
union(B,C)
union(D,E)

Resulting components:

{A,B,C}
{D,E}

Every future connectivity query becomes:

find(u) == find(v)

Complexity Improvement

For every query:

DFS/BFS

Cost:

$$ O(V + E) $$

For:

$$ Q $$

queries:

$$ O(Q(V+E)) $$

DSU Approach

Build components:

$$ O(E\alpha(V)) $$

Answer each query:

$$ O(\alpha(V)) $$

Total:

$$ O((E+Q)\alpha(V)) $$

The improvement can be enormous.

Example

Graph:

1 -- 2 -- 3

4 -- 5

6

Build:

union(1,2)
union(2,3)

union(4,5)

Components:

{1,2,3}
{4,5}
{6}

Queries:

Query Answer
1,3 Yes
1,4 No
4,5 Yes
3,6 No
6,6 Yes

Each answer requires only root comparison.

Offline Edge Addition

A common problem provides:

Add edge
Connectivity query
Add edge
Connectivity query
...

Example:

add(A,B)
add(B,C)
query(A,C)

add(D,E)
query(A,E)

DSU handles this naturally.

Step 1

union(A,B)

Step 2

union(B,C)

Query

find(A)==find(C)

Result:

true

Step 3

union(D,E)

Query

find(A)==find(E)

Result:

false

This is sometimes called incremental connectivity.

Connectivity by Edge Weight

Suppose edges have weights:

Edge Weight
A-B 2
B-C 4
C-D 7
D-E 10

Queries:

Are A and D connected using edges ≤ 7?

Offline processing becomes very powerful.

Sort Edges

2
4
7
10

Sort Queries by Threshold

≤ 4
≤ 7
≤ 10

Process edges incrementally.

When threshold reaches:

7

Union:

(A,B)
(B,C)
(C,D)

Now:

find(A)==find(D)

is true.

This technique appears frequently in competitive programming and graph analytics.

Kruskal-Like Offline Processing

Many offline problems follow the same pattern:

  1. Sort events.
  2. Process in order.
  3. Use union-find to merge components.
  4. Answer queries using root comparisons.

This resembles Kruskal's algorithm.

In fact, many offline connectivity algorithms can be viewed as variations of Kruskal.

Example: Earliest Connection Time

Suppose roads open over time.

Time Road
1 A-B
3 B-C
7 C-D
9 D-E

Question:

When do A and D become connected?

Process chronologically:

Time 1

A-B

Not connected.

Time 3

A-B-C

Not connected.

Time 7

A-B-C-D

Connected.

Answer:

7

DSU makes this simulation extremely efficient.

Batch Query Processing

Suppose:

Graph: 1,000,000 edges
Queries: 10,000,000

Repeated DFS becomes impossible.

DSU performs:

Build once
Answer millions of queries

efficiently.

This is why large-scale graph systems often rely heavily on union-find.

Limitations

Union-find can answer:

Connected?
Same component?

very efficiently.

However it cannot answer:

Shortest path?
Distance?
Number of edges on path?

Example:

A -- B -- C

and:

A -------- C

produce the same connectivity result.

DSU ignores graph structure beyond component membership.

Dynamic Edge Deletion

Consider:

add(A,B)
add(B,C)
remove(B,C)

Standard union-find struggles here.

Why?

Because union operations merge components permanently.

Removing edges may require splitting components, which DSU cannot efficiently support.

This limitation motivates dynamic connectivity structures discussed later in the chapter.

Common Mistakes

Rebuilding Components for Every Query

Wrong:

DFS
DFS
DFS

Correct:

Build DSU once
Answer queries

Forgetting Path Compression

Large query volumes require optimized DSU.

Using DSU for Shortest Paths

Connectivity and shortest paths are different problems.

Ignoring Offline Opportunities

Many expensive online problems become simple after sorting events and processing offline.

Real-World Applications

Social Networks

Determining whether users belong to the same connected community.

Communication Systems

Connectivity between network nodes.

Geographic Information Systems

Regional connectivity analysis.

Infrastructure Planning

Road and utility network reachability.

Competitive Programming

Threshold connectivity and dynamic query problems.

Large Graph Analytics

Component computation at massive scale.

Complexity

Build Components

$$ O(E\alpha(V)) $$

Connectivity Query

$$ O(\alpha(V)) $$

Total

$$ O((E+Q)\alpha(V)) $$

For practical purposes:

Nearly linear

in the size of the input.

Recipe

When all connectivity queries are known in advance:

  1. Build a union-find structure.
  2. Process every edge using union().
  3. Compress paths during finds.
  4. Answer each query using root comparison.
  5. Sort events when thresholds or timestamps are involved.
  6. Exploit offline processing whenever possible.

Offline connectivity is one of the most important applications of union-find. It transforms large batches of graph queries from repeated graph traversals into simple root comparisons. The next section extends these ideas further by studying dynamic connectivity, where the graph itself changes over time and components must be maintained efficiently as edges are added and removed.