Find the extra edge in an undirected graph that creates a cycle using Union-Find.
Problem Restatement
We are given an undirected graph.
The graph originally started as a tree with n nodes:
| Property | Meaning |
|---|---|
| Connected | Every node can be reached |
| No cycles | Exactly one path between any two nodes |
Then one extra edge was added.
The input edges contains all edges of the resulting graph.
We must return the edge that can be removed so the graph becomes a tree again.
If multiple answers are possible, return the one that appears last in the input. The official problem states that the graph started as a tree and then had one additional edge added between two different nodes. (leetcode.com)
Input and Output
| Item | Meaning |
|---|---|
| Input | edges, a list of undirected edges |
| Output | The redundant edge |
| Node labels | 1 through n |
| Graph type | Undirected |
| Extra edge count | Exactly one |
Example function shape:
def findRedundantConnection(edges: list[list[int]]) -> list[int]:
...Examples
Example 1:
edges = [[1,2],[1,3],[2,3]]Graph:
1
| \
| \
2---3The edge:
[2, 3]creates a cycle.
Answer:
[2, 3]Example 2:
edges = [[1,2],[2,3],[3,4],[1,4],[1,5]]Graph:
1---2
| |
5 3
\ |
4The edge:
[1, 4]creates the cycle:
1 -> 2 -> 3 -> 4 -> 1Answer:
[1, 4]First Thought: Detect the Cycle
The redundant edge is exactly the edge that creates a cycle.
Suppose we process edges one by one.
Before adding an edge (u, v):
- if
uandvare already connected, - then adding this edge creates a cycle.
That edge is redundant.
So the problem becomes:
How do we efficiently check whether two nodes are already connected?A graph traversal like DFS or BFS could work, but there is a cleaner structure designed for exactly this problem:
Union-FindKey Insight
Union-Find, also called Disjoint Set Union (DSU), maintains groups of connected nodes.
It supports two operations:
| Operation | Meaning |
|---|---|
find(x) | Return the representative of node x’s connected component |
union(a, b) | Merge the components of a and b |
If two nodes already have the same representative:
find(u) == find(v)then they are already connected.
Adding an edge between them would create a cycle.
That edge is the answer.
Union-Find Structure
We maintain a parent array:
parent[i]Initially, every node is its own parent:
1 2 3 4
| | | |
1 2 3 4Each node starts in its own component.
When we connect two components, we merge their roots.
Algorithm
- Create a Union-Find structure for nodes
1throughn. - Process each edge
[u, v]:- If
find(u) == find(v), return[u, v]. - Otherwise, union their components.
- If
- Since the problem guarantees one redundant edge, we will always return an answer.
Walkthrough
Consider:
edges = [[1,2],[1,3],[2,3]]Initially:
| Node | Parent |
|---|---|
| 1 | 1 |
| 2 | 2 |
| 3 | 3 |
Process [1,2]:
1and2are in different components.- Union them.
Now:
1 -- 2
3Process [1,3]:
1and3are different.- Union them.
Now:
1 -- 2
|
3Process [2,3]:
2and3are already connected through1.- Adding
[2,3]creates a cycle.
Return:
[2, 3]Path Compression
A basic Union-Find works, but repeated parent traversals can become long.
Path compression improves performance.
Suppose:
1 -> 2 -> 3 -> 4Calling:
find(1)walks all the way to 4.
After path compression, we rewrite the chain:
1 -> 4
2 -> 4
3 -> 4
4 -> 4Future lookups become much faster.
The compressed find function is:
def find(x):
if parent[x] != x:
parent[x] = find(parent[x])
return parent[x]Correctness
Initially, every node is in its own connected component.
The Union-Find structure maintains the invariant that two nodes belong to the same component exactly when there exists a path between them using previously processed edges.
When processing an edge [u, v], there are two cases.
If find(u) != find(v), then u and v are in different connected components. Adding the edge connects these two components without creating a cycle, so the algorithm unions them.
If find(u) == find(v), then there is already a path between u and v using earlier edges. Adding another edge directly between them creates a cycle. Therefore, [u, v] is redundant.
The algorithm returns the first edge that creates a cycle while scanning from left to right. Since the graph started as a tree plus one extra edge, exactly one edge creates a cycle, and the returned edge is the correct answer.
If multiple removable edges exist within the cycle, the problem asks for the one appearing last in the input. The first edge detected by Union-Find as redundant while processing the final graph edges in order is exactly that edge.
Therefore, the algorithm is correct.
Complexity
Let n be the number of nodes.
Union-Find with path compression has nearly constant-time operations.
| Metric | Value | Why |
|---|---|---|
| Time | O(n * α(n)) | Each find and union is almost constant |
| Space | O(n) | Parent array |
α(n) is the inverse Ackermann function, which grows extremely slowly.
In practice, this behaves like:
O(n)Implementation
class Solution:
def findRedundantConnection(self, edges: list[list[int]]) -> list[int]:
n = len(edges)
parent = list(range(n + 1))
def find(x: int) -> int:
if parent[x] != x:
parent[x] = find(parent[x])
return parent[x]
def union(a: int, b: int) -> bool:
root_a = find(a)
root_b = find(b)
if root_a == root_b:
return False
parent[root_a] = root_b
return True
for u, v in edges:
if not union(u, v):
return [u, v]
return []Code Explanation
We create the parent array:
parent = list(range(n + 1))Each node initially points to itself.
The find function returns the root representative:
def find(x: int) -> int:If the node is not its own parent:
if parent[x] != x:we recursively move upward and compress the path:
parent[x] = find(parent[x])The union function tries to merge two components:
root_a = find(a)
root_b = find(b)If the roots are already equal:
if root_a == root_b:
return Falsethen adding this edge would create a cycle.
Otherwise:
parent[root_a] = root_bmerges the two components.
In the main loop:
for u, v in edges:we try to union every edge.
The first failed union is the redundant edge:
if not union(u, v):
return [u, v]Testing
def run_tests():
s = Solution()
assert s.findRedundantConnection(
[[1,2],[1,3],[2,3]]
) == [2,3]
assert s.findRedundantConnection(
[[1,2],[2,3],[3,4],[1,4],[1,5]]
) == [1,4]
assert s.findRedundantConnection(
[[1,2],[2,3],[3,1]]
) == [3,1]
assert s.findRedundantConnection(
[[1,2],[2,3],[3,4],[4,5],[2,5]]
) == [2,5]
print("all tests passed")
run_tests()Test meaning:
| Test | Expected | Why |
|---|---|---|
[[1,2],[1,3],[2,3]] | [2,3] | Small triangle cycle |
[[1,2],[2,3],[3,4],[1,4],[1,5]] | [1,4] | Official example |
[[1,2],[2,3],[3,1]] | [3,1] | Three-node cycle |
[[1,2],[2,3],[3,4],[4,5],[2,5]] | [2,5] | Larger chain with closing edge |