A clear explanation of solving Largest Component Size by Common Factor using prime factorization and union find.
Problem Restatement
We are given an array nums of unique positive integers.
Each number is a node in a graph.
There is an undirected edge between two numbers if they share a common factor greater than 1.
Return the size of the largest connected component in this graph.
For example, 6 and 15 are connected because they share factor 3.
The constraints are:
| Constraint | Value |
|---|---|
nums.length | 1 <= nums.length <= 2 * 10^4 |
nums[i] | 1 <= nums[i] <= 10^5 |
| Values | All values in nums are unique |
Source: LeetCode problem statement.
Input and Output
| Item | Meaning |
|---|---|
| Input | An array nums of unique positive integers |
| Output | The size of the largest connected component |
| Edge rule | Two numbers are connected if their greatest common divisor is greater than 1 |
Example function shape:
def largestComponentSize(nums: list[int]) -> int:
...Examples
Example 1:
nums = [4, 6, 15, 35]Output:
4Explanation:
4 connects to 6 through factor 2.
6 connects to 15 through factor 3.
15 connects to 35 through factor 5.
So all four numbers belong to one connected component.
Example 2:
nums = [20, 50, 9, 63]Output:
2Explanation:
20 and 50 are connected.
9 and 63 are connected.
There are two components of size 2.
Example 3:
nums = [2, 3, 6, 7, 4, 12, 21, 39]Output:
8Every number is connected through a chain of shared factors, so the largest component contains all 8 numbers.
First Thought: Compare Every Pair
A direct graph-building solution would compare every pair of numbers.
For each pair (nums[i], nums[j]), compute:
gcd(nums[i], nums[j])If the result is greater than 1, connect them.
Code idea:
for i in range(n):
for j in range(i + 1, n):
if gcd(nums[i], nums[j]) > 1:
union(i, j)This is simple, but too slow.
There can be up to 2 * 10^4 numbers, so pair comparison gives about:
O(n^2)That is too many checks.
Key Insight
Two numbers are connected if they share a factor greater than 1.
Instead of comparing numbers against each other, connect each number to its prime factors.
For example:
12 = 2 * 2 * 3
18 = 2 * 3 * 3Both contain prime factor 2.
Both also contain prime factor 3.
So they must belong to the same component.
We can use union find:
number <-> prime factorIf two numbers share the same prime factor, they get connected through that factor.
Example:
4 -> 2
6 -> 2, 3
15 -> 3, 5
35 -> 5, 7This creates one chain:
4 -- 2 -- 6 -- 3 -- 15 -- 5 -- 35So the component size is 4.
Algorithm
Use disjoint set union over values from 0 to max(nums).
For each number x in nums:
- Factorize
x. - For every prime factor
pofx, unionxwithp.
After processing all numbers:
- Find the root of each original number.
- Count how many original numbers belong to each root.
- Return the largest count.
We count only original numbers from nums, not factor nodes.
Factorization
To factorize x, test divisors from 2 while d * d <= x.
If d divides x, then d is a prime factor after repeated division.
We add d, then divide out all copies of d.
After the loop, if the remaining value is greater than 1, it is also a prime factor.
Example:
x = 84Prime factorization:
84 = 2 * 2 * 3 * 7The unique prime factors are:
[2, 3, 7]We only need unique prime factors. Repeated factors do not change connectivity.
Correctness
We need to show that the algorithm returns the size of the largest connected component in the graph.
The original graph connects two numbers if they share a common factor greater than 1.
Any common factor greater than 1 has at least one prime factor. Therefore, if two numbers share any factor greater than 1, they also share some prime factor.
During the algorithm, each number is unioned with each of its prime factors. So if two numbers share a prime factor p, both numbers are unioned with p. Therefore, they end up in the same union find component.
This proves every edge in the original graph is represented by the union find structure.
Now consider the reverse direction. If two numbers become connected in union find, then there is a chain alternating between numbers and prime factors. Each step from a number to a prime factor means that the prime factor divides the number. Therefore, each adjacent pair of numbers in the implied chain shares a prime factor, which is a common factor greater than 1.
So union find connectivity matches graph connectivity.
Finally, the algorithm counts only the original numbers in nums by their union find root. Since each union find group corresponds exactly to one connected component of the graph, the largest count is the size of the largest connected component.
Complexity
Let:
n = len(nums)
m = max(nums)| Metric | Value | Why |
|---|---|---|
| Time | O(n * sqrt(m) * alpha(m)) | Each number is factorized by trial division, and each union/find is nearly constant |
| Space | O(m) | Union find arrays are sized up to max(nums) |
Here, alpha(m) is the inverse Ackermann function. For practical input sizes, it behaves like a very small constant.
Since nums[i] <= 10^5, an O(max(nums)) union find array is acceptable.
Implementation
from collections import Counter
class DSU:
def __init__(self, n: int):
self.parent = list(range(n))
self.size = [1] * n
def find(self, x: int) -> int:
while self.parent[x] != x:
self.parent[x] = self.parent[self.parent[x]]
x = self.parent[x]
return x
def union(self, a: int, b: int) -> None:
root_a = self.find(a)
root_b = self.find(b)
if root_a == root_b:
return
if self.size[root_a] < self.size[root_b]:
root_a, root_b = root_b, root_a
self.parent[root_b] = root_a
self.size[root_a] += self.size[root_b]
class Solution:
def largestComponentSize(self, nums: list[int]) -> int:
max_num = max(nums)
dsu = DSU(max_num + 1)
for x in nums:
value = x
factor = 2
while factor * factor <= value:
if value % factor == 0:
dsu.union(x, factor)
while value % factor == 0:
value //= factor
factor += 1
if value > 1:
dsu.union(x, value)
count = Counter()
for x in nums:
root = dsu.find(x)
count[root] += 1
return max(count.values())Code Explanation
We create a union find structure large enough to contain every number and every possible factor:
max_num = max(nums)
dsu = DSU(max_num + 1)For each number, we factorize it:
for x in nums:
value = x
factor = 2The variable value is a working copy. We divide it during factorization while keeping x unchanged for union operations.
When we find a factor:
if value % factor == 0:
dsu.union(x, factor)This connects the original number to that factor.
Then we remove all repeated copies:
while value % factor == 0:
value //= factorFor example, if x = 72, we only need to union with 2 once, even though 72 contains multiple copies of 2.
After the loop, if value > 1, the remaining value is prime:
if value > 1:
dsu.union(x, value)Then we count component sizes among the original input numbers:
count = Counter()
for x in nums:
root = dsu.find(x)
count[root] += 1The factor nodes are only connectors. They are not counted as graph nodes.
Finally:
return max(count.values())This returns the largest connected component size.
Testing
def run_tests():
s = Solution()
assert s.largestComponentSize([4, 6, 15, 35]) == 4
assert s.largestComponentSize([20, 50, 9, 63]) == 2
assert s.largestComponentSize([2, 3, 6, 7, 4, 12, 21, 39]) == 8
assert s.largestComponentSize([1]) == 1
assert s.largestComponentSize([2, 3, 5, 7]) == 1
assert s.largestComponentSize([6, 10, 15]) == 3
assert s.largestComponentSize([14, 28, 49, 35]) == 4
print("all tests passed")
run_tests()Test meaning:
| Test | Why |
|---|---|
[4, 6, 15, 35] | All numbers connect through factor chain |
[20, 50, 9, 63] | Two separate components of size 2 |
[2, 3, 6, 7, 4, 12, 21, 39] | Full graph becomes one component |
[1] | Minimum input |
| Prime-only list | No edges between distinct primes |
[6, 10, 15] | All connected pairwise through different factors |
[14, 28, 49, 35] | Chain connection through factor 7 |