A clear explanation of finding all elements that appear more than n/3 times using the extended Boyer-Moore voting algorithm.
Problem Restatement
We are given an integer array nums of size n.
We need to return all elements that appear more than:
floor(n / 3)times.
The answer can be returned in any order.
LeetCode examples include nums = [3,2,3], which returns [3], and nums = [1,2], which returns [1,2]. The constraints are 1 <= nums.length <= 5 * 10^4 and -10^9 <= nums[i] <= 10^9. The follow-up asks for linear time and O(1) space.
Input and Output
| Item | Meaning |
|---|---|
| Input | Integer array nums |
| Output | All values appearing more than floor(n / 3) times |
| Return order | Any order is accepted |
| Follow-up target | O(n) time and O(1) extra space |
Function shape:
class Solution:
def majorityElement(self, nums: list[int]) -> list[int]:
...Examples
Example 1:
Input: nums = [3,2,3]
Output: [3]Here:
n = 3
floor(n / 3) = 1The number 3 appears 2 times.
Since 2 > 1, 3 belongs in the answer.
Example 2:
Input: nums = [1]
Output: [1]Here:
n = 1
floor(n / 3) = 0The number 1 appears once.
Since 1 > 0, it belongs in the answer.
Example 3:
Input: nums = [1,2]
Output: [1,2]Here:
n = 2
floor(n / 3) = 0Both 1 and 2 appear once.
Since 1 > 0, both values belong in the answer.
Example 4:
Input: nums = [1,1,1,3,3,2,2,2]
Output: [1,2]Here:
n = 8
floor(n / 3) = 2The number 1 appears 3 times.
The number 2 appears 3 times.
Both are greater than 2.
First Thought: Count Every Number
The simplest solution is to count the frequency of every number with a hash map.
Then return every number whose count is greater than n // 3.
class Solution:
def majorityElement(self, nums: list[int]) -> list[int]:
count = {}
for num in nums:
count[num] = count.get(num, 0) + 1
limit = len(nums) // 3
ans = []
for num, freq in count.items():
if freq > limit:
ans.append(num)
return ansThis is easy to understand and runs in linear time.
Problem With Counting
The hash map may store every distinct number.
For example:
nums = [1,2,3,4,5]The hash map stores all five values.
So the space complexity is:
O(n)The follow-up asks whether we can solve the problem in linear time and constant extra space. That means we should avoid storing all frequencies.
Key Insight
There can be at most two elements that appear more than floor(n / 3) times.
Why?
Suppose three different values all appear more than n / 3 times.
Then together, their total count would be more than:
n / 3 + n / 3 + n / 3 = nThat would require more than n elements, which is impossible.
So we only need to track two possible candidates.
This leads to the extended Boyer-Moore voting algorithm.
Algorithm
We use two candidates and two counters:
cand1 = None
cand2 = None
count1 = 0
count2 = 0The first pass finds possible candidates.
For each num:
- If
num == cand1, incrementcount1. - Else if
num == cand2, incrementcount2. - Else if
count1 == 0, setcand1 = numandcount1 = 1. - Else if
count2 == 0, setcand2 = numandcount2 = 1. - Else decrement both counters.
The decrement step means we found three different values:
cand1, cand2, numOne occurrence of each can be removed from consideration.
This does not destroy the answer, because any true majority element has enough extra occurrences to survive this cancellation process.
After the first pass, cand1 and cand2 are only possible answers.
They are not guaranteed answers.
So we need a second pass to count their real frequencies.
Correctness
The algorithm is based on cancellation.
Whenever we see a number that matches a candidate, we increase that candidate’s counter.
Whenever we see a new number while both candidate slots are occupied, we decrement both counters. Conceptually, this removes one occurrence of three different values from the array:
candidate 1
candidate 2
current numberRemoving one occurrence of three different values cannot remove all occurrences of a value that appears more than n / 3 times.
A true majority-over-third value has too many copies. After all possible cancellations, it must remain as one of the final candidates.
Since there can be at most two valid answers, tracking two candidates is enough.
The first pass only gives possible candidates, because some candidates may survive cancellation without actually appearing more than n / 3 times.
The second pass verifies the real frequencies.
Therefore, every returned value appears more than floor(n / 3) times, and every value that appears more than floor(n / 3) times is returned.
Complexity
| Metric | Value | Why |
|---|---|---|
| Time | O(n) | We scan the array twice |
| Space | O(1) | We store only two candidates and two counters |
The answer list has at most two elements.
Implementation
class Solution:
def majorityElement(self, nums: list[int]) -> list[int]:
cand1 = None
cand2 = None
count1 = 0
count2 = 0
for num in nums:
if cand1 == num:
count1 += 1
elif cand2 == num:
count2 += 1
elif count1 == 0:
cand1 = num
count1 = 1
elif count2 == 0:
cand2 = num
count2 = 1
else:
count1 -= 1
count2 -= 1
count1 = 0
count2 = 0
for num in nums:
if num == cand1:
count1 += 1
elif num == cand2:
count2 += 1
ans = []
limit = len(nums) // 3
if count1 > limit:
ans.append(cand1)
if count2 > limit:
ans.append(cand2)
return ansCode Explanation
We begin with two empty candidate slots:
cand1 = None
cand2 = None
count1 = 0
count2 = 0When the current number matches an existing candidate, we strengthen that candidate:
if cand1 == num:
count1 += 1
elif cand2 == num:
count2 += 1When a candidate slot has no active count, we fill it:
elif count1 == 0:
cand1 = num
count1 = 1
elif count2 == 0:
cand2 = num
count2 = 1If the number matches neither candidate and both counters are active, we cancel one vote from both candidates:
else:
count1 -= 1
count2 -= 1After the first pass, we reset the counts:
count1 = 0
count2 = 0Then we count the true frequency of the candidates:
for num in nums:
if num == cand1:
count1 += 1
elif num == cand2:
count2 += 1Finally, we only return candidates that really appear more than n // 3 times:
if count1 > limit:
ans.append(cand1)
if count2 > limit:
ans.append(cand2)The verification pass is required.
Without it, an array like:
[1,2,3,4]could leave arbitrary candidates even though no value appears more than floor(4 / 3) = 1 times.
Testing
def normalize(values):
return sorted(values)
def run_tests():
s = Solution()
assert normalize(s.majorityElement([3, 2, 3])) == [3]
assert normalize(s.majorityElement([1])) == [1]
assert normalize(s.majorityElement([1, 2])) == [1, 2]
assert normalize(s.majorityElement([1, 1, 1, 3, 3, 2, 2, 2])) == [1, 2]
assert normalize(s.majorityElement([1, 2, 3, 4])) == []
assert normalize(s.majorityElement([2, 2, 1, 3])) == []
assert normalize(s.majorityElement([-1, -1, -1, 2, 3])) == [-1]
print("all tests passed")
run_tests()| Test | Why |
|---|---|
[3,2,3] | Basic majority-over-third case |
[1] | Minimum length |
[1,2] | Both values qualify because threshold is zero |
[1,1,1,3,3,2,2,2] | Two valid answers |
[1,2,3,4] | No answer |
[2,2,1,3] | Exactly n // 3 occurrences do not qualify |
[-1,-1,-1,2,3] | Negative values work normally |