Skip to content

LeetCode 229: Majority Element II

A clear explanation of finding all elements that appear more than n/3 times using the extended Boyer-Moore voting algorithm.

Problem Restatement

We are given an integer array nums of size n.

We need to return all elements that appear more than:

floor(n / 3)

times.

The answer can be returned in any order.

LeetCode examples include nums = [3,2,3], which returns [3], and nums = [1,2], which returns [1,2]. The constraints are 1 <= nums.length <= 5 * 10^4 and -10^9 <= nums[i] <= 10^9. The follow-up asks for linear time and O(1) space.

Input and Output

ItemMeaning
InputInteger array nums
OutputAll values appearing more than floor(n / 3) times
Return orderAny order is accepted
Follow-up targetO(n) time and O(1) extra space

Function shape:

class Solution:
    def majorityElement(self, nums: list[int]) -> list[int]:
        ...

Examples

Example 1:

Input:  nums = [3,2,3]
Output: [3]

Here:

n = 3
floor(n / 3) = 1

The number 3 appears 2 times.

Since 2 > 1, 3 belongs in the answer.

Example 2:

Input:  nums = [1]
Output: [1]

Here:

n = 1
floor(n / 3) = 0

The number 1 appears once.

Since 1 > 0, it belongs in the answer.

Example 3:

Input:  nums = [1,2]
Output: [1,2]

Here:

n = 2
floor(n / 3) = 0

Both 1 and 2 appear once.

Since 1 > 0, both values belong in the answer.

Example 4:

Input:  nums = [1,1,1,3,3,2,2,2]
Output: [1,2]

Here:

n = 8
floor(n / 3) = 2

The number 1 appears 3 times.

The number 2 appears 3 times.

Both are greater than 2.

First Thought: Count Every Number

The simplest solution is to count the frequency of every number with a hash map.

Then return every number whose count is greater than n // 3.

class Solution:
    def majorityElement(self, nums: list[int]) -> list[int]:
        count = {}

        for num in nums:
            count[num] = count.get(num, 0) + 1

        limit = len(nums) // 3
        ans = []

        for num, freq in count.items():
            if freq > limit:
                ans.append(num)

        return ans

This is easy to understand and runs in linear time.

Problem With Counting

The hash map may store every distinct number.

For example:

nums = [1,2,3,4,5]

The hash map stores all five values.

So the space complexity is:

O(n)

The follow-up asks whether we can solve the problem in linear time and constant extra space. That means we should avoid storing all frequencies.

Key Insight

There can be at most two elements that appear more than floor(n / 3) times.

Why?

Suppose three different values all appear more than n / 3 times.

Then together, their total count would be more than:

n / 3 + n / 3 + n / 3 = n

That would require more than n elements, which is impossible.

So we only need to track two possible candidates.

This leads to the extended Boyer-Moore voting algorithm.

Algorithm

We use two candidates and two counters:

cand1 = None
cand2 = None
count1 = 0
count2 = 0

The first pass finds possible candidates.

For each num:

  1. If num == cand1, increment count1.
  2. Else if num == cand2, increment count2.
  3. Else if count1 == 0, set cand1 = num and count1 = 1.
  4. Else if count2 == 0, set cand2 = num and count2 = 1.
  5. Else decrement both counters.

The decrement step means we found three different values:

cand1, cand2, num

One occurrence of each can be removed from consideration.

This does not destroy the answer, because any true majority element has enough extra occurrences to survive this cancellation process.

After the first pass, cand1 and cand2 are only possible answers.

They are not guaranteed answers.

So we need a second pass to count their real frequencies.

Correctness

The algorithm is based on cancellation.

Whenever we see a number that matches a candidate, we increase that candidate’s counter.

Whenever we see a new number while both candidate slots are occupied, we decrement both counters. Conceptually, this removes one occurrence of three different values from the array:

candidate 1
candidate 2
current number

Removing one occurrence of three different values cannot remove all occurrences of a value that appears more than n / 3 times.

A true majority-over-third value has too many copies. After all possible cancellations, it must remain as one of the final candidates.

Since there can be at most two valid answers, tracking two candidates is enough.

The first pass only gives possible candidates, because some candidates may survive cancellation without actually appearing more than n / 3 times.

The second pass verifies the real frequencies.

Therefore, every returned value appears more than floor(n / 3) times, and every value that appears more than floor(n / 3) times is returned.

Complexity

MetricValueWhy
TimeO(n)We scan the array twice
SpaceO(1)We store only two candidates and two counters

The answer list has at most two elements.

Implementation

class Solution:
    def majorityElement(self, nums: list[int]) -> list[int]:
        cand1 = None
        cand2 = None
        count1 = 0
        count2 = 0

        for num in nums:
            if cand1 == num:
                count1 += 1
            elif cand2 == num:
                count2 += 1
            elif count1 == 0:
                cand1 = num
                count1 = 1
            elif count2 == 0:
                cand2 = num
                count2 = 1
            else:
                count1 -= 1
                count2 -= 1

        count1 = 0
        count2 = 0

        for num in nums:
            if num == cand1:
                count1 += 1
            elif num == cand2:
                count2 += 1

        ans = []
        limit = len(nums) // 3

        if count1 > limit:
            ans.append(cand1)

        if count2 > limit:
            ans.append(cand2)

        return ans

Code Explanation

We begin with two empty candidate slots:

cand1 = None
cand2 = None
count1 = 0
count2 = 0

When the current number matches an existing candidate, we strengthen that candidate:

if cand1 == num:
    count1 += 1
elif cand2 == num:
    count2 += 1

When a candidate slot has no active count, we fill it:

elif count1 == 0:
    cand1 = num
    count1 = 1
elif count2 == 0:
    cand2 = num
    count2 = 1

If the number matches neither candidate and both counters are active, we cancel one vote from both candidates:

else:
    count1 -= 1
    count2 -= 1

After the first pass, we reset the counts:

count1 = 0
count2 = 0

Then we count the true frequency of the candidates:

for num in nums:
    if num == cand1:
        count1 += 1
    elif num == cand2:
        count2 += 1

Finally, we only return candidates that really appear more than n // 3 times:

if count1 > limit:
    ans.append(cand1)

if count2 > limit:
    ans.append(cand2)

The verification pass is required.

Without it, an array like:

[1,2,3,4]

could leave arbitrary candidates even though no value appears more than floor(4 / 3) = 1 times.

Testing

def normalize(values):
    return sorted(values)

def run_tests():
    s = Solution()

    assert normalize(s.majorityElement([3, 2, 3])) == [3]
    assert normalize(s.majorityElement([1])) == [1]
    assert normalize(s.majorityElement([1, 2])) == [1, 2]
    assert normalize(s.majorityElement([1, 1, 1, 3, 3, 2, 2, 2])) == [1, 2]
    assert normalize(s.majorityElement([1, 2, 3, 4])) == []
    assert normalize(s.majorityElement([2, 2, 1, 3])) == []
    assert normalize(s.majorityElement([-1, -1, -1, 2, 3])) == [-1]

    print("all tests passed")

run_tests()
TestWhy
[3,2,3]Basic majority-over-third case
[1]Minimum length
[1,2]Both values qualify because threshold is zero
[1,1,1,3,3,2,2,2]Two valid answers
[1,2,3,4]No answer
[2,2,1,3]Exactly n // 3 occurrences do not qualify
[-1,-1,-1,2,3]Negative values work normally