LeetCode 347: Top K Frequent Elements

Problem Restatement

We are given an integer array nums and an integer k.

Return the k most frequent elements in the array.

The answer may be returned in any order. The problem guarantees that the answer is unique. The constraints allow nums.length up to 10^5, so the solution must be better than O(n^2).

Input and Output

Item	Meaning
Input	Integer array `nums` and integer `k`
Output	The `k` most frequent elements
Order	Any order is accepted
Guarantee	The answer is unique
Constraint	`1 <= k <= number of unique elements`

Function shape:

def topKFrequent(nums: list[int], k: int) -> list[int]:
    ...

Examples

Example 1:

Input: nums = [1,1,1,2,2,3], k = 2
Output: [1,2]

Frequencies:

Number	Count
`1`	`3`
`2`	`2`
`3`	`1`

The two most frequent elements are 1 and 2.

Example 2:

Input: nums = [1], k = 1
Output: [1]

There is only one unique element, so it must be returned.

Example 3:

Input: nums = [1,2,1,2,1,2,3,1,3,2], k = 2
Output: [1,2]

Frequencies:

Number	Count
`1`	`4`
`2`	`4`
`3`	`2`

The top two elements are 1 and 2.

First Thought: Count and Sort

The most direct approach is:

Count how many times each number appears.
Sort the numbers by frequency.
Take the first k.

from collections import Counter

class Solution:
    def topKFrequent(self, nums: list[int], k: int) -> list[int]:
        count = Counter(nums)

        items = sorted(
            count.items(),
            key=lambda item: item[1],
            reverse=True,
        )

        return [num for num, freq in items[:k]]

This is correct and simple.

But sorting all unique numbers costs:

O(m log m)

where m is the number of unique elements.

We can do better with bucket sort.

Key Insight

A frequency cannot be larger than len(nums).

So instead of sorting by frequency, create buckets.

Each bucket index represents a frequency.

bucket[f] = all numbers that appear exactly f times

For example:

nums = [1,1,1,2,2,3]

Frequencies:

1 -> 3
2 -> 2
3 -> 1

Buckets:

bucket[1] = [3]
bucket[2] = [2]
bucket[3] = [1]

Then scan buckets from high frequency to low frequency.

Collect numbers until we have k elements.

Algorithm

Count frequencies using a hash map.
Create len(nums) + 1 empty buckets.
For each (num, freq):
1. Put num into buckets[freq].
Scan freq from len(nums) down to 1.
Add numbers from each bucket into the answer.
Stop once the answer has k elements.

Walkthrough

Use:

nums = [1,1,1,2,2,3]
k = 2

Count frequencies:

1 -> 3
2 -> 2
3 -> 1

Create buckets:

buckets[0] = []
buckets[1] = [3]
buckets[2] = [2]
buckets[3] = [1]
buckets[4] = []
buckets[5] = []
buckets[6] = []

Scan from high to low.

At frequency 3, find:

[1]

Answer becomes:

[1]

At frequency 2, find:

[2]

Answer becomes:

[1, 2]

Now we have k = 2 elements, so return.

Correctness

The frequency map stores the exact number of times each unique value appears in nums.

Each number is placed into the bucket matching its frequency. Therefore, buckets[f] contains exactly the numbers whose frequency is f.

The algorithm scans buckets from the largest possible frequency down to the smallest. This means every number collected earlier has frequency greater than or equal to every number collected later.

Since the answer is unique and we stop after collecting k numbers, the returned list contains exactly the k most frequent elements.

The output order does not matter, so the order in which elements are collected from the buckets is acceptable.

Complexity

Let n = len(nums).

Metric	Value	Why
Time	`O(n)`	Count once, bucket once, scan buckets once
Space	`O(n)`	Frequency map and buckets

The bucket array has length n + 1.

Implementation

from collections import Counter

class Solution:
    def topKFrequent(self, nums: list[int], k: int) -> list[int]:
        count = Counter(nums)

        buckets = [[] for _ in range(len(nums) + 1)]

        for num, freq in count.items():
            buckets[freq].append(num)

        ans = []

        for freq in range(len(buckets) - 1, 0, -1):
            for num in buckets[freq]:
                ans.append(num)

                if len(ans) == k:
                    return ans

        return ans

Code Explanation

Count each number:

count = Counter(nums)

Create buckets by possible frequency:

buckets = [[] for _ in range(len(nums) + 1)]

Place each number into its frequency bucket:

for num, freq in count.items():
    buckets[freq].append(num)

Scan from highest frequency to lowest:

for freq in range(len(buckets) - 1, 0, -1):

Append numbers until we have k elements:

ans.append(num)

if len(ans) == k:
    return ans

The final return ans is only a safety return. The problem guarantees that k is valid.

Heap Alternative

A heap solution is also common.

It keeps only k elements in a min-heap. This is useful when the number of unique elements is large and k is small.

Its complexity is:

Metric	Value
Time	`O(n log k)`
Space	`O(n + k)`

Bucket sort is simpler here because frequencies are bounded by n.

Testing

Because the answer can be returned in any order, tests should compare sets.

def run_tests():
    s = Solution()

    assert set(s.topKFrequent([1, 1, 1, 2, 2, 3], 2)) == {1, 2}
    assert set(s.topKFrequent([1], 1)) == {1}
    assert set(s.topKFrequent([1, 2, 1, 2, 1, 2, 3, 1, 3, 2], 2)) == {1, 2}

    assert set(s.topKFrequent([4, 4, 4, 6, 6, 7], 1)) == {4}
    assert set(s.topKFrequent([-1, -1, -2, -2, -2, 3], 2)) == {-2, -1}
    assert set(s.topKFrequent([5, 3, 1, 1, 1, 3, 73, 1], 2)) == {1, 3}

    print("all tests passed")

run_tests()

Test meaning:

Test	Why
`[1,1,1,2,2,3]`, `2`	Standard sample
`[1]`, `1`	Single element
Equal top frequencies	Unique top-k set still exists
`k = 1`	Only most frequent element
Negative numbers	Hash map handles signed values
Mixed values	Checks general counting behavior