Skip to content

LeetCode 347: Top K Frequent Elements

A clear explanation of Top K Frequent Elements using frequency counting and bucket sort.

Problem Restatement

We are given an integer array nums and an integer k.

Return the k most frequent elements in the array.

The answer may be returned in any order. The problem guarantees that the answer is unique. The constraints allow nums.length up to 10^5, so the solution must be better than O(n^2).

Input and Output

ItemMeaning
InputInteger array nums and integer k
OutputThe k most frequent elements
OrderAny order is accepted
GuaranteeThe answer is unique
Constraint1 <= k <= number of unique elements

Function shape:

def topKFrequent(nums: list[int], k: int) -> list[int]:
    ...

Examples

Example 1:

Input: nums = [1,1,1,2,2,3], k = 2
Output: [1,2]

Frequencies:

NumberCount
13
22
31

The two most frequent elements are 1 and 2.

Example 2:

Input: nums = [1], k = 1
Output: [1]

There is only one unique element, so it must be returned.

Example 3:

Input: nums = [1,2,1,2,1,2,3,1,3,2], k = 2
Output: [1,2]

Frequencies:

NumberCount
14
24
32

The top two elements are 1 and 2.

First Thought: Count and Sort

The most direct approach is:

  1. Count how many times each number appears.
  2. Sort the numbers by frequency.
  3. Take the first k.
from collections import Counter

class Solution:
    def topKFrequent(self, nums: list[int], k: int) -> list[int]:
        count = Counter(nums)

        items = sorted(
            count.items(),
            key=lambda item: item[1],
            reverse=True,
        )

        return [num for num, freq in items[:k]]

This is correct and simple.

But sorting all unique numbers costs:

O(m log m)

where m is the number of unique elements.

We can do better with bucket sort.

Key Insight

A frequency cannot be larger than len(nums).

So instead of sorting by frequency, create buckets.

Each bucket index represents a frequency.

bucket[f] = all numbers that appear exactly f times

For example:

nums = [1,1,1,2,2,3]

Frequencies:

1 -> 3
2 -> 2
3 -> 1

Buckets:

bucket[1] = [3]
bucket[2] = [2]
bucket[3] = [1]

Then scan buckets from high frequency to low frequency.

Collect numbers until we have k elements.

Algorithm

  1. Count frequencies using a hash map.
  2. Create len(nums) + 1 empty buckets.
  3. For each (num, freq):
    1. Put num into buckets[freq].
  4. Scan freq from len(nums) down to 1.
  5. Add numbers from each bucket into the answer.
  6. Stop once the answer has k elements.

Walkthrough

Use:

nums = [1,1,1,2,2,3]
k = 2

Count frequencies:

1 -> 3
2 -> 2
3 -> 1

Create buckets:

buckets[0] = []
buckets[1] = [3]
buckets[2] = [2]
buckets[3] = [1]
buckets[4] = []
buckets[5] = []
buckets[6] = []

Scan from high to low.

At frequency 3, find:

[1]

Answer becomes:

[1]

At frequency 2, find:

[2]

Answer becomes:

[1, 2]

Now we have k = 2 elements, so return.

Correctness

The frequency map stores the exact number of times each unique value appears in nums.

Each number is placed into the bucket matching its frequency. Therefore, buckets[f] contains exactly the numbers whose frequency is f.

The algorithm scans buckets from the largest possible frequency down to the smallest. This means every number collected earlier has frequency greater than or equal to every number collected later.

Since the answer is unique and we stop after collecting k numbers, the returned list contains exactly the k most frequent elements.

The output order does not matter, so the order in which elements are collected from the buckets is acceptable.

Complexity

Let n = len(nums).

MetricValueWhy
TimeO(n)Count once, bucket once, scan buckets once
SpaceO(n)Frequency map and buckets

The bucket array has length n + 1.

Implementation

from collections import Counter

class Solution:
    def topKFrequent(self, nums: list[int], k: int) -> list[int]:
        count = Counter(nums)

        buckets = [[] for _ in range(len(nums) + 1)]

        for num, freq in count.items():
            buckets[freq].append(num)

        ans = []

        for freq in range(len(buckets) - 1, 0, -1):
            for num in buckets[freq]:
                ans.append(num)

                if len(ans) == k:
                    return ans

        return ans

Code Explanation

Count each number:

count = Counter(nums)

Create buckets by possible frequency:

buckets = [[] for _ in range(len(nums) + 1)]

Place each number into its frequency bucket:

for num, freq in count.items():
    buckets[freq].append(num)

Scan from highest frequency to lowest:

for freq in range(len(buckets) - 1, 0, -1):

Append numbers until we have k elements:

ans.append(num)

if len(ans) == k:
    return ans

The final return ans is only a safety return. The problem guarantees that k is valid.

Heap Alternative

A heap solution is also common.

It keeps only k elements in a min-heap. This is useful when the number of unique elements is large and k is small.

Its complexity is:

MetricValue
TimeO(n log k)
SpaceO(n + k)

Bucket sort is simpler here because frequencies are bounded by n.

Testing

Because the answer can be returned in any order, tests should compare sets.

def run_tests():
    s = Solution()

    assert set(s.topKFrequent([1, 1, 1, 2, 2, 3], 2)) == {1, 2}
    assert set(s.topKFrequent([1], 1)) == {1}
    assert set(s.topKFrequent([1, 2, 1, 2, 1, 2, 3, 1, 3, 2], 2)) == {1, 2}

    assert set(s.topKFrequent([4, 4, 4, 6, 6, 7], 1)) == {4}
    assert set(s.topKFrequent([-1, -1, -2, -2, -2, 3], 2)) == {-2, -1}
    assert set(s.topKFrequent([5, 3, 1, 1, 1, 3, 73, 1], 2)) == {1, 3}

    print("all tests passed")

run_tests()

Test meaning:

TestWhy
[1,1,1,2,2,3], 2Standard sample
[1], 1Single element
Equal top frequenciesUnique top-k set still exists
k = 1Only most frequent element
Negative numbersHash map handles signed values
Mixed valuesChecks general counting behavior