LeetCode 692: Top K Frequent Words

Problem Restatement

We are given a list of words and an integer k.

We need to return the k most frequent words.

The ordering rules are:

Rule	Meaning
Higher frequency first	More common words come earlier
Lexicographical tie-break	If two words have the same frequency, the smaller alphabetical word comes first

The official statement defines lexicographical order using normal alphabetical string comparison. (leetcode.com)

Input and Output

Item	Meaning
Input	`words`, a list of strings, and integer `k`
Output	List of the `k` most frequent words
Primary ordering	Frequency descending
Secondary ordering	Lexicographical ascending

Example function shape:

def topKFrequent(words: list[str], k: int) -> list[str]:
    ...

Examples

Example 1:

words = ["i", "love", "leetcode", "i", "love", "coding"]
k = 2

Frequencies:

Word	Count
`"i"`	`2`
`"love"`	`2`
`"leetcode"`	`1`
`"coding"`	`1`

The two most frequent words are "i" and "love".

Both have the same count, so lexicographical order decides:

"i" < "love"

Answer:

["i", "love"]

Example 2:

words = [
    "the", "day", "is", "sunny",
    "the", "the", "the",
    "sunny", "sunny",
    "is", "is"
]
k = 4

Frequencies:

Word	Count
`"the"`	`4`
`"is"`	`3`
`"sunny"`	`2`
`"day"`	`1`

Answer:

["the", "is", "sunny", "day"]

First Thought: Count and Sort

The problem naturally splits into two parts:

Count word frequencies.
Sort words using the required ordering rules.

A hash map is ideal for counting.

Then we sort by:

Priority	Direction
Frequency	Descending
Word	Ascending

Key Insight

Python sorting supports tuple keys.

Suppose:

count[word]

stores the frequency.

Then the sorting key can be:

(-count[word], word)

Why negative frequency?

Because Python sorts ascending by default.

Using:

-count[word]

makes larger counts come first.

The second tuple element:

word

automatically handles lexicographical tie-breaking.

Algorithm

Count frequencies with a hash map.
Sort unique words using:
```
key=lambda word: (-count[word], word)
```
Return the first k words.

Walkthrough

Consider:

words = ["i", "love", "leetcode", "i", "love", "coding"]
k = 2

Frequency table:

Word	Count
`"i"`	`2`
`"love"`	`2`
`"leetcode"`	`1`
`"coding"`	`1`

Sorting key values:

Word	Key
`"i"`	`(-2, "i")`
`"love"`	`(-2, "love")`
`"leetcode"`	`(-1, "leetcode")`
`"coding"`	`(-1, "coding")`

Sorted order:

["i", "love", "coding", "leetcode"]

Take the first 2 words:

["i", "love"]

Why `"coding"` Comes Before `"leetcode"`

Both have frequency 1.

So the second ordering rule applies:

"coding" < "leetcode"

Alphabetically, "coding" comes first.

Correctness

The frequency map correctly stores how many times each word appears in the input.

The sorting rule uses the tuple:

(-frequency, word)

Python tuple comparison compares elements from left to right.

Therefore:

Words with larger frequencies appear earlier because larger frequencies correspond to smaller negative values.
If two words have equal frequencies, the second tuple element compares the words lexicographically in ascending order.

Thus, the sorted list exactly matches the required ordering from the problem statement.

Returning the first k words therefore returns the k most frequent words in the correct order.

Complexity

Let:

Symbol	Meaning
`n`	Total number of input words
`u`	Number of unique words

Metric	Value	Why
Time	`O(n + u log u)`	Count frequencies, then sort unique words
Space	`O(u)`	Frequency map and sorted unique list

Implementation

from collections import Counter

class Solution:
    def topKFrequent(self, words: list[str], k: int) -> list[str]:
        count = Counter(words)

        ordered = sorted(
            count.keys(),
            key=lambda word: (-count[word], word),
        )

        return ordered[:k]

Code Explanation

We count frequencies using Counter:

count = Counter(words)

For example:

Counter({
    "i": 2,
    "love": 2,
    "leetcode": 1,
    "coding": 1,
})

Then we sort the unique words:

ordered = sorted(
    count.keys(),
    key=lambda word: (-count[word], word),
)

The key:

(-count[word], word)

means:

Part	Purpose
`-count[word]`	Higher frequency first
`word`	Alphabetical tie-break

Finally:

return ordered[:k]

returns the top k words.

Heap Alternative

The sorting solution is simple and efficient.

A heap solution is useful when k is much smaller than the number of unique words.

Example heap approach:

import heapq
from collections import Counter

class Solution:
    def topKFrequent(self, words: list[str], k: int) -> list[str]:
        count = Counter(words)

        heap = [(-freq, word) for word, freq in count.items()]
        heapq.heapify(heap)

        answer = []

        for _ in range(k):
            _, word = heapq.heappop(heap)
            answer.append(word)

        return answer

This uses the same ordering idea:

(-freq, word)

because Python heaps are min-heaps.

Testing

def run_tests():
    s = Solution()

    assert s.topKFrequent(
        ["i", "love", "leetcode", "i", "love", "coding"],
        2,
    ) == ["i", "love"]

    assert s.topKFrequent(
        [
            "the", "day", "is", "sunny",
            "the", "the", "the",
            "sunny", "sunny",
            "is", "is"
        ],
        4,
    ) == ["the", "is", "sunny", "day"]

    assert s.topKFrequent(
        ["a", "b", "c"],
        2,
    ) == ["a", "b"]

    assert s.topKFrequent(
        ["a", "aa", "aaa"],
        3,
    ) == ["a", "aa", "aaa"]

    assert s.topKFrequent(
        ["x", "x", "y", "y", "z"],
        2,
    ) == ["x", "y"]

    print("all tests passed")

run_tests()

Test meaning:

Test	Expected	Why
`["i","love","leetcode","i","love","coding"]`, `k=2`	`["i","love"]`	Official example
Sunny/day example	`["the","is","sunny","day"]`	Official example
Equal frequencies	`["a","b"]`	Lexicographical ordering
Increasing string lengths	`["a","aa","aaa"]`	Alphabetical ordering check
Mixed frequencies	`["x","y"]`	Frequency descending