A clear explanation of maintaining the kth largest element in a stream using a fixed-size min heap.
Problem Restatement
We need to design a class called KthLargest.
The class receives:
k
numsnums is the initial stream of integers.
After that, the class supports an operation:
add(val)This adds val to the stream and returns the kth largest element among all numbers seen so far.
The kth largest element means the element at position k if all numbers are sorted from largest to smallest.
Duplicates count as separate elements.
For example:
nums = [4, 5, 8, 2]
k = 3Sorted from largest to smallest:
[8, 5, 4, 2]The 3rd largest value is:
4Input and Output
| Item | Meaning |
|---|---|
| Constructor input | k, the rank we need to maintain |
| Constructor input | nums, the initial stream values |
| Method input | val, the new value added to the stream |
| Method output | The kth largest value after inserting val |
| Important detail | Duplicates count separately |
The class shape is:
class KthLargest:
def __init__(self, k: int, nums: list[int]):
...
def add(self, val: int) -> int:
...Examples
Example:
k = 3
nums = [4, 5, 8, 2]Create the object:
kthLargest = KthLargest(3, [4, 5, 8, 2])Now call add.
First call:
kthLargest.add(3)The stream becomes:
[4, 5, 8, 2, 3]Sorted descending:
[8, 5, 4, 3, 2]The 3rd largest is 4.
Return:
4Second call:
kthLargest.add(5)The stream becomes:
[4, 5, 8, 2, 3, 5]Sorted descending:
[8, 5, 5, 4, 3, 2]The 3rd largest is 5.
Return:
5Third call:
kthLargest.add(10)Sorted descending:
[10, 8, 5, 5, 4, 3, 2]The 3rd largest is 5.
Return:
5Fourth call:
kthLargest.add(9)Sorted descending:
[10, 9, 8, 5, 5, 4, 3, 2]The 3rd largest is 8.
Return:
8Fifth call:
kthLargest.add(4)Sorted descending:
[10, 9, 8, 5, 5, 4, 4, 3, 2]The 3rd largest is still 8.
Return:
8First Thought: Sort Every Time
A simple approach is to store every number in a list.
Whenever add(val) is called:
- Append
val. - Sort the full list.
- Return the kth largest value.
For descending sort, the kth largest value is at index k - 1.
For ascending sort, the kth largest value is at index len(nums) - k.
This works, but it repeats too much work.
Problem With Sorting Every Time
If there are n values in the stream, sorting costs:
O(n log n)Doing this after every add is expensive.
But we do not need the full sorted order.
We only need the kth largest value.
That means we only need to keep track of the largest k values seen so far.
Key Insight
Use a min heap of size k.
The heap stores the k largest values seen so far.
Because it is a min heap, the smallest value among those k values is at the top.
That smallest value inside the top k values is exactly the kth largest value overall.
For example, suppose:
k = 3
stream = [10, 9, 8, 5, 4, 3]The three largest values are:
[10, 9, 8]A min heap containing these values has 8 at the top.
So the heap top is the 3rd largest value.
When a new value arrives, we push it into the heap.
If the heap grows larger than k, we remove the smallest value.
That removal keeps only the k largest values.
Algorithm
Store two fields:
self.k
self.heapDuring initialization:
- Save
k. - Create an empty min heap.
- Add each number from
numsusing the sameaddmethod.
During add(val):
- Push
valinto the heap. - If the heap size is greater than
k, pop the smallest value. - Return the heap top.
The heap top is:
self.heap[0]Correctness
The heap is maintained with this invariant:
After each insertion, self.heap contains the k largest values among all values seen so far, or all values if fewer than k values have been seen.
During initialization, every value in nums is inserted through add, so the invariant is built one value at a time.
Now consider one call to add(val).
First, val is pushed into the heap. At this moment, the heap contains the previous kept values plus the new value.
If the heap size is at most k, then all values seen so far are kept, so the invariant holds.
If the heap size becomes k + 1, we remove the smallest value from the heap. Among these k + 1 candidate values, the smallest one cannot be one of the k largest values. Removing it leaves exactly the k largest values seen so far.
Therefore, after every add, the heap contains the k largest values.
The heap is a min heap, so self.heap[0] is the smallest value among those k largest values. That value is exactly the kth largest value overall.
So add returns the correct answer.
Complexity
Let k be the rank we maintain and n be the length of the initial nums.
| Operation | Time | Space | Why |
|---|---|---|---|
| Constructor | O(n log k) | O(k) | Inserts each initial value into a heap of size at most k |
add | O(log k) | O(k) | Push and possibly pop from a heap of size at most k |
| Return heap top | O(1) | O(1) | The kth largest value is at heap[0] |
Implementation
import heapq
class KthLargest:
def __init__(self, k: int, nums: list[int]):
self.k = k
self.heap = []
for num in nums:
self.add(num)
def add(self, val: int) -> int:
heapq.heappush(self.heap, val)
if len(self.heap) > self.k:
heapq.heappop(self.heap)
return self.heap[0]Code Explanation
We use Python’s heapq, which provides a min heap:
import heapqThe constructor stores k:
self.k = kThen it creates an empty heap:
self.heap = []Instead of writing separate initialization logic, we insert each initial number using add:
for num in nums:
self.add(num)This keeps the heap rule consistent from the beginning.
Inside add, we first push the new value:
heapq.heappush(self.heap, val)If the heap now has more than k values, remove the smallest value:
if len(self.heap) > self.k:
heapq.heappop(self.heap)This keeps only the k largest values.
Finally, return the smallest value among those k largest values:
return self.heap[0]That value is the kth largest element in the full stream.
Testing
We can test the sample behavior directly.
def test_kth_largest():
kth = KthLargest(3, [4, 5, 8, 2])
assert kth.add(3) == 4
assert kth.add(5) == 5
assert kth.add(10) == 5
assert kth.add(9) == 8
assert kth.add(4) == 8
kth = KthLargest(1, [])
assert kth.add(-3) == -3
assert kth.add(-2) == -2
assert kth.add(-4) == -2
assert kth.add(0) == 0
assert kth.add(4) == 4
kth = KthLargest(2, [0])
assert kth.add(-1) == -1
assert kth.add(1) == 0
assert kth.add(-2) == 0
assert kth.add(-4) == 0
assert kth.add(3) == 1
print("all tests passed")Test coverage:
| Test | Why |
|---|---|
| Official-style stream | Confirms normal heap behavior |
k = 1 | Confirms largest element tracking |
| Empty initial array | Confirms constructor handles no initial values |
| Negative values | Confirms ordering works with negatives |
| Duplicates and changing kth value | Confirms the heap updates correctly |