A clear explanation of designing a hit counter for the last 5 minutes using a queue with compressed timestamps.
Problem Restatement
We need to design a hit counter.
The counter supports two operations:
| Method | Meaning |
|---|---|
hit(timestamp) | Record one hit at timestamp |
getHits(timestamp) | Return how many hits happened in the past 300 seconds |
The timestamp is given in seconds.
Calls arrive in chronological order, so timestamps are monotonically increasing. Multiple hits may happen at the same timestamp. The time window is the past 5 minutes, which means the past 300 seconds.
For a query at time t, we count hits whose timestamp is greater than:
t - 300So at time 301, a hit at time 1 is expired because:
301 - 1 = 300The hit at time 1 is no longer inside the last 300 seconds.
Input and Output
| Method | Input | Output |
|---|---|---|
HitCounter() | None | Initializes the counter |
hit(timestamp) | Integer timestamp | None |
getHits(timestamp) | Integer timestamp | Number of recent hits |
Example class shape:
class HitCounter:
def __init__(self):
...
def hit(self, timestamp: int) -> None:
...
def getHits(self, timestamp: int) -> int:
...Examples
Example calls:
counter = HitCounter()
counter.hit(1)
counter.hit(2)
counter.hit(3)
counter.getHits(4)
counter.hit(300)
counter.getHits(300)
counter.getHits(301)Expected results:
3
4
3Step by step:
| Operation | Result | Why |
|---|---|---|
hit(1) | Record hit at 1 | |
hit(2) | Record hit at 2 | |
hit(3) | Record hit at 3 | |
getHits(4) | 3 | Hits at 1, 2, 3 are inside the window |
hit(300) | Record hit at 300 | |
getHits(300) | 4 | Hits at 1, 2, 3, 300 are inside the window |
getHits(301) | 3 | Hit at 1 is expired |
At timestamp 301, the valid timestamps are:
2, 3, 300because a hit must be greater than:
301 - 300 = 1First Thought: Store Every Hit
The simplest design is to store every hit timestamp in a list.
class HitCounter:
def __init__(self):
self.hits = []
def hit(self, timestamp: int) -> None:
self.hits.append(timestamp)
def getHits(self, timestamp: int) -> int:
count = 0
for hit_time in self.hits:
if hit_time > timestamp - 300:
count += 1
return countThis works, but getHits scans all hits ever recorded.
If the system receives many hits, old hits stay in the list forever and every query becomes slower.
Key Insight
Since timestamps arrive in chronological order, old hits are always at the front.
When a hit becomes older than the 300 second window, it will never become valid again.
So we can delete expired hits from the front.
A queue fits this exactly.
To handle many hits at the same timestamp more efficiently, store pairs:
(timestamp, count)For example, three hits at timestamp 10 become:
(10, 3)instead of:
10, 10, 10We also keep a running total of hits currently in the queue.
Algorithm
Maintain:
| Variable | Meaning |
|---|---|
queue | Timestamp groups inside or near the active window |
total | Number of hits currently stored in queue |
For hit(timestamp):
- If the queue tail has the same timestamp, increase its count.
- Otherwise, append a new pair
[timestamp, 1]. - Increase
total.
For getHits(timestamp):
- While the queue is not empty and the front timestamp is expired:
- Remove the front pair.
- Subtract its count from
total.
- Return
total.
A hit expires when:
hit_time <= timestamp - 300Equivalently:
timestamp - hit_time >= 300Correctness
The queue stores hit groups in chronological order because calls are made with monotonically increasing timestamps.
When getHits(timestamp) is called, any hit at time hit_time <= timestamp - 300 lies outside the past 300 seconds. Since the queue is sorted by time, all such expired hits appear at the front. The algorithm removes exactly those expired groups.
After removing expired groups, every remaining group has timestamp greater than timestamp - 300, so every remaining hit belongs to the required window.
The variable total is increased once for every recorded hit and decreased by the exact count of each expired group. Therefore, after cleanup, total equals the number of hits in the past 300 seconds.
So getHits returns the correct count.
Complexity
Let g be the number of timestamp groups currently stored.
| Operation | Time | Space |
|---|---|---|
hit | O(1) | O(1) extra |
getHits | Amortized O(1) | O(g) total |
A single expired group is removed only once across the whole lifetime of the object.
So although one getHits call may remove several groups, the amortized cost per operation is constant.
Implementation
from collections import deque
class HitCounter:
def __init__(self):
self.queue = deque()
self.total = 0
def hit(self, timestamp: int) -> None:
if self.queue and self.queue[-1][0] == timestamp:
self.queue[-1][1] += 1
else:
self.queue.append([timestamp, 1])
self.total += 1
def getHits(self, timestamp: int) -> int:
while self.queue and self.queue[0][0] <= timestamp - 300:
_, count = self.queue.popleft()
self.total -= count
return self.totalCode Explanation
The queue stores grouped hits:
self.queue = deque()Each item is:
[timestamp, count]The running total stores the current number of non-expired hits:
self.total = 0When a hit arrives, we merge it with the last group if the timestamp is the same:
if self.queue and self.queue[-1][0] == timestamp:
self.queue[-1][1] += 1Otherwise, we add a new group:
self.queue.append([timestamp, 1])Every hit increases the total:
self.total += 1Before answering a query, we remove expired groups:
while self.queue and self.queue[0][0] <= timestamp - 300:For each removed group, subtract its count:
_, count = self.queue.popleft()
self.total -= countThen return the remaining total:
return self.totalAlternative Implementation: Fixed 300 Buckets
Since the window is exactly 300 seconds, another common solution uses two arrays of length 300.
Each bucket stores:
| Array | Meaning |
|---|---|
times[i] | Timestamp currently stored in bucket i |
hits[i] | Number of hits at that timestamp |
The bucket index is:
timestamp % 300Code:
class HitCounter:
def __init__(self):
self.times = [0] * 300
self.hits = [0] * 300
def hit(self, timestamp: int) -> None:
index = timestamp % 300
if self.times[index] != timestamp:
self.times[index] = timestamp
self.hits[index] = 1
else:
self.hits[index] += 1
def getHits(self, timestamp: int) -> int:
total = 0
for i in range(300):
if timestamp - self.times[i] < 300:
total += self.hits[i]
return totalThis version has:
| Operation | Time |
|---|---|
hit | O(1) |
getHits | O(300), treated as O(1) |
It uses fixed memory, which is useful when the hit volume is very large.
Testing
def run_tests():
counter = HitCounter()
counter.hit(1)
counter.hit(2)
counter.hit(3)
assert counter.getHits(4) == 3
counter.hit(300)
assert counter.getHits(300) == 4
assert counter.getHits(301) == 3
counter = HitCounter()
counter.hit(1)
counter.hit(1)
counter.hit(1)
assert counter.getHits(1) == 3
assert counter.getHits(300) == 3
assert counter.getHits(301) == 0
counter = HitCounter()
counter.hit(100)
counter.hit(200)
counter.hit(399)
assert counter.getHits(399) == 2
assert counter.getHits(400) == 2
assert counter.getHits(500) == 1
assert counter.getHits(699) == 0
print("all tests passed")
run_tests()Test meaning:
| Test | Why |
|---|---|
| Standard sequence | Checks the main example behavior |
Boundary at 300 seconds | Confirms expiration uses <= timestamp - 300 |
| Multiple hits at same timestamp | Checks grouped counts |
| Far future query | Confirms old hits are removed correctly |