A clear explanation of designing an LFU cache with O(1) average get and put operations.
Problem Restatement
We need to design a Least Frequently Used cache.
The cache supports two operations:
- `get(key)`
- `put(key, value)`

The cache has a fixed capacity.
If the cache is full and we need to insert a new key, we remove the least frequently used key.
Each key has a use counter.
A key’s frequency increases when:
- `get(key)` is called on it.
- `put(key, value)` is called on an existing key.
When multiple keys have the same lowest frequency, we remove the least recently used key among them.
The required average time complexity for both get and put is O(1).
Input and Output
| Operation | Meaning |
|---|---|
| LFUCache(capacity) | Initialize the cache |
| get(key) | Return the value if present, otherwise -1 |
| put(key, value) | Insert or update a key-value pair |

| Rule | Detail |
|---|---|
| Eviction | Remove the lowest-frequency key |
| Tie-breaking | If frequencies tie, remove the least recently used key |
| Complexity | O(1) average for get and put |
Class shape:
```python
class LFUCache:
    def __init__(self, capacity: int):
        ...

    def get(self, key: int) -> int:
        ...

    def put(self, key: int, value: int) -> None:
        ...
```

Examples
Use this operation sequence:
```python
cache = LFUCache(2)
cache.put(1, 1)
cache.put(2, 2)
cache.get(1)
cache.put(3, 3)
cache.get(2)
cache.get(3)
cache.put(4, 4)
cache.get(1)
cache.get(3)
cache.get(4)
```

Step by step:
`put(1, 1)`. Cache:
| Key | Value | Frequency |
|---|---|---|
| 1 | 1 | 1 |
Then:
`put(2, 2)`. Cache:
| Key | Value | Frequency |
|---|---|---|
| 1 | 1 | 1 |
| 2 | 2 | 1 |
Then:
`get(1)` returns `1`. Key 1's frequency becomes 2.
Cache:
| Key | Value | Frequency |
|---|---|---|
| 1 | 1 | 2 |
| 2 | 2 | 1 |
Then:
`put(3, 3)`. The cache is full.
Key 2 has the lowest frequency, so remove key 2.
Insert key 3 with frequency 1.
Then:
`get(2)` returns `-1`.
Then:
`get(3)` returns `3`. Key 3's frequency becomes 2.
Now keys 1 and 3 both have frequency 2.
Then:
`put(4, 4)`. The cache is full.
Both key 1 and key 3 have frequency 2, so remove the least recently used among them.
Key 1 is older, so remove key 1.
Insert key 4.
Final checks:
```python
get(1) == -1
get(3) == 3
get(4) == 4
```

First Thought: Store Everything in One Dictionary
We can store:
`key -> (value, frequency, time)`

Then on eviction, scan all keys and find the key with:
- Smallest frequency.
- Oldest time among that frequency.
This is simple.
```python
class LFUCache:
    def __init__(self, capacity: int):
        self.capacity = capacity
        self.time = 0
        self.data = {}

    def get(self, key: int) -> int:
        if key not in self.data:
            return -1
        value, freq, _ = self.data[key]
        self.time += 1
        self.data[key] = (value, freq + 1, self.time)
        return value

    def put(self, key: int, value: int) -> None:
        if self.capacity == 0:
            return
        self.time += 1
        if key in self.data:
            _, freq, _ = self.data[key]
            self.data[key] = (value, freq + 1, self.time)
            return
        if len(self.data) == self.capacity:
            victim = min(
                self.data,
                key=lambda k: (self.data[k][1], self.data[k][2]),
            )
            del self.data[victim]
        self.data[key] = (value, 1, self.time)
```

This logic is correct, but eviction scans all cache entries.
Problem With This Approach
If the cache has n keys, eviction costs O(n), because we search through every key.
The problem requires get and put to run in O(1) average time.
So we need a design that can find the eviction victim directly.
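The cost of the scan can be made concrete with a short sketch; the `score` helper and its `examined` counter are illustrative additions, not part of the cache:

```python
# Sketch: scan-based eviction must inspect every entry.
data = {k: (k, 1, k) for k in range(1000)}  # key -> (value, freq, time)

examined = 0

def score(k):
    # Count how many entries the scan looks at.
    global examined
    examined += 1
    value, freq, time = data[k]
    return (freq, time)  # smallest frequency first, then oldest time

victim = min(data, key=score)
print(victim, examined)  # the scan inspects all 1000 entries
```

Every call to `min` touches every key, so the work per eviction grows linearly with the cache size.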
Key Insight
We need to answer three questions quickly:
- Given a key, where is its node?
- Given a frequency, which keys have that frequency?
- What is the current minimum frequency?
This suggests three pieces of state:
| State | Purpose |
|---|---|
| key_to_node | Find a node by key in O(1) |
| freq_to_nodes | Store nodes grouped by frequency |
| min_freq | Track the lowest frequency currently in the cache |
Inside each frequency group, we need LRU order.
So each frequency maps to an ordered collection of keys.
In Python, OrderedDict gives this cleanly:
`freq -> OrderedDict of key -> node`

The least recently used key in a frequency group is the first key in that OrderedDict.
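As a quick illustration of that ordering (toy keys and placeholder values, for demonstration only):

```python
from collections import OrderedDict

# One frequency group: insertion order tracks recency of use.
group = OrderedDict()
group[1] = "node-1"   # key 1 used first
group[3] = "node-3"   # key 3 used later

group.move_to_end(1)  # key 1 is used again: now most recently used

# The first entry is the least recently used key in the group.
lru_key, _ = group.popitem(last=False)
print(lru_key)  # 3
```

`move_to_end` and `popitem(last=False)` are both O(1), which is what makes OrderedDict a good fit here.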
Data Structure
Each node stores:

- `key`
- `value`
- `freq`

We maintain:

```python
key_to_node: dict[int, Node]
freq_to_nodes: dict[int, OrderedDict[int, Node]]
min_freq: int
capacity: int
```

The cache invariant is:

- For every key in the cache, `key_to_node[key]` points to its node.
- The same node appears in `freq_to_nodes[node.freq]`.
- `min_freq` is the smallest frequency among all nodes.
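The invariant can be spot-checked with a small helper; `check_invariants` and the `SimpleNamespace` toy nodes below are hypothetical, used only to illustrate the conditions:

```python
from collections import OrderedDict
from types import SimpleNamespace

def check_invariants(key_to_node, freq_to_nodes, min_freq):
    # Every node must appear in the group for its own frequency.
    for key, node in key_to_node.items():
        assert freq_to_nodes[node.freq][key] is node
    # min_freq must be the smallest frequency present.
    if key_to_node:
        assert min_freq == min(n.freq for n in key_to_node.values())

# Two toy nodes: key 1 used twice, key 2 used once.
n1 = SimpleNamespace(key=1, value=10, freq=2)
n2 = SimpleNamespace(key=2, value=20, freq=1)
check_invariants(
    key_to_node={1: n1, 2: n2},
    freq_to_nodes={2: OrderedDict([(1, n1)]), 1: OrderedDict([(2, n2)])},
    min_freq=1,
)
print("invariant holds")
```

A helper like this is handy in tests: run it after every operation and any broken bookkeeping surfaces immediately.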
When a key is accessed, its frequency increases by 1.
So we move it from the frequency `f` list to the frequency `f + 1` list and place it at the most recently used position in the new frequency list.
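The move itself can be sketched with two OrderedDicts (toy keys and placeholder node values, for illustration):

```python
from collections import OrderedDict

freq_to_nodes = {
    1: OrderedDict([(7, "node-7"), (9, "node-9")]),
    2: OrderedDict(),
}

# Key 7 is accessed: remove it from the frequency-1 list...
node = freq_to_nodes[1].pop(7)
# ...and append it to the frequency-2 list, at the MRU end.
freq_to_nodes[2][7] = node

print(list(freq_to_nodes[1]), list(freq_to_nodes[2]))  # [9] [7]
```

Both the `pop` and the re-insert are O(1), so the whole move is constant time.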
Algorithm for get
For `get(key)`:

- If `key` does not exist, return `-1`.
- Find its node.
- Increase its frequency using a helper.
- Return the node value.
The helper `_touch(node)` does this:

- Remove the node from its current frequency group.
- If that group becomes empty and it was `min_freq`, increase `min_freq`.
- Increment `node.freq`.
- Insert the node into the new frequency group as most recently used.
Algorithm for put
For put(key, value):
If capacity is 0, do nothing.
If key already exists:
- Update its value.
- Increase its frequency using `_touch`.
Otherwise, we insert a new key.
If the cache is full:
- Look at `freq_to_nodes[min_freq]`.
- Remove the first item from that ordered dictionary.
- Delete that key from `key_to_node`.
Then insert the new node with frequency 1.
Set `min_freq = 1`, because every new key starts with frequency 1.
Correctness
Every get of an existing key returns the stored value, then increases that key’s frequency by one. This matches the cache rules.
Every put of an existing key updates the value, then increases that key’s frequency by one. This also matches the cache rules.
When inserting a new key into a full cache, the algorithm evicts from freq_to_nodes[min_freq]. By invariant, min_freq is the smallest frequency among all keys, so every key in that group is least frequently used.
Inside that group, keys are stored in access order. The first key is the least recently used among keys with the minimum frequency. Removing that key exactly implements the tie-breaking rule.
After every access, _touch moves the node from frequency f to frequency f + 1, preserving correct frequency grouping. If the old minimum-frequency group becomes empty, min_freq increases because no key with that frequency remains.
After inserting a new key, its frequency is 1, so min_freq = 1.
Therefore, every operation maintains the LFU and LRU tie-breaking invariants, and eviction always removes the correct key.
Complexity
| Operation | Time | Why |
|---|---|---|
| get | O(1) average | Hash map lookup and ordered dictionary updates |
| put (existing key) | O(1) average | Hash map lookup and one frequency move |
| put (new key) | O(1) average | Direct eviction from the min_freq group |
| Space | O(capacity) | One node per stored key |
Implementation
```python
from collections import OrderedDict, defaultdict


class Node:
    def __init__(self, key: int, value: int):
        self.key = key
        self.value = value
        self.freq = 1


class LFUCache:
    def __init__(self, capacity: int):
        self.capacity = capacity
        self.min_freq = 0
        self.key_to_node = {}
        self.freq_to_nodes = defaultdict(OrderedDict)

    def get(self, key: int) -> int:
        if key not in self.key_to_node:
            return -1
        node = self.key_to_node[key]
        self._touch(node)
        return node.value

    def put(self, key: int, value: int) -> None:
        if self.capacity == 0:
            return
        if key in self.key_to_node:
            node = self.key_to_node[key]
            node.value = value
            self._touch(node)
            return
        if len(self.key_to_node) == self.capacity:
            old_key, old_node = self.freq_to_nodes[self.min_freq].popitem(last=False)
            del self.key_to_node[old_key]
        node = Node(key, value)
        self.key_to_node[key] = node
        self.freq_to_nodes[1][key] = node
        self.min_freq = 1

    def _touch(self, node: Node) -> None:
        old_freq = node.freq
        old_group = self.freq_to_nodes[old_freq]
        del old_group[node.key]
        if not old_group:
            del self.freq_to_nodes[old_freq]
            if self.min_freq == old_freq:
                self.min_freq += 1
        node.freq += 1
        self.freq_to_nodes[node.freq][node.key] = node
```

Code Explanation
The Node object stores one cache entry:
```python
class Node:
    def __init__(self, key: int, value: int):
        self.key = key
        self.value = value
        self.freq = 1
```

New keys start with frequency 1.
The main cache stores:
```python
self.key_to_node = {}
self.freq_to_nodes = defaultdict(OrderedDict)
```

`key_to_node` gives direct access to a node.
freq_to_nodes groups nodes by frequency.
For example, `freq_to_nodes[2]` stores all keys with frequency 2, ordered from least recently used to most recently used.
In get, missing keys return -1:
```python
if key not in self.key_to_node:
    return -1
```

Existing keys are touched:
```python
node = self.key_to_node[key]
self._touch(node)
return node.value
```

In put, zero capacity means the cache cannot store anything:
```python
if self.capacity == 0:
    return
```

For an existing key, we update the value and count the operation as a use:
```python
node.value = value
self._touch(node)
```

For a new key, if the cache is full, we evict from the minimum-frequency group:
```python
old_key, old_node = self.freq_to_nodes[self.min_freq].popitem(last=False)
del self.key_to_node[old_key]
```

`last=False` removes the least recently used item in that frequency group.
Then we insert the new key at frequency 1:
```python
node = Node(key, value)
self.key_to_node[key] = node
self.freq_to_nodes[1][key] = node
self.min_freq = 1
```

The helper `_touch` moves a node from its old frequency group to the next one:
```python
old_freq = node.freq
del self.freq_to_nodes[old_freq][node.key]
node.freq += 1
self.freq_to_nodes[node.freq][node.key] = node
```

If the old group becomes empty and it was the minimum frequency, min_freq increases:
```python
if self.min_freq == old_freq:
    self.min_freq += 1
```

Testing
```python
def run_tests():
    cache = LFUCache(2)
    cache.put(1, 1)
    cache.put(2, 2)
    assert cache.get(1) == 1
    cache.put(3, 3)
    assert cache.get(2) == -1
    assert cache.get(3) == 3
    cache.put(4, 4)
    assert cache.get(1) == -1
    assert cache.get(3) == 3
    assert cache.get(4) == 4

    zero = LFUCache(0)
    zero.put(1, 1)
    assert zero.get(1) == -1

    cache = LFUCache(2)
    cache.put(1, 1)
    cache.put(2, 2)
    cache.put(2, 20)
    cache.put(3, 3)
    assert cache.get(1) == -1
    assert cache.get(2) == 20
    assert cache.get(3) == 3

    print("all tests passed")


run_tests()
```

Test meaning:
| Test | Why |
|---|---|
| Standard operation sequence | Checks LFU eviction and LRU tie-breaking |
| Capacity 0 | Cache cannot store keys |
| Updating existing key | put on an existing key increases its frequency |
| get after eviction | Confirms removed keys return -1 |
| Tie at same frequency | Confirms the least recently used key is removed |