LeetCode 432: All O'one Data Structure

Problem Restatement

We need to design a data structure that stores string counts.

It must support these operations:

Operation	Meaning
`inc(key)`	Increase `key` count by `1`; insert it with count `1` if missing
`dec(key)`	Decrease `key` count by `1`; remove it if count becomes `0`
`getMaxKey()`	Return any key with the largest count
`getMinKey()`	Return any key with the smallest count

If the data structure is empty, both getMaxKey() and getMinKey() should return an empty string.

Each function must run in O(1) average time. The official statement also guarantees that key exists before dec(key) is called.

Input and Output

Method	Input	Output
`AllOne()`	none	initializes the object
`inc(key)`	string `key`	none
`dec(key)`	string `key`	none
`getMaxKey()`	none	any key with maximum count, or `""`
`getMinKey()`	none	any key with minimum count, or `""`

The important constraint is not only correctness. The data structure must keep all operations in average constant time.

Examples

Suppose we run:

obj = AllOne()
obj.inc("hello")
obj.inc("hello")
obj.inc("world")

Now the counts are:

Key	Count
`hello`	2
`world`	1

So:

obj.getMaxKey()

can return:

"hello"

and:

obj.getMinKey()

can return:

"world"

Now run:

obj.dec("hello")

The counts become:

Key	Count
`hello`	1
`world`	1

Now either key can be returned by getMaxKey() or getMinKey() because both have the same count.

First Thought: Hash Map Only

The simplest data structure is:

count = {
    "hello": 2,
    "world": 1,
}

Then:

inc(key)
dec(key)

are easy.

But getMaxKey() and getMinKey() become slow because we need to scan every key to find the largest or smallest count.

That takes:

O(n)

where n is the number of keys.

The problem requires O(1) average time for every operation, so a hash map alone is not enough.

Key Insight

We need two things at the same time:

Need	Data Structure
Find a key’s current count quickly	Hash map
Find current minimum and maximum count quickly	Ordered count buckets

The standard solution uses a doubly linked list of buckets.

Each bucket stores:

Field	Meaning
`count`	The count represented by this bucket
`keys`	A set of keys that currently have this count
`prev`	Previous smaller count bucket
`next`	Next larger count bucket

The linked list is sorted by count:

head <-> count 1 <-> count 2 <-> count 5 <-> tail

We also keep:

key_to_bucket[key] = bucket

So when a key changes from count c to count c + 1, we already know where it is. We only need to move it to the adjacent count bucket.

If that adjacent bucket does not exist, we create it.

Because a key only changes by 1, it never needs to jump far across the list.

That is the reason inc and dec can be O(1).

Data Structure

We use two sentinel nodes:

head <-> ... <-> tail

head.next is always the minimum count bucket.

tail.prev is always the maximum count bucket.

The sentinels avoid special cases when inserting or deleting buckets at the ends.

Bucket node:

class Bucket:
    def __init__(self, count: int):
        self.count = count
        self.keys = set()
        self.prev = None
        self.next = None

Main structure:

self.key_to_bucket = {}
self.head = Bucket(0)
self.tail = Bucket(0)
self.head.next = self.tail
self.tail.prev = self.head

Algorithm

`inc(key)`

If the key is new:

It should move into count 1.
If the first bucket already has count 1, use it.
Otherwise insert a new count 1 bucket after head.
Add the key to that bucket.
Store key_to_bucket[key].

If the key already exists:

Find its current bucket.
Its new count is bucket.count + 1.
If the next bucket has that count, move the key there.
Otherwise create a new bucket after the current bucket.
Remove the key from the old bucket.
Delete the old bucket if it becomes empty.

`dec(key)`

The problem guarantees the key exists.

If its current count is 1:

Remove the key from its bucket.
Delete the key from key_to_bucket.
Remove the bucket if empty.

If its current count is greater than 1:

Its new count is bucket.count - 1.
If the previous bucket has that count, move the key there.
Otherwise create a new bucket before the current bucket.
Remove the key from the old bucket.
Delete the old bucket if empty.

`getMaxKey()`

If the structure is empty, return "".

Otherwise, return any key from:

self.tail.prev.keys

`getMinKey()`

If the structure is empty, return "".

Otherwise, return any key from:

self.head.next.keys

Correctness

The linked list stores one bucket for each count that currently appears among the keys. The buckets are kept in increasing count order.

When inc(key) is called for a new key, the key is placed into the count 1 bucket. If that bucket does not exist, the algorithm creates it immediately after head, which is the correct position for the smallest positive count.

When inc(key) is called for an existing key in bucket c, the key’s count becomes c + 1. Since no count between c and c + 1 exists, the correct destination is either the next bucket if it already has count c + 1, or a newly inserted bucket immediately after the current bucket.

The same reasoning applies to dec(key). A key with count c moves to count c - 1. The correct destination is either the previous bucket if it has count c - 1, or a newly inserted bucket immediately before the current bucket. If the count becomes 0, the key is removed.

Whenever a bucket loses its last key, the algorithm removes that bucket. Therefore every remaining bucket represents at least one existing key.

Since the bucket list is always sorted by count, the first real bucket contains keys with the minimum count, and the last real bucket contains keys with the maximum count. Thus getMinKey() and getMaxKey() return correct keys.

Complexity

Operation	Time	Why
`inc`	`O(1)` average	Hash lookup, set add/remove, constant linked-list edits
`dec`	`O(1)` average	Hash lookup, set add/remove, constant linked-list edits
`getMaxKey`	`O(1)` average	Read from last bucket
`getMinKey`	`O(1)` average	Read from first bucket

Space complexity is:

O(n)

where n is the number of keys.

Implementation

class Bucket:
    def __init__(self, count: int):
        self.count = count
        self.keys = set()
        self.prev = None
        self.next = None

class AllOne:
    def __init__(self):
        self.key_to_bucket = {}

        self.head = Bucket(0)
        self.tail = Bucket(0)

        self.head.next = self.tail
        self.tail.prev = self.head

    def _insert_after(self, node: Bucket, new_node: Bucket) -> None:
        nxt = node.next

        node.next = new_node
        new_node.prev = node

        new_node.next = nxt
        nxt.prev = new_node

    def _remove_bucket(self, node: Bucket) -> None:
        prev_node = node.prev
        next_node = node.next

        prev_node.next = next_node
        next_node.prev = prev_node

    def inc(self, key: str) -> None:
        if key not in self.key_to_bucket:
            first = self.head.next

            if first is self.tail or first.count != 1:
                first = Bucket(1)
                self._insert_after(self.head, first)

            first.keys.add(key)
            self.key_to_bucket[key] = first
            return

        curr = self.key_to_bucket[key]
        new_count = curr.count + 1
        nxt = curr.next

        if nxt is self.tail or nxt.count != new_count:
            nxt = Bucket(new_count)
            self._insert_after(curr, nxt)

        nxt.keys.add(key)
        self.key_to_bucket[key] = nxt

        curr.keys.remove(key)

        if not curr.keys:
            self._remove_bucket(curr)

    def dec(self, key: str) -> None:
        curr = self.key_to_bucket[key]

        if curr.count == 1:
            curr.keys.remove(key)
            del self.key_to_bucket[key]

            if not curr.keys:
                self._remove_bucket(curr)

            return

        new_count = curr.count - 1
        prev_node = curr.prev

        if prev_node is self.head or prev_node.count != new_count:
            prev_node = Bucket(new_count)
            self._insert_after(curr.prev, prev_node)

        prev_node.keys.add(key)
        self.key_to_bucket[key] = prev_node

        curr.keys.remove(key)

        if not curr.keys:
            self._remove_bucket(curr)

    def getMaxKey(self) -> str:
        if self.tail.prev is self.head:
            return ""

        return next(iter(self.tail.prev.keys))

    def getMinKey(self) -> str:
        if self.head.next is self.tail:
            return ""

        return next(iter(self.head.next.keys))

Code Explanation

The Bucket class represents one count value:

self.count = count
self.keys = set()

All keys inside the same bucket have the same count.

The doubly linked list uses two sentinels:

self.head = Bucket(0)
self.tail = Bucket(0)

The real buckets live between them.

The helper _insert_after inserts a bucket in constant time:

def _insert_after(self, node, new_node):

The helper _remove_bucket removes an empty bucket in constant time.

In inc, a new key always goes to count 1.

if key not in self.key_to_bucket:

If no count 1 bucket exists, we create one after head.

For an existing key, we move it from count c to count c + 1.

new_count = curr.count + 1

The next bucket is the only possible existing destination.

In dec, a key with count 1 is removed completely.

if curr.count == 1:

Otherwise, it moves from count c to count c - 1.

The previous bucket is the only possible existing destination.

The min and max methods are direct because the linked list is sorted:

self.head.next
self.tail.prev

Testing

def run_tests():
    obj = AllOne()

    assert obj.getMaxKey() == ""
    assert obj.getMinKey() == ""

    obj.inc("hello")
    obj.inc("hello")
    obj.inc("world")

    assert obj.getMaxKey() == "hello"
    assert obj.getMinKey() == "world"

    obj.dec("hello")

    assert obj.getMaxKey() in {"hello", "world"}
    assert obj.getMinKey() in {"hello", "world"}

    obj.dec("hello")

    assert obj.getMaxKey() == "world"
    assert obj.getMinKey() == "world"

    obj.dec("world")

    assert obj.getMaxKey() == ""
    assert obj.getMinKey() == ""

    obj.inc("a")
    obj.inc("b")
    obj.inc("b")
    obj.inc("c")
    obj.inc("c")
    obj.inc("c")

    assert obj.getMinKey() == "a"
    assert obj.getMaxKey() == "c"

    obj.dec("c")
    obj.dec("c")

    assert obj.getMaxKey() == "b"
    assert obj.getMinKey() in {"a", "c"}

    print("all tests passed")

run_tests()

Test meaning:

Test	Why
Empty structure	Checks empty-string return
Repeated increment	Checks max bucket updates
Decrement to equal count	Checks ties
Decrement to zero	Checks key removal
Multiple buckets	Checks min and max at opposite ends
Bucket deletion	Checks empty bucket cleanup