Skip to content

LeetCode 706: Design HashMap

A clear explanation of designing a hash map without using built-in hash table libraries.

Problem Restatement

We need to design a class called MyHashMap.

The class should support three operations:

OperationMeaning
put(key, value)Insert or update the value for key
get(key)Return the value for key, or -1 if missing
remove(key)Remove key and its value if it exists

We cannot use built-in hash table libraries.

A hash map stores key-value pairs. Each key maps to one value.

If put is called with a key that already exists, we update the old value.

Input and Output

ItemMeaning
ConstructorCreates an empty hash map
put(key, value)Returns nothing
get(key)Returns the mapped value, or -1
remove(key)Returns nothing
Existing key in putUpdate its value
Missing key in removeDo nothing

The class shape is:

class MyHashMap:

    def __init__(self):
        ...

    def put(self, key: int, value: int) -> None:
        ...

    def get(self, key: int) -> int:
        ...

    def remove(self, key: int) -> None:
        ...

Examples

Example:

myHashMap = MyHashMap()
myHashMap.put(1, 1)
myHashMap.put(2, 2)
myHashMap.get(1)
myHashMap.get(3)
myHashMap.put(2, 1)
myHashMap.get(2)
myHashMap.remove(2)
myHashMap.get(2)

The returned values are:

[None, None, None, 1, -1, None, 1, None, -1]

Step by step:

OperationMap After OperationReturn
put(1, 1){1: 1}None
put(2, 2){1: 1, 2: 2}None
get(1){1: 1, 2: 2}1
get(3){1: 1, 2: 2}-1
put(2, 1){1: 1, 2: 1}None
get(2){1: 1, 2: 1}1
remove(2){1: 1}None
get(2){1: 1}-1

First Thought: Direct Addressing

If the key range is small enough, we can use the key directly as an array index.

For example:

data[key] = value

For missing keys, store:

-1

Then:

put(key, value): data[key] = value
get(key):        return data[key]
remove(key):     data[key] = -1

This is simple and fast.

Problem With Direct Addressing

Direct addressing allocates memory for every possible key, even if only a few keys are used.

If keys are in the range:

0 <= key <= 1_000_000

then we need an array of size:

1_000_001

For this problem, that is acceptable.

But a real hash map should avoid allocating space proportional to the maximum possible key. It should allocate buckets and store only keys that are actually inserted.

Key Insight

Use a bucket array.

A hash function maps each key to one bucket:

bucket_index = key % bucket_count

Each bucket stores pairs:

[key, value]

Different keys can map to the same bucket. This is called a collision.

For example, if:

bucket_count = 1009

then keys 1, 1010, and 2019 all map to bucket 1.

To handle this, each bucket can be a list of key-value pairs.

When we operate on a key, we only scan its bucket.

Algorithm

Use an array of buckets.

Each bucket is a list of pairs.

During initialization:

  1. Choose a bucket count.
  2. Create that many empty lists.

For put(key, value):

  1. Compute the bucket index.
  2. Search the bucket for key.
  3. If found, update its value.
  4. If not found, append [key, value].

For get(key):

  1. Compute the bucket index.
  2. Search the bucket for key.
  3. If found, return its value.
  4. Otherwise, return -1.

For remove(key):

  1. Compute the bucket index.
  2. Search the bucket for key.
  3. If found, remove that pair.
  4. Otherwise, do nothing.

Correctness

Each key always maps to one deterministic bucket, computed by key % bucket_count.

When put inserts a new key, it places the key-value pair in that bucket. When put sees an existing key in the bucket, it updates the stored value instead of adding a duplicate pair. Therefore, each key has at most one active mapping.

When get searches for a key, it computes the same bucket. Since every inserted key is stored in its computed bucket, searching that bucket is sufficient. If the key is found, the method returns its latest value. If the key is missing from that bucket, then the key has no mapping in the map.

When remove searches for a key, it also computes the same bucket. If the key is found, removing its pair deletes the mapping. If the key is not found, no mapping exists, so doing nothing is correct.

Therefore, the data structure satisfies the required hash map behavior.

Complexity

Let b be the number of buckets and m be the number of pairs in one bucket.

OperationAverage TimeWorst-case TimeWhy
putO(1)O(m)Search one bucket, then update or append
getO(1)O(m)Search one bucket
removeO(1)O(m)Search one bucket, then remove
SpaceO(b + n)O(b + n)Bucket array plus inserted pairs

With a reasonable bucket count, most buckets stay small.

Implementation

class MyHashMap:

    def __init__(self):
        self.size = 1009
        self.buckets = [[] for _ in range(self.size)]

    def _hash(self, key: int) -> int:
        return key % self.size

    def put(self, key: int, value: int) -> None:
        index = self._hash(key)
        bucket = self.buckets[index]

        for pair in bucket:
            if pair[0] == key:
                pair[1] = value
                return

        bucket.append([key, value])

    def get(self, key: int) -> int:
        index = self._hash(key)
        bucket = self.buckets[index]

        for stored_key, stored_value in bucket:
            if stored_key == key:
                return stored_value

        return -1

    def remove(self, key: int) -> None:
        index = self._hash(key)
        bucket = self.buckets[index]

        for i, pair in enumerate(bucket):
            if pair[0] == key:
                bucket.pop(i)
                return

Code Explanation

The constructor creates a fixed number of buckets:

self.size = 1009
self.buckets = [[] for _ in range(self.size)]

The hash helper maps a key to a bucket index:

def _hash(self, key: int) -> int:
    return key % self.size

The put method first finds the correct bucket:

index = self._hash(key)
bucket = self.buckets[index]

Then it scans that bucket.

If the key already exists, update the value:

for pair in bucket:
    if pair[0] == key:
        pair[1] = value
        return

If the key does not exist, append a new pair:

bucket.append([key, value])

The get method searches the same bucket:

for stored_key, stored_value in bucket:
    if stored_key == key:
        return stored_value

If no pair matches, return:

-1

The remove method finds the matching pair and deletes it:

for i, pair in enumerate(bucket):
    if pair[0] == key:
        bucket.pop(i)
        return

Returning immediately after removal avoids modifying the list while continuing to iterate.

Alternative: Direct Addressing

Because the key range is bounded, direct addressing is also valid.

class MyHashMap:

    def __init__(self):
        self.data = [-1] * 1_000_001

    def put(self, key: int, value: int) -> None:
        self.data[key] = value

    def get(self, key: int) -> int:
        return self.data[key]

    def remove(self, key: int) -> None:
        self.data[key] = -1

This version is shorter and gives guaranteed constant-time operations.

Its tradeoff is memory. It always allocates space for all possible keys.

Direct Addressing Complexity

OperationTimeSpace
putO(1)O(1_000_001)
getO(1)O(1_000_001)
removeO(1)O(1_000_001)

Testing

def test_my_hash_map():
    hm = MyHashMap()

    hm.put(1, 1)
    hm.put(2, 2)

    assert hm.get(1) == 1
    assert hm.get(3) == -1

    hm.put(2, 1)
    assert hm.get(2) == 1

    hm.remove(2)
    assert hm.get(2) == -1

    hm.remove(42)
    assert hm.get(42) == -1

    hm.put(1009, 10)
    hm.put(2018, 20)

    assert hm.get(1009) == 10
    assert hm.get(2018) == 20

    hm.put(1009, 99)
    assert hm.get(1009) == 99

    hm.remove(1009)

    assert hm.get(1009) == -1
    assert hm.get(2018) == 20

    print("all tests passed")

test_my_hash_map()

Test coverage:

TestWhy
Insert and getConfirms basic mapping
Missing keyConfirms -1 result
Update existing keyConfirms overwrite behavior
Remove existing keyConfirms deletion
Remove missing keyConfirms no error
Colliding keysConfirms bucket collision handling
Update colliding keyConfirms correct pair is updated