Skip to content

LeetCode 381: Insert Delete GetRandom O(1) - Duplicates Allowed

A clear explanation of designing a randomized multiset with average O(1) insert, remove, and getRandom operations.

Problem Restatement

We need to implement a data structure called RandomizedCollection.

It behaves like a multiset, so duplicate values are allowed.

It must support three operations in average O(1) time:

OperationMeaning
insert(val)Insert one occurrence of val
remove(val)Remove one occurrence of val if present
getRandom()Return a random element from the current collection

For getRandom(), every stored occurrence has equal probability. So if the collection contains:

[1, 1, 2]

then 1 should be returned with probability 2 / 3, and 2 should be returned with probability 1 / 3.

insert(val) returns True only when val was not already present before this insertion. Otherwise, it returns False.

remove(val) returns True if one occurrence was removed. Otherwise, it returns False.

Input and Output

MethodInputOutput
RandomizedCollection()NoneInitializes the collection
insert(val)Integer valTrue if first occurrence, otherwise False
remove(val)Integer valTrue if one occurrence was removed, otherwise False
getRandom()NoneA random current element

Example class shape:

class RandomizedCollection:

    def __init__(self):
        ...

    def insert(self, val: int) -> bool:
        ...

    def remove(self, val: int) -> bool:
        ...

    def getRandom(self) -> int:
        ...

Examples

Create an empty collection:

collection = RandomizedCollection()

Insert 1:

collection.insert(1)

This returns:

True

because 1 was not already present.

Insert another 1:

collection.insert(1)

This returns:

False

because 1 was already present.

Now the collection is:

[1, 1]

Insert 2:

collection.insert(2)

This returns:

True

Now the collection is:

[1, 1, 2]

Call:

collection.getRandom()

The result should follow these probabilities:

ValueProbability
12 / 3
21 / 3

Remove one 1:

collection.remove(1)

This returns:

True

Now the collection contains one 1 and one 2.

First Thought: Reuse RandomizedSet

LeetCode 380 used:

StructureMeaning
ListStore all values for random access
Hash mapMap each value to its index

That works when values are unique.

For this problem, duplicates are allowed. A value may appear at many indices.

So this map is no longer enough:

value -> index

We need:

value -> set of indices

For example, if the array is:

values = [1, 1, 2]

then the index map should be:

{
    1: {0, 1},
    2: {2},
}

This lets us remove one occurrence of a value in average O(1) time.

Key Insight

Use two structures:

StructureMeaning
valuesA list containing every occurrence
indicesA map from value to the set of positions where it appears

The list allows getRandom() in O(1) by choosing a random index.

The map of sets allows insert() and remove() to find and update occurrences in average O(1).

The hard operation is remove(val).

As in LeetCode 380, we avoid deleting from the middle of the list. We remove by swapping the target occurrence with the last occurrence, then popping the end.

Swap and Pop With Duplicates

Suppose:

values = [1, 1, 2, 3]
indices = {
    1: {0, 1},
    2: {2},
    3: {3},
}

Now remove one occurrence of 1.

Pick one index of 1, for example:

remove_idx = 0

The last value is:

last_val = 3
last_idx = 3

Move 3 into index 0:

values = [3, 1, 2, 3]

Now update the index sets:

indices[1].remove(0)
indices[3].remove(3)
indices[3].add(0)

Then pop the last element:

values = [3, 1, 2]

Now the structures are correct:

indices = {
    1: {1},
    2: {2},
    3: {0},
}

Algorithm

Initialize:

values = []
indices = {}

For insert(val):

  1. Check whether val is already present.
  2. Append val to values.
  3. Add the new index to indices[val].
  4. Return True if val was absent before insertion, otherwise False.

For remove(val):

  1. If val is absent, return False.
  2. Take one index from indices[val].
  3. Read the last value from values.
  4. Move the last value into the removed index.
  5. Update the index set for the last value.
  6. Pop the last element from values.
  7. If indices[val] becomes empty, delete it from the map.
  8. Return True.

For getRandom():

  1. Pick a random element from values.
  2. Return it.

Correctness

The list values stores every current occurrence in the collection.

The map indices stores exactly the list positions for each value.

When insert(val) is called, the algorithm appends one new occurrence to values and records its new index in indices[val]. Therefore the list and map both include the new occurrence.

When remove(val) is called and val is absent, there is no occurrence to remove, so returning False is correct.

When remove(val) is called and val is present, the algorithm removes exactly one index from indices[val]. This represents removing one occurrence. To keep list deletion constant time, it moves the last list element into the removed index and updates that moved value’s index set. Then it pops the last list slot.

After the swap and pop, every remaining occurrence still appears exactly once in values, and every index set in indices points to the correct positions.

If a value has no remaining indices, deleting the key keeps the map consistent with the collection.

getRandom() chooses uniformly among indices in values. Since each occurrence has exactly one slot in values, each occurrence has equal probability. Therefore values with more occurrences are returned with proportionally higher probability.

Complexity

Let n be the number of stored occurrences.

OperationTimeWhy
insert(val)O(1) averageList append and hash set insert
remove(val)O(1) averageHash set pop, index updates, list pop
getRandom()O(1)Random list access
SpaceO(n)Store every occurrence and its index

Implementation

import random
from collections import defaultdict

class RandomizedCollection:

    def __init__(self):
        self.values = []
        self.indices = defaultdict(set)

    def insert(self, val: int) -> bool:
        was_absent = len(self.indices[val]) == 0

        self.indices[val].add(len(self.values))
        self.values.append(val)

        return was_absent

    def remove(self, val: int) -> bool:
        if len(self.indices[val]) == 0:
            return False

        remove_idx = self.indices[val].pop()
        last_val = self.values[-1]
        last_idx = len(self.values) - 1

        self.values[remove_idx] = last_val

        self.indices[last_val].add(remove_idx)
        self.indices[last_val].discard(last_idx)

        self.values.pop()

        if len(self.indices[val]) == 0:
            del self.indices[val]

        return True

    def getRandom(self) -> int:
        return random.choice(self.values)

Code Explanation

We store all occurrences in a list:

self.values = []

We map each value to all indices where it appears:

self.indices = defaultdict(set)

In insert, we check whether this value had no occurrences before:

was_absent = len(self.indices[val]) == 0

Then we add the new index:

self.indices[val].add(len(self.values))
self.values.append(val)

For removal, we first check whether the value exists:

if len(self.indices[val]) == 0:
    return False

Then we remove one stored index for that value:

remove_idx = self.indices[val].pop()

We read the last value and last index:

last_val = self.values[-1]
last_idx = len(self.values) - 1

Move the last value into the removed index:

self.values[remove_idx] = last_val

Then update the moved value’s index set:

self.indices[last_val].add(remove_idx)
self.indices[last_val].discard(last_idx)

Using discard is safe even when remove_idx == last_idx.

Finally, remove the last list slot:

self.values.pop()

If the removed value no longer has occurrences, delete its key:

if len(self.indices[val]) == 0:
    del self.indices[val]

For random selection:

return random.choice(self.values)

Each list slot is equally likely, and duplicates occupy multiple slots.

Testing

def test_randomized_collection():
    c = RandomizedCollection()

    assert c.insert(1) is True
    assert c.insert(1) is False
    assert c.insert(2) is True

    assert sorted(c.values) == [1, 1, 2]

    assert c.remove(1) is True
    assert sorted(c.values) == [1, 2]

    assert c.remove(1) is True
    assert sorted(c.values) == [2]

    assert c.remove(1) is False

    assert c.getRandom() == 2

    assert c.insert(2) is False
    assert c.insert(3) is True

    assert sorted(c.values) == [2, 2, 3]

    assert c.remove(2) is True
    assert sorted(c.values) in ([2, 3], [2, 3])

    assert c.getRandom() in {2, 3}

    print("all tests passed")

test_randomized_collection()

Test meaning:

TestWhy
Insert first 1First occurrence returns True
Insert duplicate 1Duplicate insertion returns False
Remove 1 onceOnly one occurrence is removed
Remove 1 twiceSecond occurrence can still be removed
Remove missing 1Missing value returns False
Single remaining valuegetRandom() must return that value
Duplicate 2Confirms repeated values remain valid
Remove one duplicateConfirms only one occurrence is deleted