# LeetCode 381: Insert Delete GetRandom O(1) - Duplicates Allowed

## Problem Restatement

We need to implement a data structure called `RandomizedCollection`.

It behaves like a multiset, so duplicate values are allowed.

It must support three operations in average `O(1)` time:

| Operation | Meaning |
|---|---|
| `insert(val)` | Insert one occurrence of `val` |
| `remove(val)` | Remove one occurrence of `val` if present |
| `getRandom()` | Return a random element from the current collection |

For `getRandom()`, every stored occurrence has equal probability. So if the collection contains:

```python
[1, 1, 2]
```

then `1` should be returned with probability `2 / 3`, and `2` should be returned with probability `1 / 3`.

`insert(val)` returns `True` only when `val` was not already present before this insertion. Otherwise, it returns `False`.

`remove(val)` returns `True` if one occurrence was removed. Otherwise, it returns `False`.

## Input and Output

| Method | Input | Output |
|---|---|---|
| `RandomizedCollection()` | None | Initializes the collection |
| `insert(val)` | Integer `val` | `True` if first occurrence, otherwise `False` |
| `remove(val)` | Integer `val` | `True` if one occurrence was removed, otherwise `False` |
| `getRandom()` | None | A random current element |

Example class shape:

```python
class RandomizedCollection:

    def __init__(self):
        ...

    def insert(self, val: int) -> bool:
        ...

    def remove(self, val: int) -> bool:
        ...

    def getRandom(self) -> int:
        ...
```

## Examples

Create an empty collection:

```python
collection = RandomizedCollection()
```

Insert `1`:

```python
collection.insert(1)
```

This returns:

```python
True
```

because `1` was not already present.

Insert another `1`:

```python
collection.insert(1)
```

This returns:

```python
False
```

because `1` was already present.

Now the collection is:

```python
[1, 1]
```

Insert `2`:

```python
collection.insert(2)
```

This returns:

```python
True
```

Now the collection is:

```python
[1, 1, 2]
```

Call:

```python
collection.getRandom()
```

The result should follow these probabilities:

| Value | Probability |
|---|---|
| `1` | `2 / 3` |
| `2` | `1 / 3` |

Remove one `1`:

```python
collection.remove(1)
```

This returns:

```python
True
```

Now the collection contains one `1` and one `2`.

## First Thought: Reuse RandomizedSet

LeetCode 380 used:

| Structure | Meaning |
|---|---|
| List | Store all values for random access |
| Hash map | Map each value to its index |

That works when values are unique.

For this problem, duplicates are allowed. A value may appear at many indices.

So this map is no longer enough:

```python
value -> index
```

We need:

```python
value -> set of indices
```

For example, if the array is:

```python
values = [1, 1, 2]
```

then the index map should be:

```python
{
    1: {0, 1},
    2: {2},
}
```

This lets us remove one occurrence of a value in average `O(1)` time.

## Key Insight

Use two structures:

| Structure | Meaning |
|---|---|
| `values` | A list containing every occurrence |
| `indices` | A map from value to the set of positions where it appears |

The list allows `getRandom()` in `O(1)` by choosing a random index.

The map of sets allows `insert()` and `remove()` to find and update occurrences in average `O(1)`.

The hard operation is `remove(val)`.

As in LeetCode 380, we avoid deleting from the middle of the list. We remove by swapping the target occurrence with the last occurrence, then popping the end.

## Swap and Pop With Duplicates

Suppose:

```python
values = [1, 1, 2, 3]
indices = {
    1: {0, 1},
    2: {2},
    3: {3},
}
```

Now remove one occurrence of `1`.

Pick one index of `1`, for example:

```python
remove_idx = 0
```

The last value is:

```python
last_val = 3
last_idx = 3
```

Move `3` into index `0`:

```python
values = [3, 1, 2, 3]
```

Now update the index sets:

```python
indices[1].remove(0)
indices[3].remove(3)
indices[3].add(0)
```

Then pop the last element:

```python
values = [3, 1, 2]
```

Now the structures are correct:

```python
indices = {
    1: {1},
    2: {2},
    3: {0},
}
```

## Algorithm

Initialize:

```python
values = []
indices = {}
```

For `insert(val)`:

1. Check whether `val` is already present.
2. Append `val` to `values`.
3. Add the new index to `indices[val]`.
4. Return `True` if `val` was absent before insertion, otherwise `False`.

For `remove(val)`:

1. If `val` is absent, return `False`.
2. Take one index from `indices[val]`.
3. Read the last value from `values`.
4. Move the last value into the removed index.
5. Update the index set for the last value.
6. Pop the last element from `values`.
7. If `indices[val]` becomes empty, delete it from the map.
8. Return `True`.

For `getRandom()`:

1. Pick a random element from `values`.
2. Return it.

## Correctness

The list `values` stores every current occurrence in the collection.

The map `indices` stores exactly the list positions for each value.

When `insert(val)` is called, the algorithm appends one new occurrence to `values` and records its new index in `indices[val]`. Therefore the list and map both include the new occurrence.

When `remove(val)` is called and `val` is absent, there is no occurrence to remove, so returning `False` is correct.

When `remove(val)` is called and `val` is present, the algorithm removes exactly one index from `indices[val]`. This represents removing one occurrence. To keep list deletion constant time, it moves the last list element into the removed index and updates that moved value's index set. Then it pops the last list slot.

After the swap and pop, every remaining occurrence still appears exactly once in `values`, and every index set in `indices` points to the correct positions.

If a value has no remaining indices, deleting the key keeps the map consistent with the collection.

`getRandom()` chooses uniformly among indices in `values`. Since each occurrence has exactly one slot in `values`, each occurrence has equal probability. Therefore values with more occurrences are returned with proportionally higher probability.

## Complexity

Let `n` be the number of stored occurrences.

| Operation | Time | Why |
|---|---|---|
| `insert(val)` | `O(1)` average | List append and hash set insert |
| `remove(val)` | `O(1)` average | Hash set pop, index updates, list pop |
| `getRandom()` | `O(1)` | Random list access |
| Space | `O(n)` | Store every occurrence and its index |

## Implementation

```python
import random
from collections import defaultdict

class RandomizedCollection:

    def __init__(self):
        self.values = []
        self.indices = defaultdict(set)

    def insert(self, val: int) -> bool:
        was_absent = len(self.indices[val]) == 0

        self.indices[val].add(len(self.values))
        self.values.append(val)

        return was_absent

    def remove(self, val: int) -> bool:
        if len(self.indices[val]) == 0:
            return False

        remove_idx = self.indices[val].pop()
        last_val = self.values[-1]
        last_idx = len(self.values) - 1

        self.values[remove_idx] = last_val

        self.indices[last_val].add(remove_idx)
        self.indices[last_val].discard(last_idx)

        self.values.pop()

        if len(self.indices[val]) == 0:
            del self.indices[val]

        return True

    def getRandom(self) -> int:
        return random.choice(self.values)
```

## Code Explanation

We store all occurrences in a list:

```python
self.values = []
```

We map each value to all indices where it appears:

```python
self.indices = defaultdict(set)
```

In `insert`, we check whether this value had no occurrences before:

```python
was_absent = len(self.indices[val]) == 0
```

Then we add the new index:

```python
self.indices[val].add(len(self.values))
self.values.append(val)
```

For removal, we first check whether the value exists:

```python
if len(self.indices[val]) == 0:
    return False
```

Then we remove one stored index for that value:

```python
remove_idx = self.indices[val].pop()
```

We read the last value and last index:

```python
last_val = self.values[-1]
last_idx = len(self.values) - 1
```

Move the last value into the removed index:

```python
self.values[remove_idx] = last_val
```

Then update the moved value's index set:

```python
self.indices[last_val].add(remove_idx)
self.indices[last_val].discard(last_idx)
```

Using `discard` is safe even when `remove_idx == last_idx`.

Finally, remove the last list slot:

```python
self.values.pop()
```

If the removed value no longer has occurrences, delete its key:

```python
if len(self.indices[val]) == 0:
    del self.indices[val]
```

For random selection:

```python
return random.choice(self.values)
```

Each list slot is equally likely, and duplicates occupy multiple slots.

## Testing

```python
def test_randomized_collection():
    c = RandomizedCollection()

    assert c.insert(1) is True
    assert c.insert(1) is False
    assert c.insert(2) is True

    assert sorted(c.values) == [1, 1, 2]

    assert c.remove(1) is True
    assert sorted(c.values) == [1, 2]

    assert c.remove(1) is True
    assert sorted(c.values) == [2]

    assert c.remove(1) is False

    assert c.getRandom() == 2

    assert c.insert(2) is False
    assert c.insert(3) is True

    assert sorted(c.values) == [2, 2, 3]

    assert c.remove(2) is True
    assert sorted(c.values) in ([2, 3], [2, 3])

    assert c.getRandom() in {2, 3}

    print("all tests passed")

test_randomized_collection()
```

Test meaning:

| Test | Why |
|---|---|
| Insert first `1` | First occurrence returns `True` |
| Insert duplicate `1` | Duplicate insertion returns `False` |
| Remove `1` once | Only one occurrence is removed |
| Remove `1` twice | Second occurrence can still be removed |
| Remove missing `1` | Missing value returns `False` |
| Single remaining value | `getRandom()` must return that value |
| Duplicate `2` | Confirms repeated values remain valid |
| Remove one duplicate | Confirms only one occurrence is deleted |