# LeetCode 217: Contains Duplicate

## Problem Restatement

We are given an integer array `nums`.

We need to determine whether any value appears at least twice.

Return:

| Return value | Meaning |
|---|---|
| `True` | Some value appears more than once |
| `False` | Every value is distinct |

The official statement asks us to return `true` if any value appears at least twice in the array, and `false` if every element is distinct. ([leetcode.com](https://leetcode.com/problems/contains-duplicate/?utm_source=chatgpt.com))

## Input and Output

| Item | Meaning |
|---|---|
| Input | Integer array `nums` |
| Output | Boolean |
| `True` | Duplicate exists |
| `False` | All elements are unique |

Example function shape:

```python
def containsDuplicate(nums: list[int]) -> bool:
    ...
```

## Examples

Example 1:

```python
nums = [1, 2, 3, 1]
```

The value `1` appears twice.

Answer:

```python
True
```

Example 2:

```python
nums = [1, 2, 3, 4]
```

All values are distinct.

Answer:

```python
False
```

Example 3:

```python
nums = [1, 1, 1, 3, 3, 4, 3, 2, 4, 2]
```

Several values repeat.

Answer:

```python
True
```

## First Thought

We could compare every pair of elements.

For each index `i`, check every later index `j`.

```python
for i in range(len(nums)):
    for j in range(i + 1, len(nums)):
        if nums[i] == nums[j]:
            return True
```

This works, but it is slow.

## Problem With Brute Force

For an array of length `n`, the nested loops may compare:

```text
n * (n - 1) / 2
```

pairs.

That gives:

```text
O(n^2)
```

time complexity.

We can do much better using a hash set.

## Key Insight

A set supports fast membership checking.

While scanning the array:

- if we see a number for the first time, add it to the set
- if the number already exists in the set, we found a duplicate

This lets us detect duplicates in one pass.

## Algorithm

1. Create an empty set `seen`.
2. Scan the array from left to right.
3. For each number:
   - If it already exists in `seen`, return `True`.
   - Otherwise add it to `seen`.
4. If the scan finishes, return `False`.

## Walkthrough

Use:

```python
nums = [1, 2, 3, 1]
```

Start:

```python
seen = set()
```

Process `1`:

```text
1 not in seen
```

Add it:

```python
seen = {1}
```

Process `2`:

```python
seen = {1, 2}
```

Process `3`:

```python
seen = {1, 2, 3}
```

Process `1` again:

```text
1 already exists in seen
```

Return:

```python
True
```

## Correctness

The set `seen` always contains exactly the distinct values encountered earlier in the array.

When processing a number:

- if the number already exists in `seen`, then the current occurrence and the earlier occurrence form a duplicate pair, so returning `True` is correct
- otherwise the number has not appeared before, so adding it to `seen` preserves the invariant

If the algorithm finishes scanning the array without finding a repeated value, then every element appeared exactly once, so returning `False` is correct.

Therefore the algorithm correctly determines whether the array contains duplicates.

## Complexity

| Metric | Value | Why |
|---|---:|---|
| Time | `O(n)` average | Each set lookup and insertion is average `O(1)` |
| Space | `O(n)` | The set may store all elements |

## Hash Set Implementation

```python
class Solution:
    def containsDuplicate(self, nums: list[int]) -> bool:
        seen = set()

        for num in nums:
            if num in seen:
                return True

            seen.add(num)

        return False
```

## Code Explanation

Create the set:

```python
seen = set()
```

Scan the array:

```python
for num in nums:
```

Check whether the value already appeared:

```python
if num in seen:
    return True
```

Otherwise store it:

```python
seen.add(num)
```

If no duplicate is found:

```python
return False
```

## Shorter Python Version

Python sets automatically remove duplicates.

So:

```python
len(set(nums))
```

gives the number of distinct elements.

If duplicates exist, the set becomes smaller than the original array.

```python
class Solution:
    def containsDuplicate(self, nums: list[int]) -> bool:
        return len(nums) != len(set(nums))
```

This is concise and commonly used in interviews.

## Sorting Solution

Another approach is sorting.

After sorting, duplicates become adjacent.

Example:

```python
nums = [3, 1, 2, 1]
```

Sorted:

```python
[1, 1, 2, 3]
```

Now duplicates are easy to detect.

```python
class Solution:
    def containsDuplicate(self, nums: list[int]) -> bool:
        nums.sort()

        for i in range(1, len(nums)):
            if nums[i] == nums[i - 1]:
                return True

        return False
```

## Sorting Complexity

| Metric | Value |
|---|---:|
| Time | `O(n log n)` |
| Extra space | Depends on sorting implementation |

The hash set solution is usually preferred because it is linear time.

## Testing

```python
def run_tests():
    s = Solution()

    assert s.containsDuplicate([1, 2, 3, 1]) is True
    assert s.containsDuplicate([1, 2, 3, 4]) is False
    assert s.containsDuplicate([1, 1, 1, 3, 3, 4, 3, 2, 4, 2]) is True
    assert s.containsDuplicate([]) is False
    assert s.containsDuplicate([5]) is False
    assert s.containsDuplicate([-1, -2, -3, -1]) is True

    print("all tests passed")

run_tests()
```

| Test | Why |
|---|---|
| `[1,2,3,1]` | Simple duplicate |
| `[1,2,3,4]` | All unique |
| Multiple repeated values | Many duplicates |
| `[]` | Empty array |
| `[5]` | Single element |
| Negative values | Works for all integers |

