# LeetCode 128: Longest Consecutive Sequence

## Problem Restatement

We are given an unsorted integer array `nums`.

We need to return the length of the longest consecutive sequence of numbers.

The numbers do not need to be adjacent in the array. They only need to exist somewhere in the array.

The problem also requires an `O(n)` time algorithm. Sorting would take `O(n log n)`, so sorting does not satisfy the required bound.

## Examples

Example 1:

```python
nums = [100, 4, 200, 1, 3, 2]
```

The longest consecutive sequence is:

```python
[1, 2, 3, 4]
```

Its length is:

```python
4
```

So the output is:

```python
4
```

Example 2:

```python
nums = [0, 3, 7, 2, 5, 8, 4, 6, 0, 1]
```

The longest consecutive sequence is:

```python
[0, 1, 2, 3, 4, 5, 6, 7, 8]
```

Its length is:

```python
9
```

So the output is:

```python
9
```

Example 3:

```python
nums = [1, 0, 1, 2]
```

The longest consecutive sequence is:

```python
[0, 1, 2]
```

Even though `1` appears twice, duplicates do not extend the sequence.

The output is:

```python
3
```

## Input and Output

| Item | Meaning |
|---|---|
| Input | `nums: list[int]` |
| Output | Length of the longest consecutive integer sequence |
| Important detail | The array is unsorted |
| Required time | `O(n)` |

Function shape:

```python
class Solution:
    def longestConsecutive(self, nums: list[int]) -> int:
        ...
```

## First Thought: Sort the Array

A natural idea is to sort the array.

For example:

```python
nums = [100, 4, 200, 1, 3, 2]
```

After sorting:

```python
[1, 2, 3, 4, 100, 200]
```

Then we can scan from left to right and count consecutive numbers.

This is easy, but sorting costs:

```text
O(n log n)
```

The problem requires `O(n)`, so we need another method.

## Key Insight

We need fast membership checks.

For any number `x`, we want to quickly ask:

```python
x in nums
```

A hash set gives expected `O(1)` lookup.

So we store all numbers in a set:

```python
num_set = set(nums)
```

Now we can test whether `x + 1`, `x + 2`, and so on exist.

But we should not start counting from every number.

For example, in this set:

```python
{1, 2, 3, 4}
```

Starting from `1` is useful.

Starting from `2` repeats work, because `2` is already inside the sequence that starts at `1`.

So we only start counting from a number `x` when `x - 1` does not exist.

That means `x` is the beginning of a sequence.

## Sequence Start Rule

A number starts a consecutive sequence if its previous number is missing.

```python
if num - 1 not in num_set:
    # num is the start of a sequence
```

Example:

```python
nums = [100, 4, 200, 1, 3, 2]
num_set = {1, 2, 3, 4, 100, 200}
```

Check each number:

| Number | Previous exists? | Start? |
|---:|---:|---:|
| `100` | `99` missing | Yes |
| `4` | `3` exists | No |
| `200` | `199` missing | Yes |
| `1` | `0` missing | Yes |
| `3` | `2` exists | No |
| `2` | `1` exists | No |

Only `100`, `200`, and `1` are sequence starts.

From `1`, we count:

```text
1 -> 2 -> 3 -> 4
```

Length is `4`.

## Algorithm

Create a set from `nums`.

Initialize `best = 0`.

For each number `num` in the set:

1. Check whether `num - 1` exists.
2. If it exists, skip `num` because it is not the beginning of a sequence.
3. If it does not exist, start a new sequence from `num`.
4. Count upward while `num + length` exists.
5. Update `best`.

Return `best`.

## Correctness

Every consecutive sequence has exactly one smallest number.

For the sequence:

```python
[1, 2, 3, 4]
```

the smallest number is `1`.

Only `1` satisfies:

```python
1 - 1 not in num_set
```

The other numbers do not satisfy the start rule:

```python
2 - 1 in num_set
3 - 1 in num_set
4 - 1 in num_set
```

So the algorithm counts each sequence once, starting from its smallest number.

When it starts from that smallest number, it checks `x + 1`, `x + 2`, and so on until the next number is missing. Therefore, it counts the full length of that consecutive sequence.

Since every sequence is counted once and `best` stores the maximum length found, the algorithm returns the length of the longest consecutive sequence.

## Complexity

Let `n` be the length of `nums`.

| Metric | Value | Why |
|---|---:|---|
| Time | `O(n)` | Each number is used in constant expected-time set operations |
| Space | `O(n)` | The set stores unique numbers |

Although there is a nested `while` loop, the total work is linear.

The `while` loop only runs from sequence starts. Across all starts, each unique number belongs to one counted sequence.

## Implementation

```python
class Solution:
    def longestConsecutive(self, nums: list[int]) -> int:
        num_set = set(nums)
        best = 0

        for num in num_set:
            if num - 1 in num_set:
                continue

            length = 1

            while num + length in num_set:
                length += 1

            best = max(best, length)

        return best
```

## Code Explanation

We first put every number into a set:

```python
num_set = set(nums)
```

This also removes duplicates.

For example:

```python
[1, 0, 1, 2]
```

becomes:

```python
{0, 1, 2}
```

Then we scan through unique numbers:

```python
for num in num_set:
```

If the previous number exists, this number is somewhere in the middle of a sequence:

```python
if num - 1 in num_set:
    continue
```

So we skip it.

If the previous number does not exist, `num` is the start of a sequence.

We count forward:

```python
length = 1

while num + length in num_set:
    length += 1
```

For `num = 1` and `num_set = {1, 2, 3, 4, 100, 200}`, this checks:

```text
2 exists
3 exists
4 exists
5 missing
```

So the length is `4`.

Finally, we keep the largest length:

```python
best = max(best, length)
```

## Testing

```python
def run_tests():
    s = Solution()

    assert s.longestConsecutive([100, 4, 200, 1, 3, 2]) == 4
    assert s.longestConsecutive([0, 3, 7, 2, 5, 8, 4, 6, 0, 1]) == 9
    assert s.longestConsecutive([1, 0, 1, 2]) == 3
    assert s.longestConsecutive([]) == 0
    assert s.longestConsecutive([5]) == 1
    assert s.longestConsecutive([10, 30, 20]) == 1
    assert s.longestConsecutive([-2, -1, 0, 1, 5]) == 4

    print("all tests passed")

run_tests()
```

Test meaning:

| Test | Why |
|---|---|
| `[100, 4, 200, 1, 3, 2]` | Standard unordered case |
| `[0, 3, 7, 2, 5, 8, 4, 6, 0, 1]` | Long sequence with duplicate |
| `[1, 0, 1, 2]` | Duplicate value should not extend length |
| `[]` | Empty input |
| `[5]` | Single element |
| `[10, 30, 20]` | No consecutive pair |
| `[-2, -1, 0, 1, 5]` | Negative numbers |

