LeetCode 128: Longest Consecutive Sequence

Problem Restatement

We are given an unsorted integer array nums.

We need to return the length of the longest consecutive sequence of numbers.

The numbers do not need to be adjacent in the array. They only need to exist somewhere in the array.

The problem also requires an O(n) time algorithm. Sorting would take O(n log n), so sorting does not satisfy the required bound.

Examples

Example 1:

nums = [100, 4, 200, 1, 3, 2]

The longest consecutive sequence is:

[1, 2, 3, 4]

Its length is:

So the output is:

Example 2:

nums = [0, 3, 7, 2, 5, 8, 4, 6, 0, 1]

The longest consecutive sequence is:

[0, 1, 2, 3, 4, 5, 6, 7, 8]

Its length is:

So the output is:

Example 3:

nums = [1, 0, 1, 2]

The longest consecutive sequence is:

[0, 1, 2]

Even though 1 appears twice, duplicates do not extend the sequence.

The output is:

Input and Output

Item	Meaning
Input	`nums: list[int]`
Output	Length of the longest consecutive integer sequence
Important detail	The array is unsorted
Required time	`O(n)`

Function shape:

class Solution:
    def longestConsecutive(self, nums: list[int]) -> int:
        ...

First Thought: Sort the Array

A natural idea is to sort the array.

For example:

nums = [100, 4, 200, 1, 3, 2]

After sorting:

[1, 2, 3, 4, 100, 200]

Then we can scan from left to right and count consecutive numbers.

This is easy, but sorting costs:

O(n log n)

The problem requires O(n), so we need another method.

Key Insight

We need fast membership checks.

For any number x, we want to quickly ask:

x in nums

A hash set gives expected O(1) lookup.

So we store all numbers in a set:

num_set = set(nums)

Now we can test whether x + 1, x + 2, and so on exist.

But we should not start counting from every number.

For example, in this set:

{1, 2, 3, 4}

Starting from 1 is useful.

Starting from 2 repeats work, because 2 is already inside the sequence that starts at 1.

So we only start counting from a number x when x - 1 does not exist.

That means x is the beginning of a sequence.

Sequence Start Rule

A number starts a consecutive sequence if its previous number is missing.

if num - 1 not in num_set:
    # num is the start of a sequence

Example:

nums = [100, 4, 200, 1, 3, 2]
num_set = {1, 2, 3, 4, 100, 200}

Check each number:

Number	Previous exists?	Start?
`100`	`99` missing	Yes
`4`	`3` exists	No
`200`	`199` missing	Yes
`1`	`0` missing	Yes
`3`	`2` exists	No
`2`	`1` exists	No

Only 100, 200, and 1 are sequence starts.

From 1, we count:

1 -> 2 -> 3 -> 4

Length is 4.

Algorithm

Create a set from nums.

Initialize best = 0.

For each number num in the set:

Check whether num - 1 exists.
If it exists, skip num because it is not the beginning of a sequence.
If it does not exist, start a new sequence from num.
Count upward while num + length exists.
Update best.

Return best.

Correctness

Every consecutive sequence has exactly one smallest number.

For the sequence:

[1, 2, 3, 4]

the smallest number is 1.

Only 1 satisfies:

1 - 1 not in num_set

The other numbers do not satisfy the start rule:

2 - 1 in num_set
3 - 1 in num_set
4 - 1 in num_set

So the algorithm counts each sequence once, starting from its smallest number.

When it starts from that smallest number, it checks x + 1, x + 2, and so on until the next number is missing. Therefore, it counts the full length of that consecutive sequence.

Since every sequence is counted once and best stores the maximum length found, the algorithm returns the length of the longest consecutive sequence.

Complexity

Let n be the length of nums.

Metric	Value	Why
Time	`O(n)`	Each number is used in constant expected-time set operations
Space	`O(n)`	The set stores unique numbers

Although there is a nested while loop, the total work is linear.

The while loop only runs from sequence starts. Across all starts, each unique number belongs to one counted sequence.

Implementation

class Solution:
    def longestConsecutive(self, nums: list[int]) -> int:
        num_set = set(nums)
        best = 0

        for num in num_set:
            if num - 1 in num_set:
                continue

            length = 1

            while num + length in num_set:
                length += 1

            best = max(best, length)

        return best

Code Explanation

We first put every number into a set:

num_set = set(nums)

This also removes duplicates.

For example:

[1, 0, 1, 2]

becomes:

{0, 1, 2}

Then we scan through unique numbers:

for num in num_set:

If the previous number exists, this number is somewhere in the middle of a sequence:

if num - 1 in num_set:
    continue

So we skip it.

If the previous number does not exist, num is the start of a sequence.

We count forward:

length = 1

while num + length in num_set:
    length += 1

For num = 1 and num_set = {1, 2, 3, 4, 100, 200}, this checks:

2 exists
3 exists
4 exists
5 missing

So the length is 4.

Finally, we keep the largest length:

best = max(best, length)

Testing

def run_tests():
    s = Solution()

    assert s.longestConsecutive([100, 4, 200, 1, 3, 2]) == 4
    assert s.longestConsecutive([0, 3, 7, 2, 5, 8, 4, 6, 0, 1]) == 9
    assert s.longestConsecutive([1, 0, 1, 2]) == 3
    assert s.longestConsecutive([]) == 0
    assert s.longestConsecutive([5]) == 1
    assert s.longestConsecutive([10, 30, 20]) == 1
    assert s.longestConsecutive([-2, -1, 0, 1, 5]) == 4

    print("all tests passed")

run_tests()

Test meaning:

Test	Why
`[100, 4, 200, 1, 3, 2]`	Standard unordered case
`[0, 3, 7, 2, 5, 8, 4, 6, 0, 1]`	Long sequence with duplicate
`[1, 0, 1, 2]`	Duplicate value should not extend length
`[]`	Empty input
`[5]`	Single element
`[10, 30, 20]`	No consecutive pair
`[-2, -1, 0, 1, 5]`	Negative numbers