LeetCode 287: Find the Duplicate Number

Problem Restatement

We are given an array nums containing n + 1 integers.

Each integer is in the range:

1 <= nums[i] <= n

Since there are n + 1 numbers but only n possible values, at least one number must repeat.

The problem says there is exactly one repeated number, but it may appear more than twice.

We need to return that repeated number.

We must solve it without modifying nums and using only constant extra space. The official constraints include nums.length == n + 1, values in [1, n], and exactly one integer appearing two or more times.

Input and Output

Item	Meaning
Input	Integer array `nums`
Output	The repeated number
Length	`nums.length == n + 1`
Value range	Every value is from `1` to `n`
Requirement	Do not modify `nums`
Space	Use `O(1)` extra space

Function shape:

class Solution:
    def findDuplicate(self, nums: list[int]) -> int:
        ...

Examples

For:

nums = [1, 3, 4, 2, 2]

The repeated number is:

So the answer is:

For:

nums = [3, 1, 3, 4, 2]

The repeated number is:

For:

nums = [3, 3, 3, 3, 3]

The repeated number is still:

First Thought: Use a Set

A simple solution is to scan the array and remember values we have already seen.

class Solution:
    def findDuplicate(self, nums: list[int]) -> int:
        seen = set()

        for num in nums:
            if num in seen:
                return num

            seen.add(num)

        return -1

This is correct.

When we see a number for the second time, that number must be the duplicate.

Problem With the Set Solution

The set solution uses extra memory.

In the worst case, it may store almost all values before finding the duplicate.

So the space complexity is:

O(n)

The problem asks for constant extra space.

Sorting also does not satisfy the main requirement because sorting modifies the array unless we copy it, and copying also uses extra space.

We need a method that reads the array without changing it.

Key Insight

Treat the array like a linked list.

Each index points to another index:

next_index = nums[index]

This works because every value in nums is between 1 and n, so every nums[index] is a valid index inside the array.

The duplicate creates a cycle.

Example:

nums = [1, 3, 4, 2, 2]

Start at index 0.

Follow the values:

0 -> nums[0] = 1
1 -> nums[1] = 3
3 -> nums[3] = 2
2 -> nums[2] = 4
4 -> nums[4] = 2
2 -> nums[2] = 4
4 -> nums[4] = 2
...

We enter the cycle:

2 -> 4 -> 2

The entrance to this cycle is the duplicate number.

So the problem becomes:

Find the cycle entrance in a linked list.

We can use Floyd’s cycle detection algorithm.

Algorithm

Use two pointers:

Pointer	Movement
`slow`	Moves one step
`fast`	Moves two steps

First phase: find an intersection inside the cycle.

slow = nums[0]
fast = nums[0]

while True:
    slow = nums[slow]
    fast = nums[nums[fast]]

    if slow == fast:
        break

Second phase: find the cycle entrance.

Move one pointer back to the start.

slow = nums[0]

Then move both pointers one step at a time:

slow = nums[slow]
fast = nums[fast]

When they meet, that value is the duplicate.

Correctness

The array defines a directed graph where each index has exactly one outgoing edge.

Because there are n + 1 indices and values point into the range 1..n, following pointers from index 0 must eventually repeat an index. Therefore the path enters a cycle.

The repeated number is the entrance to that cycle. Multiple indices point to the same value, so that value is where two paths merge. In linked-list terms, it is the first node of the cycle.

In the first phase, Floyd’s algorithm moves slow one step and fast two steps. Since both pointers eventually enter the cycle, the faster pointer must catch the slower pointer somewhere inside the cycle.

In the second phase, one pointer starts from the beginning of the path and the other starts from the meeting point. Moving both one step at a time makes them meet exactly at the cycle entrance.

That entrance is the duplicate value. Therefore the algorithm returns the repeated number.

Complexity

Metric	Value	Why
Time	`O(n)`	Each pointer moves at most a linear number of steps
Space	`O(1)`	Only two pointers are stored
Modifies input	No	The array is only read

Implementation

class Solution:
    def findDuplicate(self, nums: list[int]) -> int:
        slow = nums[0]
        fast = nums[0]

        while True:
            slow = nums[slow]
            fast = nums[nums[fast]]

            if slow == fast:
                break

        slow = nums[0]

        while slow != fast:
            slow = nums[slow]
            fast = nums[fast]

        return slow

Code Explanation

We initialize both pointers from nums[0]:

slow = nums[0]
fast = nums[0]

Then we move them at different speeds:

slow = nums[slow]
fast = nums[nums[fast]]

slow takes one jump.

fast takes two jumps.

When they meet:

if slow == fast:
    break

we know they are inside the cycle.

Then we reset slow to the start of the path:

slow = nums[0]

Now both pointers move one step at a time:

slow = nums[slow]
fast = nums[fast]

The place where they meet is the cycle entrance.

That entrance is the duplicate number:

return slow

Testing

def test_find_duplicate():
    s = Solution()

    assert s.findDuplicate([1, 3, 4, 2, 2]) == 2
    assert s.findDuplicate([3, 1, 3, 4, 2]) == 3
    assert s.findDuplicate([3, 3, 3, 3, 3]) == 3
    assert s.findDuplicate([1, 1]) == 1
    assert s.findDuplicate([2, 2, 2, 2, 2]) == 2

    print("all tests passed")

test_find_duplicate()

Test meaning:

Test	Why
`[1, 3, 4, 2, 2]`	Standard example
`[3, 1, 3, 4, 2]`	Duplicate near the front
`[3, 3, 3, 3, 3]`	Duplicate appears many times
`[1, 1]`	Smallest valid case
`[2, 2, 2, 2, 2]`	Same duplicate repeated throughout