# LeetCode 757: Set Intersection Size At Least Two

## Problem Restatement

We are given a list of intervals.

Each interval is:

```python
[start, end]
```

We need to build a set `S` of integers such that every interval contains at least two numbers from `S`.

Our goal is to make `S` as small as possible.

Return the minimum possible size of `S`.

## Input and Output

| Item | Meaning |
|---|---|
| Input | A list of intervals `[start, end]` |
| Output | The minimum size of a set `S` |
| Requirement | Every interval must contain at least two integers from `S` |

Example function shape:

```python
def intersectionSizeTwo(intervals: list[list[int]]) -> int:
    ...
```

## Examples

Example 1:

```python
intervals = [[1, 3], [1, 4], [2, 5], [3, 5]]
```

Output:

```python
3
```

One valid set is:

```python
{2, 3, 5}
```

Check each interval:

| Interval | Numbers from `{2,3,5}` |
|---|---|
| `[1,3]` | `2,3` |
| `[1,4]` | `2,3` |
| `[2,5]` | `2,3,5` |
| `[3,5]` | `3,5` |

Every interval contains at least two numbers.

No set of size `2` can satisfy all intervals.

Example 2:

```python
intervals = [[1, 2], [2, 3], [2, 4], [4, 5]]
```

Output:

```python
5
```

One valid set is:

```python
{1, 2, 3, 4, 5}
```

## First Thought: Try All Possible Sets

One brute-force idea is to try choosing numbers and check whether every interval contains at least two of them.

But intervals may contain many integers, and there are exponentially many subsets.

We need a more structured strategy.

This is a greedy interval problem.

## Key Insight

Suppose we process intervals from left to right.

When we need to add numbers for an interval, we should place them as far right as possible.

Why?

Because numbers near the right end are more likely to also help future intervals.

This is the same general idea used in interval scheduling and covering problems.

## Sorting Strategy

We sort intervals by:

1. increasing end
2. decreasing start when ends are equal

Code:

```python
intervals.sort(key=lambda x: (x[1], -x[0]))
```

Why decreasing start for ties?

Suppose we have:

```text
[1, 5]
[3, 5]
```

The smaller interval `[3,5]` is more restrictive.

Processing it first avoids wasting points too early.

## Greedy State

For every processed interval, we only need to remember the two largest chosen points so far.

Call them:

```python
a < b
```

These are the two most recent selected numbers.

When processing a new interval `[start, end]`, there are three cases.

## Case 1: Both Points Already Inside

If:

```python
start <= a
```

then both `a` and `b` lie inside the interval.

So this interval already has at least two selected numbers.

We do nothing.

## Case 2: Only One Point Inside

If:

```python
a < start <= b
```

then only `b` lies inside the interval.

We need one more number.

To maximize usefulness for future intervals, choose the largest possible value:

```python
end
```

Then update:

```python
a = b
b = end
```

## Case 3: No Points Inside

If:

```python
start > b
```

then neither `a` nor `b` lies inside the interval.

We must add two new numbers.

Again, choose the largest possible ones:

```python
end - 1
end
```

Then update:

```python
a = end - 1
b = end
```

This greedy choice leaves maximum flexibility for future intervals.

## Why Choose the Largest Numbers?

Suppose an interval is:

```text
[start, end]
```

Any chosen number must lie inside it.

Numbers closer to `end` are more likely to overlap future intervals, because future intervals after sorting tend to end later.

So selecting the largest valid numbers preserves the most reuse.

## Algorithm

1. Sort intervals by:
   1. increasing `end`
   2. decreasing `start`
2. Initialize:
   1. `a = -1`
   2. `b = -1`
   3. `answer = 0`
3. Process each interval `[start, end]`:
   1. If `start <= a`, do nothing.
   2. Else if `start > b`:
      1. Add two numbers: `end - 1`, `end`
      2. Increase answer by `2`
      3. Update `a`, `b`
   3. Else:
      1. Add one number: `end`
      2. Increase answer by `1`
      3. Update `a`, `b`
4. Return `answer`.

## Correctness

The intervals are processed in increasing order of right endpoint. Therefore, when processing an interval `[start, end]`, every future interval ends at least as far right as `end`.

Suppose the current interval already contains two selected points. Then adding more points would only increase the size of the set unnecessarily, so skipping is optimal.

Suppose the current interval contains exactly one selected point. Then we must add one more point inside the interval. Choosing the largest possible point, namely `end`, is optimal because every future interval ends at or after `end`. Any smaller choice could only reduce the chance of helping future intervals.

Suppose the current interval contains no selected points. Then we must add two points inside the interval. Choosing the two largest possible points, `end - 1` and `end`, again maximizes the chance that future intervals also contain them.

Thus every greedy step makes the locally optimal choice that preserves maximum usefulness for future intervals.

Since the algorithm adds points only when required, and always chooses the most reusable points possible, the final number of selected points is minimal.

## Complexity

Let `n` be the number of intervals.

| Metric | Value | Why |
|---|---|---|
| Time | `O(n log n)` | Sorting dominates |
| Space | `O(1)` | Only a few variables are used after sorting |

## Implementation

```python
class Solution:
    def intersectionSizeTwo(self, intervals: list[list[int]]) -> int:
        intervals.sort(key=lambda x: (x[1], -x[0]))

        a = -1
        b = -1
        answer = 0

        for start, end in intervals:
            if start <= a:
                continue

            if start > b:
                answer += 2
                a = end - 1
                b = end
            else:
                answer += 1
                a = b
                b = end

        return answer
```

## Code Explanation

We first sort intervals:

```python
intervals.sort(key=lambda x: (x[1], -x[0]))
```

Intervals with smaller ending positions are processed first.

If two intervals end at the same place, the narrower one is processed first.

The variables:

```python
a
b
```

store the two largest selected numbers so far.

Initially:

```python
a = -1
b = -1
```

meaning no useful points exist yet.

When processing an interval:

```python
for start, end in intervals:
```

the first case is:

```python
if start <= a:
    continue
```

Since `a < b`, both numbers lie inside the interval.

So this interval already contains at least two selected points.

The second case is:

```python
if start > b:
```

Neither `a` nor `b` lies inside the interval.

So we must add two new numbers:

```python
a = end - 1
b = end
```

These are the largest possible choices.

The third case is the remaining one:

```python
a < start <= b
```

Exactly one selected point lies inside the interval.

So we add one more number:

```python
b = end
```

and shift the old `b` into `a`:

```python
a = b
```

The variable `answer` tracks the total number of chosen integers.

## Testing

```python
def run_tests():
    s = Solution()

    assert s.intersectionSizeTwo(
        [[1, 3], [1, 4], [2, 5], [3, 5]]
    ) == 3

    assert s.intersectionSizeTwo(
        [[1, 2], [2, 3], [2, 4], [4, 5]]
    ) == 5

    assert s.intersectionSizeTwo(
        [[1, 2]]
    ) == 2

    assert s.intersectionSizeTwo(
        [[1, 5], [2, 5], [3, 5]]
    ) == 2

    assert s.intersectionSizeTwo(
        [[1, 10], [2, 9], [3, 8], [4, 7]]
    ) == 2

    print("all tests passed")

run_tests()
```

Test meaning:

| Test | Why |
|---|---|
| Main example | Standard overlapping intervals |
| Sparse overlap case | Forces many insertions |
| Single interval | Must choose exactly two numbers |
| Same ending point | Confirms sorting tie-break rule |
| Nested intervals | Two numbers can satisfy all intervals |