# LeetCode 352: Data Stream as Disjoint Intervals

## Problem Restatement

We need to design a data structure called `SummaryRanges`.

It receives non-negative integers from a data stream, one number at a time.

After each insertion, it should be able to return all numbers seen so far as a sorted list of disjoint intervals.

For example, if the stream gives:

```python
1, 3, 7, 2, 6
```

Then the intervals change like this:

```python
after 1: [[1, 1]]
after 3: [[1, 1], [3, 3]]
after 7: [[1, 1], [3, 3], [7, 7]]
after 2: [[1, 3], [7, 7]]
after 6: [[1, 3], [6, 7]]
```

The number `2` connects `[1, 1]` and `[3, 3]`, so they become `[1, 3]`.

The number `6` extends `[7, 7]` to `[6, 7]`.

## Input and Output

| Method | Meaning |
|---|---|
| `SummaryRanges()` | Initializes an empty stream |
| `addNum(value)` | Adds one integer to the stream |
| `getIntervals()` | Returns the current disjoint intervals sorted by start |

The output of `getIntervals()` is a list of intervals:

```python
[[start1, end1], [start2, end2], ...]
```

Each interval is inclusive.

## Examples

Start with an empty object:

```python
summaryRanges = SummaryRanges()
```

Add `1`:

```python
summaryRanges.addNum(1)
summaryRanges.getIntervals()
```

Output:

```python
[[1, 1]]
```

Add `3`:

```python
summaryRanges.addNum(3)
summaryRanges.getIntervals()
```

Output:

```python
[[1, 1], [3, 3]]
```

Add `7`:

```python
summaryRanges.addNum(7)
summaryRanges.getIntervals()
```

Output:

```python
[[1, 1], [3, 3], [7, 7]]
```

Add `2`:

```python
summaryRanges.addNum(2)
summaryRanges.getIntervals()
```

Output:

```python
[[1, 3], [7, 7]]
```

Add `6`:

```python
summaryRanges.addNum(6)
summaryRanges.getIntervals()
```

Output:

```python
[[1, 3], [6, 7]]
```

## First Thought: Store All Numbers

The simplest design is to store every number in a set.

When `getIntervals()` is called:

1. Sort all numbers.
2. Scan them from left to right.
3. Merge consecutive numbers into intervals.

Code idea:

```python
class SummaryRanges:
    def __init__(self):
        self.values = set()

    def addNum(self, value: int) -> None:
        self.values.add(value)

    def getIntervals(self) -> list[list[int]]:
        nums = sorted(self.values)
        result = []

        for x in nums:
            if not result or result[-1][1] + 1 < x:
                result.append([x, x])
            else:
                result[-1][1] = x

        return result
```

This is simple and correct.

But every call to `getIntervals()` sorts all numbers again.

If there are many queries, this becomes expensive.

## Problem With Brute Force

Suppose we have `k` unique numbers.

| Operation | Cost |
|---|---|
| `addNum` | `O(1)` average |
| `getIntervals` | `O(k log k)` |

This design delays the work until query time.

A better design maintains the intervals incrementally when each number is added.

Then `getIntervals()` can simply return the current interval list.

## Key Insight

The intervals are always sorted and disjoint.

When we insert a new value, it can only interact with the intervals immediately around its position.

There are four cases.

Case 1: the value is already inside an interval.

```python
intervals = [[1, 5]]
value = 3
```

Nothing changes.

Case 2: the value extends the left interval.

```python
intervals = [[1, 3]]
value = 4
```

Result:

```python
[[1, 4]]
```

Case 3: the value extends the right interval.

```python
intervals = [[5, 8]]
value = 4
```

Result:

```python
[[4, 8]]
```

Case 4: the value connects two intervals.

```python
intervals = [[1, 2], [4, 5]]
value = 3
```

Result:

```python
[[1, 5]]
```

So for every new value, we only need to find its insertion position among the current intervals.

## Algorithm

Maintain a sorted list:

```python
self.intervals
```

Each item is:

```python
[start, end]
```

For `addNum(value)`:

1. Use binary search to find the first interval whose start is greater than `value`.
2. Let `i` be that position.
3. The possible left neighbor is `i - 1`.
4. The possible right neighbor is `i`.
5. Check whether `value` is already inside the left interval.
6. Check whether `value` connects to the left interval.
7. Check whether `value` connects to the right interval.
8. Merge or insert as needed.

For `getIntervals()`:

Return the current intervals.

## Correctness

The data structure starts empty, so the interval list is initially sorted and disjoint.

Assume before adding a value, all intervals are sorted and disjoint.

Binary search finds where the value belongs by start position. Therefore, only the interval immediately before that position and the interval at that position can overlap with or touch the new value.

No earlier interval can touch the value because it ends before the left neighbor. No later interval can touch the value because it starts after the right neighbor.

The algorithm handles all valid possibilities:

If the value lies inside the left interval, the set of represented numbers stays the same.

If the value touches the left interval, the algorithm extends that interval.

If the value touches the right interval, the algorithm extends that interval.

If the value touches both, the algorithm merges both intervals through the new value.

If it touches neither, the algorithm inserts `[value, value]`.

After each case, the resulting intervals remain sorted and disjoint, and they represent exactly all numbers seen so far.

Therefore, every call to `getIntervals()` returns the correct summary.

## Complexity

Let `r` be the number of current intervals.

| Operation | Time | Space |
|---|---|---|
| `addNum` | `O(log r + r)` | `O(1)` extra |
| `getIntervals` | `O(r)` | `O(r)` if returning a copy |
| Storage | `O(r)` | Stores only intervals |

The binary search costs `O(log r)`.

Insertion or deletion in a Python list can cost `O(r)` because elements may shift.

In languages with a balanced tree map, `addNum` can be implemented in `O(log r)`.

## Implementation

```python
from bisect import bisect_left

class SummaryRanges:

    def __init__(self):
        self.intervals = []

    def addNum(self, value: int) -> None:
        intervals = self.intervals

        i = bisect_left(intervals, [value, value])

        left_touch = i > 0 and intervals[i - 1][1] + 1 >= value
        right_touch = i < len(intervals) and value + 1 >= intervals[i][0]

        if left_touch and right_touch:
            intervals[i - 1][1] = max(intervals[i - 1][1], intervals[i][1])
            intervals.pop(i)

        elif left_touch:
            intervals[i - 1][1] = max(intervals[i - 1][1], value)

        elif right_touch:
            intervals[i][0] = min(intervals[i][0], value)

        else:
            intervals.insert(i, [value, value])

    def getIntervals(self) -> list[list[int]]:
        return self.intervals
```

## Code Explanation

We keep intervals sorted by their start:

```python
self.intervals = []
```

For each new value, we find where `[value, value]` would be inserted:

```python
i = bisect_left(intervals, [value, value])
```

The left neighbor, if it exists, is:

```python
intervals[i - 1]
```

The right neighbor, if it exists, is:

```python
intervals[i]
```

The value touches or lies inside the left interval when:

```python
intervals[i - 1][1] + 1 >= value
```

This covers both cases:

```python
[1, 3] with value 4
[1, 5] with value 3
```

The value touches or lies before the right interval when:

```python
value + 1 >= intervals[i][0]
```

This covers:

```python
[5, 7] with value 4
```

If both sides touch, the value connects two intervals:

```python
[[1, 2], [4, 5]], value = 3
```

So we merge them:

```python
intervals[i - 1][1] = max(intervals[i - 1][1], intervals[i][1])
intervals.pop(i)
```

If only the left side touches, extend the left interval.

If only the right side touches, extend the right interval.

If neither side touches, insert a new single-value interval.

## Testing

```python
def run_tests():
    sr = SummaryRanges()

    assert sr.getIntervals() == []

    sr.addNum(1)
    assert sr.getIntervals() == [[1, 1]]

    sr.addNum(3)
    assert sr.getIntervals() == [[1, 1], [3, 3]]

    sr.addNum(7)
    assert sr.getIntervals() == [[1, 1], [3, 3], [7, 7]]

    sr.addNum(2)
    assert sr.getIntervals() == [[1, 3], [7, 7]]

    sr.addNum(6)
    assert sr.getIntervals() == [[1, 3], [6, 7]]

    sr.addNum(3)
    assert sr.getIntervals() == [[1, 3], [6, 7]]

    sr.addNum(4)
    assert sr.getIntervals() == [[1, 4], [6, 7]]

    sr.addNum(5)
    assert sr.getIntervals() == [[1, 7]]

    print("all tests passed")

run_tests()
```

Test meaning:

| Test | Why |
|---|---|
| Empty intervals | Checks initial state |
| Add isolated values | Creates separate intervals |
| Add bridge value | Merges two intervals |
| Add duplicate value | Should not change anything |
| Add value before interval | Extends right interval |
| Add final bridge | Merges everything into one interval |

