# LeetCode 992: Subarrays with K Different Integers

## Problem Restatement

We are given an integer array `nums` and an integer `k`.

A good subarray is a contiguous subarray that contains exactly `k` different integers.

We need to return the number of good subarrays.

For example:

```python
nums = [1, 2, 1, 2, 3]
k = 2
```

The good subarrays are:

```python
[1, 2]
[2, 1]
[1, 2]
[2, 3]
[1, 2, 1]
[2, 1, 2]
[1, 2, 1, 2]
```

So the answer is `7`.

The official constraints are `1 <= nums.length <= 2 * 10^4` and `1 <= nums[i], k <= nums.length`.

## Input and Output

| Item | Meaning |
|---|---|
| Input | Integer array `nums` and integer `k` |
| Output | Number of contiguous subarrays with exactly `k` distinct integers |
| Subarray | Must be contiguous |
| Distinct count | Count unique values inside the subarray |

Function shape:

```python
def subarraysWithKDistinct(nums: list[int], k: int) -> int:
    ...
```

## Examples

Example 1:

```python
nums = [1, 2, 1, 2, 3]
k = 2
```

The subarrays with exactly `2` distinct integers are:

```python
[1, 2]
[2, 1]
[1, 2]
[2, 3]
[1, 2, 1]
[2, 1, 2]
[1, 2, 1, 2]
```

Answer:

```python
7
```

Example 2:

```python
nums = [1, 2, 1, 3, 4]
k = 3
```

The good subarrays are:

```python
[1, 2, 1, 3]
[2, 1, 3]
[1, 3, 4]
```

Answer:

```python
3
```

## First Thought: Check Every Subarray

The direct solution is to enumerate every subarray.

For each starting index, extend the subarray to the right and keep a set of distinct values.

Whenever the set size is exactly `k`, increment the answer.

```python
class Solution:
    def subarraysWithKDistinct(self, nums: list[int], k: int) -> int:
        n = len(nums)
        answer = 0

        for left in range(n):
            distinct = set()

            for right in range(left, n):
                distinct.add(nums[right])

                if len(distinct) == k:
                    answer += 1
                elif len(distinct) > k:
                    break

        return answer
```

This works, but it can still be too slow.

## Problem With Brute Force

There are `O(n^2)` subarrays.

Since `nums.length` can be as large as `2 * 10^4`, checking all subarrays is too expensive.

We need a linear-time method.

## Key Insight

Counting subarrays with exactly `k` distinct integers is hard directly.

Counting subarrays with at most `k` distinct integers is much easier with a sliding window.

The key identity is:

```python
exactly(k) = at_most(k) - at_most(k - 1)
```

Why this works:

| Count | Includes subarrays with |
|---|---|
| `at_most(k)` | `1, 2, ..., k` distinct values |
| `at_most(k - 1)` | `1, 2, ..., k - 1` distinct values |

Subtracting removes all subarrays with fewer than `k` distinct values.

Only subarrays with exactly `k` distinct values remain.

## Counting Subarrays With At Most K Distinct Values

Use a sliding window `[left, right]`.

Maintain a frequency map for values inside the window.

When adding `nums[right]` makes the number of distinct values greater than `k`, move `left` until the window has at most `k` distinct values again.

For each `right`, once the window is valid, every subarray ending at `right` and starting between `left` and `right` is valid.

The number of such subarrays is:

```python
right - left + 1
```

## Algorithm

Define a helper:

```python
at_most(k)
```

It returns the number of subarrays with at most `k` distinct values.

Inside `at_most(k)`:

1. Initialize `left = 0`, `answer = 0`, and an empty frequency map.
2. Expand `right` from left to right.
3. Add `nums[right]` to the frequency map.
4. If the number of distinct values becomes greater than `k`, move `left` until valid again.
5. Add `right - left + 1` to `answer`.

Then return:

```python
at_most(k) - at_most(k - 1)
```

## Correctness

For `at_most(k)`, the sliding window maintains the invariant that the current window has at most `k` distinct values after shrinking.

For a fixed `right`, once the window `[left, right]` is valid, every subarray ending at `right` and starting at any index from `left` to `right` is also valid. Removing elements from the left cannot increase the number of distinct values.

There are exactly `right - left + 1` such subarrays, so the helper counts all valid subarrays ending at each `right`.

Every subarray has exactly one ending index, so `at_most(k)` counts every subarray with at most `k` distinct values exactly once.

The final answer subtracts the number of subarrays with at most `k - 1` distinct values from the number of subarrays with at most `k` distinct values. This leaves exactly the subarrays with `k` distinct values.

Therefore, the algorithm returns the correct count.

## Complexity

Let `n = len(nums)`.

| Metric | Value | Why |
|---|---:|---|
| Time | `O(n)` | Each `at_most` call moves each pointer at most `n` times |
| Space | `O(n)` | The frequency map can store up to `n` distinct values |

The helper runs twice, so the time is still linear.

## Implementation

```python
from collections import defaultdict

class Solution:
    def subarraysWithKDistinct(self, nums: list[int], k: int) -> int:
        def at_most(limit: int) -> int:
            if limit == 0:
                return 0

            freq = defaultdict(int)
            left = 0
            answer = 0
            distinct = 0

            for right, value in enumerate(nums):
                if freq[value] == 0:
                    distinct += 1

                freq[value] += 1

                while distinct > limit:
                    left_value = nums[left]
                    freq[left_value] -= 1

                    if freq[left_value] == 0:
                        distinct -= 1

                    left += 1

                answer += right - left + 1

            return answer

        return at_most(k) - at_most(k - 1)
```

## Code Explanation

The helper counts subarrays with at most `limit` distinct values:

```python
def at_most(limit: int) -> int:
```

When `limit == 0`, no non-empty subarray can be valid:

```python
if limit == 0:
    return 0
```

We use a frequency map:

```python
freq = defaultdict(int)
```

The variable `distinct` stores how many values currently have positive frequency in the window:

```python
distinct = 0
```

When a new value enters the window for the first time, increase `distinct`:

```python
if freq[value] == 0:
    distinct += 1

freq[value] += 1
```

If the window has too many distinct values, shrink from the left:

```python
while distinct > limit:
```

When a value's frequency becomes zero, it leaves the window completely:

```python
if freq[left_value] == 0:
    distinct -= 1
```

After the window is valid, all subarrays ending at `right` and starting from `left` through `right` are valid:

```python
answer += right - left + 1
```

Finally:

```python
return at_most(k) - at_most(k - 1)
```

keeps only subarrays with exactly `k` distinct values.

## Testing

```python
def run_tests():
    s = Solution()

    assert s.subarraysWithKDistinct([1, 2, 1, 2, 3], 2) == 7
    assert s.subarraysWithKDistinct([1, 2, 1, 3, 4], 3) == 3
    assert s.subarraysWithKDistinct([1, 2, 2], 1) == 4
    assert s.subarraysWithKDistinct([1, 1, 1], 1) == 6
    assert s.subarraysWithKDistinct([1, 2, 3], 3) == 1

    print("all tests passed")

run_tests()
```

| Test | Expected | Why |
|---|---:|---|
| `[1,2,1,2,3]`, `2` | `7` | Standard sample |
| `[1,2,1,3,4]`, `3` | `3` | Standard sample |
| `[1,2,2]`, `1` | `4` | Counts repeated-value subarrays |
| `[1,1,1]`, `1` | `6` | Every subarray is valid |
| `[1,2,3]`, `3` | `1` | Only the full array has three distinct values |