# LeetCode 340: Longest Substring with At Most K Distinct Characters

## Problem Restatement

We are given a string `s` and an integer `k`.

Return the length of the longest substring of `s` that contains at most `k` distinct characters.

A substring must be contiguous.

For example:

```text
s = "eceba"
k = 2
```

The substring `"ece"` contains only two distinct characters:

```text
e, c
```

Its length is `3`, so the answer is `3`.

## Input and Output

| Item | Meaning |
|---|---|
| Input | A string `s` and an integer `k` |
| Output | Length of the longest valid substring |
| Valid substring | Contains at most `k` distinct characters |
| Substring rule | Characters must be contiguous |

Function shape:

```python
def lengthOfLongestSubstringKDistinct(s: str, k: int) -> int:
    ...
```

## Examples

Example 1:

```text
Input: s = "eceba", k = 2
Output: 3
```

The longest valid substring is:

```text
"ece"
```

It contains two distinct characters and has length `3`.

Example 2:

```text
Input: s = "aa", k = 1
Output: 2
```

The whole string contains only one distinct character, so the answer is `2`.

Example 3:

```text
Input: s = "a", k = 0
Output: 0
```

No non-empty substring can contain at most zero distinct characters.

So the answer is `0`.

## First Thought: Try Every Substring

A direct method is to enumerate every substring.

For each starting index `i`, extend the ending index `j`, track distinct characters, and update the answer when the substring has at most `k` distinct characters.

```python
class Solution:
    def lengthOfLongestSubstringKDistinct(self, s: str, k: int) -> int:
        ans = 0

        for i in range(len(s)):
            seen = set()

            for j in range(i, len(s)):
                seen.add(s[j])

                if len(seen) <= k:
                    ans = max(ans, j - i + 1)
                else:
                    break

        return ans
```

This works, but in the worst case it still checks many substrings.

The time complexity is:

```text
O(n^2)
```

We need a linear scan.

## Key Insight

Use a sliding window.

Maintain a window:

```text
s[left : right + 1]
```

that contains at most `k` distinct characters.

Move `right` forward to include new characters.

If the window becomes invalid, meaning it has more than `k` distinct characters, move `left` forward until the window is valid again.

A hash map stores character counts inside the current window.

The number of keys in the hash map is the number of distinct characters.

## Algorithm

Use:

| Variable | Meaning |
|---|---|
| `left` | Left boundary of the window |
| `right` | Right boundary of the window |
| `count` | Character frequency map inside the window |
| `best` | Longest valid window length seen so far |

Steps:

1. If `k == 0`, return `0`.
2. Set `left = 0`.
3. Scan `right` from `0` to `len(s) - 1`.
4. Add `s[right]` to the frequency map.
5. While the map has more than `k` distinct characters:
   1. Remove `s[left]` from the map.
   2. If its count becomes zero, delete it.
   3. Move `left` right by one.
6. Update the answer with the current window length.

## Walkthrough

Use:

```text
s = "eceba"
k = 2
```

Start:

```text
left = 0
count = {}
best = 0
```

Read `e`:

```text
window = "e"
count = {e: 1}
best = 1
```

Read `c`:

```text
window = "ec"
count = {e: 1, c: 1}
best = 2
```

Read `e`:

```text
window = "ece"
count = {e: 2, c: 1}
best = 3
```

Read `b`:

```text
window = "eceb"
count = {e: 2, c: 1, b: 1}
```

Now there are three distinct characters, which is more than `k = 2`.

Shrink from the left.

Remove `e`:

```text
window = "ceb"
count = {e: 1, c: 1, b: 1}
```

Still three distinct characters.

Remove `c`:

```text
window = "eb"
count = {e: 1, b: 1}
```

Now valid again.

Read `a`:

```text
window = "eba"
count = {e: 1, b: 1, a: 1}
```

Shrink again until valid:

```text
window = "ba"
count = {b: 1, a: 1}
```

The best length found is `3`.

## Correctness

The algorithm maintains this invariant after the inner `while` loop:

```text
s[left : right + 1] contains at most k distinct characters.
```

When a new character is added at `right`, the window may become invalid.

The only way to make it valid again is to remove characters from the left, because `right` is fixed for this step and the substring must remain contiguous.

The algorithm keeps moving `left` until the number of distinct characters is at most `k`.

At that point, the current window is the longest valid substring ending at `right`, because moving `left` any farther right would only make the window shorter.

The algorithm checks every possible right endpoint and records the maximum valid window length. Therefore, the final answer is the length of the longest substring with at most `k` distinct characters.

## Complexity

| Metric | Value | Why |
|---|---|---|
| Time | `O(n)` | Each character enters and leaves the window at most once |
| Space | `O(k)` | The map stores at most `k + 1` distinct characters during shrinking |

Here, `n` is the length of `s`.

## Implementation

```python
from collections import defaultdict

class Solution:
    def lengthOfLongestSubstringKDistinct(self, s: str, k: int) -> int:
        if k == 0:
            return 0

        count = defaultdict(int)
        left = 0
        best = 0

        for right, ch in enumerate(s):
            count[ch] += 1

            while len(count) > k:
                left_ch = s[left]
                count[left_ch] -= 1

                if count[left_ch] == 0:
                    del count[left_ch]

                left += 1

            best = max(best, right - left + 1)

        return best
```

## Code Explanation

Handle the zero case first:

```python
if k == 0:
    return 0
```

No non-empty substring can have at most zero distinct characters.

The frequency map stores characters in the current window:

```python
count = defaultdict(int)
```

The left boundary starts at `0`:

```python
left = 0
```

For every right boundary:

```python
for right, ch in enumerate(s):
```

add the new character:

```python
count[ch] += 1
```

If the window has too many distinct characters, shrink it:

```python
while len(count) > k:
```

Remove the leftmost character from the window:

```python
left_ch = s[left]
count[left_ch] -= 1
```

If its count reaches zero, delete it from the map:

```python
if count[left_ch] == 0:
    del count[left_ch]
```

Then move the left boundary:

```python
left += 1
```

After the window is valid, update the best length:

```python
best = max(best, right - left + 1)
```

## Testing

```python
def run_tests():
    s = Solution()

    assert s.lengthOfLongestSubstringKDistinct("eceba", 2) == 3
    assert s.lengthOfLongestSubstringKDistinct("aa", 1) == 2
    assert s.lengthOfLongestSubstringKDistinct("a", 0) == 0

    assert s.lengthOfLongestSubstringKDistinct("abc", 2) == 2
    assert s.lengthOfLongestSubstringKDistinct("abaccc", 2) == 4
    assert s.lengthOfLongestSubstringKDistinct("aaaa", 1) == 4
    assert s.lengthOfLongestSubstringKDistinct("abcadcacacaca", 3) == 11

    print("all tests passed")

run_tests()
```

Test meaning:

| Test | Why |
|---|---|
| `"eceba"`, `2` | Standard sample |
| `"aa"`, `1` | Entire string is valid |
| `"a"`, `0` | Zero distinct characters allowed |
| `"abc"`, `2` | Needs shrinking |
| `"abaccc"`, `2` | Longest window appears after shrink |
| `"aaaa"`, `1` | Repeated single character |
| `"abcadcacacaca"`, `3` | Larger mixed window |

