# LeetCode 395: Longest Substring with At Least K Repeating Characters

## Problem Restatement

We are given a string `s` and an integer `k`.

We need to return the length of the longest substring where every character in that substring appears at least `k` times.

A substring must be contiguous.

If no valid substring exists, return `0`.

The string contains only lowercase English letters. The length of `s` is at most `10^4`, and `k` can be larger than the string length.

## Input and Output

| Item | Meaning |
|---|---|
| Input | A string `s` and an integer `k` |
| Output | Length of the longest valid substring |
| Valid substring | Every character in it appears at least `k` times |
| Constraint | `1 <= s.length <= 10^4` |
| Constraint | `s` contains only lowercase English letters |
| Constraint | `1 <= k <= 10^5` |

Example function shape:

```python
def longestSubstring(s: str, k: int) -> int:
    ...
```

## Examples

Example 1:

```python
s = "aaabb"
k = 3
```

The substring:

```python
"aaa"
```

is valid because `a` appears `3` times.

The full string `"aaabb"` is not valid because `b` appears only `2` times.

So the answer is:

```python
3
```

Example 2:

```python
s = "ababbc"
k = 2
```

The substring:

```python
"ababb"
```

has:

| Character | Count |
|---|---|
| `a` | `2` |
| `b` | `3` |

Both counts are at least `2`.

So the answer is:

```python
5
```

## First Thought: Check Every Substring

A direct solution is to try every substring.

For each start index and end index, count the characters inside that substring.

If every character count is at least `k`, update the answer.

This is correct, but too slow.

There are `O(n^2)` substrings, and checking each substring may cost up to `O(n)`.

That gives `O(n^3)` time in the simplest form.

We need to use the frequency condition more directly.

## Key Insight

If a character appears fewer than `k` times in a substring, then any larger valid substring cannot include that character.

For example:

```python
s = "aaabb"
k = 3
```

In the whole string, `b` appears only `2` times.

So no valid substring that contains `b` can exist, because even the whole string does not have enough `b`.

Therefore, `b` acts as a separator.

We can split around bad characters and solve the smaller pieces.

For `"aaabb"`:

```text
aaa | bb
```

Only `"aaa"` can be valid.

This is the divide-and-conquer idea.

## Algorithm

Define a helper function:

```python
solve(left, right)
```

It returns the answer for the substring:

```python
s[left:right]
```

For each call:

1. Count character frequencies inside `s[left:right]`.
2. Find any character whose frequency is greater than `0` but less than `k`.
3. Such a character cannot be part of any valid substring in this range.
4. Split the range around that character.
5. Recursively solve each segment.
6. Return the maximum answer among those segments.
7. If no bad character exists, the whole range is valid, so return `right - left`.

## Correctness

Consider one recursive range `s[left:right]`.

If every character appearing in this range appears at least `k` times, then the whole substring is valid. Returning its length is correct.

Otherwise, suppose character `c` appears in this range fewer than `k` times.

Any substring inside `s[left:right]` that contains `c` also contains at most that many copies of `c`. Since that count is less than `k`, such a substring cannot be valid.

Therefore every valid substring in this range must lie completely between occurrences of `c`.

The algorithm splits around `c` and recursively checks exactly those candidate segments.

By applying the same reasoning to each segment, the recursion never discards a possible valid answer and never accepts an invalid substring.

So the maximum returned by the recursive calls is exactly the length of the longest valid substring.

## Complexity

Let `n = len(s)`.

| Metric | Value | Why |
|---|---|---|
| Time | `O(26 * n)` average in practice | Each level counts lowercase letters over active segments |
| Worst case | `O(n^2)` | Repeated unbalanced splits can rescan large ranges |
| Space | `O(n)` | Recursion stack in the worst case |

Since the alphabet has only 26 lowercase letters, this approach is usually fast enough for the given constraints.

## Implementation

```python
class Solution:
    def longestSubstring(self, s: str, k: int) -> int:
        def solve(left: int, right: int) -> int:
            if right - left < k:
                return 0

            count = [0] * 26

            for i in range(left, right):
                idx = ord(s[i]) - ord("a")
                count[idx] += 1

            for i in range(left, right):
                idx = ord(s[i]) - ord("a")

                if 0 < count[idx] < k:
                    best = 0
                    start = left

                    while start < right:
                        while start < right and s[start] == s[i]:
                            start += 1

                        end = start

                        while end < right and s[end] != s[i]:
                            end += 1

                        best = max(best, solve(start, end))
                        start = end

                    return best

            return right - left

        return solve(0, len(s))
```

## Code Explanation

The helper solves one half-open range:

```python
def solve(left: int, right: int) -> int:
```

If the range length is smaller than `k`, it cannot contain any valid non-empty substring:

```python
if right - left < k:
    return 0
```

Then we count the characters in this range:

```python
count = [0] * 26

for i in range(left, right):
    idx = ord(s[i]) - ord("a")
    count[idx] += 1
```

Next, we search for a bad character:

```python
if 0 < count[idx] < k:
```

A bad character appears, but not enough times.

When we find one, we split the current range around that character.

The loop skips separators:

```python
while start < right and s[start] == s[i]:
    start += 1
```

Then it finds the next segment without that separator:

```python
while end < right and s[end] != s[i]:
    end += 1
```

Each segment is solved recursively:

```python
best = max(best, solve(start, end))
```

If no bad character exists, the whole range is valid:

```python
return right - left
```

## Alternative: Sliding Window Over Unique Counts

There is also an iterative `O(26 * n)` solution.

The idea is to try each possible number of distinct characters from `1` to `26`.

For each target number of distinct characters, use a sliding window and track:

| Variable | Meaning |
|---|---|
| `unique` | Number of distinct characters in the window |
| `at_least_k` | Number of distinct characters whose count is at least `k` |

When:

```python
unique == target_unique and unique == at_least_k
```

the current window is valid.

```python
class Solution:
    def longestSubstring(self, s: str, k: int) -> int:
        best = 0
        n = len(s)

        for target_unique in range(1, 27):
            count = [0] * 26
            left = 0
            unique = 0
            at_least_k = 0

            for right in range(n):
                idx = ord(s[right]) - ord("a")

                if count[idx] == 0:
                    unique += 1

                count[idx] += 1

                if count[idx] == k:
                    at_least_k += 1

                while unique > target_unique:
                    left_idx = ord(s[left]) - ord("a")

                    if count[left_idx] == k:
                        at_least_k -= 1

                    count[left_idx] -= 1

                    if count[left_idx] == 0:
                        unique -= 1

                    left += 1

                if unique == target_unique and unique == at_least_k:
                    best = max(best, right - left + 1)

        return best
```

## Testing

```python
def test_solution():
    s = Solution()

    assert s.longestSubstring("aaabb", 3) == 3
    assert s.longestSubstring("ababbc", 2) == 5
    assert s.longestSubstring("ababacb", 3) == 0
    assert s.longestSubstring("aaaaa", 1) == 5
    assert s.longestSubstring("aaaaa", 6) == 0
    assert s.longestSubstring("bbaaacbd", 3) == 3
    assert s.longestSubstring("weitong", 2) == 0

    print("all tests passed")

test_solution()
```

Test meaning:

| Test | Why |
|---|---|
| `"aaabb"`, `k = 3` | Basic split around under-repeated character |
| `"ababbc"`, `k = 2` | Main example |
| `"ababacb"`, `k = 3` | No valid substring |
| `"aaaaa"`, `k = 1` | Whole string is valid |
| `"aaaaa"`, `k = 6` | `k` larger than length |
| `"bbaaacbd"`, `k = 3` | Valid substring in the middle |
| `"weitong"`, `k = 2` | All characters appear once |

