Skip to content

LeetCode 159: Longest Substring with At Most Two Distinct Characters

A clear explanation of finding the longest substring with at most two distinct characters using a sliding window.

Problem Restatement

We are given a string s.

We need to return the length of the longest substring that contains at most two distinct characters.

A substring must be contiguous.

For example:

s = "eceba"

The substring:

"ece"

contains only two distinct characters:

"e", "c"

Its length is:

3

So the answer is:

3

LeetCode gives examples such as s = "eceba" with output 3, and s = "ccaabbb" with output 5. The constraints state that 1 <= s.length <= 10^5 and s consists of English letters.

Input and Output

ItemMeaning
InputA string s
OutputLength of the longest valid substring
Valid substringContains at most two distinct characters
Substring ruleCharacters must be contiguous

Example function shape:

def lengthOfLongestSubstringTwoDistinct(s: str) -> int:
    ...

Examples

Example 1:

s = "eceba"

Valid substrings with at most two distinct characters include:

"e"
"ec"
"ece"

The longest one is:

"ece"

So the answer is:

3

Example 2:

s = "ccaabbb"

The longest valid substring is:

"aabbb"

It contains only:

"a", "b"

So the answer is:

5

Example 3:

s = "aaaa"

The whole string contains only one distinct character.

So the answer is:

4

First Thought: Brute Force

The simplest solution is to check every substring.

For each starting index i, expand the ending index j.

Track the set of distinct characters. If the set size becomes greater than 2, stop expanding.

class Solution:
    def lengthOfLongestSubstringTwoDistinct(self, s: str) -> int:
        n = len(s)
        best = 0

        for i in range(n):
            seen = set()

            for j in range(i, n):
                seen.add(s[j])

                if len(seen) > 2:
                    break

                best = max(best, j - i + 1)

        return best

This works, but it can be too slow.

Problem With Brute Force

For each starting position, we may scan many characters to the right.

In the worst case, this gives O(n^2) time.

Since s.length can be as large as 10^5, quadratic time is too large.

We need to reuse work between overlapping substrings.

Key Insight

This is a sliding window problem.

We keep a window:

s[left : right + 1]

The window should always contain at most two distinct characters.

We expand the window by moving right.

If the window becomes invalid, meaning it has more than two distinct characters, we shrink it by moving left.

To know how many distinct characters are in the window, we store character counts in a hash map.

Algorithm

Initialize:

left = 0
count = {}
answer = 0

Then scan the string with right.

For each character s[right]:

  1. Add it to the frequency map.
  2. While the map has more than two keys, shrink from the left.
  3. Update the best length using the current valid window.

The current window length is:

right - left + 1

Walkthrough

Use:

s = "eceba"

Start:

left = 0
count = {}
answer = 0

Read e:

window = "e"
count = {"e": 1}
answer = 1

Read c:

window = "ec"
count = {"e": 1, "c": 1}
answer = 2

Read e:

window = "ece"
count = {"e": 2, "c": 1}
answer = 3

Read b:

window = "eceb"
count = {"e": 2, "c": 1, "b": 1}

Now the window has three distinct characters, so shrink from the left.

Remove the first e:

window = "ceb"
count = {"e": 1, "c": 1, "b": 1}

Still three distinct characters.

Remove c:

window = "eb"
count = {"e": 1, "b": 1}

Now the window is valid again.

Read a:

window = "eba"
count = {"e": 1, "b": 1, "a": 1}

Shrink until only two distinct characters remain.

The best length remains:

3

Correctness

The algorithm maintains this invariant:

At the end of each loop iteration, the window s[left : right + 1] contains at most two distinct characters.

When a new character is added, the window may become invalid. The inner loop moves left to the right and decreases character counts until only two distinct characters remain. Therefore, after shrinking, the invariant is restored.

For each fixed right, after the shrinking step, left is the smallest valid left boundary for a window ending at right. That means right - left + 1 is the longest valid substring ending at right.

The algorithm checks this best valid window for every possible right.

Since every substring ends at some index right, taking the maximum over all iterations gives the length of the longest substring with at most two distinct characters.

Complexity

Let n be the length of s.

MetricValueWhy
TimeO(n)Each character enters and leaves the window at most once
SpaceO(1)At most three character counts are stored before shrinking

Although the hash map is used, the problem limits the window to at most two distinct characters after shrinking.

Implementation

class Solution:
    def lengthOfLongestSubstringTwoDistinct(self, s: str) -> int:
        left = 0
        count = {}
        answer = 0

        for right, ch in enumerate(s):
            count[ch] = count.get(ch, 0) + 1

            while len(count) > 2:
                left_ch = s[left]
                count[left_ch] -= 1

                if count[left_ch] == 0:
                    del count[left_ch]

                left += 1

            answer = max(answer, right - left + 1)

        return answer

Code Explanation

We start the window at index 0:

left = 0

The dictionary stores how many times each character appears in the current window:

count = {}

We expand the window one character at a time:

for right, ch in enumerate(s):

Add the new character:

count[ch] = count.get(ch, 0) + 1

If there are more than two distinct characters:

while len(count) > 2:

remove characters from the left side of the window:

left_ch = s[left]
count[left_ch] -= 1

If a character count becomes zero, remove it from the map:

if count[left_ch] == 0:
    del count[left_ch]

Then move the left pointer:

left += 1

Now the window is valid again, so update the answer:

answer = max(answer, right - left + 1)

Testing

def run_tests():
    sol = Solution()

    assert sol.lengthOfLongestSubstringTwoDistinct("eceba") == 3
    assert sol.lengthOfLongestSubstringTwoDistinct("ccaabbb") == 5
    assert sol.lengthOfLongestSubstringTwoDistinct("aaaa") == 4
    assert sol.lengthOfLongestSubstringTwoDistinct("ab") == 2
    assert sol.lengthOfLongestSubstringTwoDistinct("abc") == 2
    assert sol.lengthOfLongestSubstringTwoDistinct("abaccc") == 4
    assert sol.lengthOfLongestSubstringTwoDistinct("aabbcc") == 4

    print("all tests passed")

run_tests()
TestWhy
"eceba"Standard example
"ccaabbb"Longest answer appears near the end
"aaaa"Only one distinct character
"ab"Entire string is valid
"abc"Must shrink after third character
"abaccc"Best window after shrinking
"aabbcc"Multiple possible windows