A detailed guide to solving Minimum Window Substring with a sliding window and frequency counters.
Problem Restatement
We are given two strings:
s: str
t: strWe need to return the shortest substring of s that contains every character from t.
Characters must be counted with duplicates.
So if:
t = "AABC"then a valid window must contain:
A: 2
B: 1
C: 1If no substring of s can contain all characters from t, return:
""The official problem states that s and t have lengths m and n, both up to 10^5, and the answer is unique if it exists.
Input and Output
| Item | Meaning |
|---|---|
| Input | Two strings s and t |
| Output | The minimum-length substring of s containing all characters from t |
| Valid window | A contiguous substring of s whose character counts cover t |
| Failure case | Return "" if no valid window exists |
Function shape:
class Solution:
def minWindow(self, s: str, t: str) -> str:
...Examples
Example 1:
s = "ADOBECODEBANC"
t = "ABC"We need a substring containing A, B, and C.
One valid substring is:
"ADOBEC"But it is not the shortest.
Later, we find:
"BANC"This contains B, A, and C, and no shorter valid substring exists.
So the answer is:
"BANC"Example 2:
s = "a"
t = "a"The whole string already contains the target character.
Answer:
"a"Example 3:
s = "a"
t = "aa"The target needs two a characters.
The string s only has one a.
Answer:
""First Thought: Brute Force
A direct solution is to try every substring of s.
For every left index l, we try every right index r, then check whether:
s[l:r + 1]contains all characters from t.
Code shape:
from collections import Counter
class Solution:
def minWindow(self, s: str, t: str) -> str:
need = Counter(t)
best = ""
for l in range(len(s)):
for r in range(l, len(s)):
window = Counter(s[l:r + 1])
valid = True
for ch in need:
if window[ch] < need[ch]:
valid = False
break
if valid:
candidate = s[l:r + 1]
if best == "" or len(candidate) < len(best):
best = candidate
return bestThis checks the right idea, but it is too slow.
Problem With Brute Force
There are O(m^2) substrings of s.
If we rebuild a counter for each substring, checking one substring can take O(m) time.
That can lead to:
O(m^3)Even if we optimize the counting, checking all substrings is still too much for m <= 10^5.
We need to reuse work between neighboring substrings.
Key Insight
The substring must be contiguous.
That points to a sliding window.
We keep a window:
s[left:right + 1]The window has two actions:
- Expand
rightto include more characters. - Move
leftforward to shrink the window.
The important rule:
When the window is missing required characters, expand it.
When the window is valid, shrink it as much as possible while keeping it valid.
This works because once a window becomes valid, adding more characters on the right keeps it valid, but may make it longer. So we should immediately try to remove useless characters from the left.
Counting What Is Still Missing
First, count the characters required by t.
need = Counter(t)Then maintain another counter for the current window.
window = {}We also maintain:
formedThis counts how many distinct required characters currently have enough frequency in the window.
For example:
t = "AABC"Then:
need = {
"A": 2,
"B": 1,
"C": 1,
}The number of distinct requirements is:
required = 3A window is valid when:
formed == requiredThat means every required character has enough count.
Algorithm
Create need = Counter(t).
Create an empty window counter.
Set:
required = len(need)
formed = 0
left = 0
best_len = infinity
best_left = 0Then move right from left to right through s.
For each character s[right]:
- Add it to
window. - If this character is required and its window count just reached the needed count, increase
formed. - While
formed == required, the current window is valid. - Record the window if it is smaller than the best answer.
- Remove
s[left]from the window and moveleftforward. - If removing
s[left]makes a required character insufficient, decreaseformed.
At the end, return the best window if one was found.
Correctness
The algorithm only records a window when formed == required.
That condition means every character required by t appears in the current window with at least the required frequency. Therefore every recorded window is valid.
Now consider the minimum valid window in the string. Let it be:
s[L:R + 1]When the right pointer reaches R, all characters from this target window are inside the current sliding window, unless the left pointer has already moved past L.
The left pointer only moves when the current window is valid. It stops moving once removing more characters would make the window invalid.
So when right == R, the algorithm shrinks the left side as far as it can while preserving validity. That process considers the shortest valid window ending at R.
Since the algorithm does this for every possible right, it considers the shortest valid window ending at every position. The global minimum among those recorded windows is therefore the minimum window substring.
Complexity
Let:
m = len(s)
n = len(t)| Metric | Value | Why |
|---|---|---|
| Time | O(m + n) | Build counts from t, then each pointer moves across s at most once |
| Space | O(n) | The counters store characters needed from t and characters seen in the window |
Since s and t contain uppercase and lowercase English letters, the practical character set is small. Still, O(n) is the safe general bound.
Implementation
from collections import Counter, defaultdict
class Solution:
def minWindow(self, s: str, t: str) -> str:
need = Counter(t)
window = defaultdict(int)
required = len(need)
formed = 0
left = 0
best_len = float("inf")
best_left = 0
for right, ch in enumerate(s):
window[ch] += 1
if ch in need and window[ch] == need[ch]:
formed += 1
while formed == required:
current_len = right - left + 1
if current_len < best_len:
best_len = current_len
best_left = left
left_ch = s[left]
window[left_ch] -= 1
if left_ch in need and window[left_ch] < need[left_ch]:
formed -= 1
left += 1
if best_len == float("inf"):
return ""
return s[best_left:best_left + best_len]Code Explanation
We count the target characters first:
need = Counter(t)For:
t = "ABC"we get:
{
"A": 1,
"B": 1,
"C": 1,
}For:
t = "AABC"we get:
{
"A": 2,
"B": 1,
"C": 1,
}Then we count the current window:
window = defaultdict(int)The variable required means how many distinct characters must be satisfied.
required = len(need)The variable formed means how many of those distinct characters are currently satisfied.
formed = 0When we add a character on the right:
window[ch] += 1we check whether this character has just reached its required count:
if ch in need and window[ch] == need[ch]:
formed += 1The equality check is important.
If need["A"] == 2, then the second A makes A satisfied.
The third A should not increase formed again.
When all required character counts are satisfied, the window is valid:
while formed == required:Inside this loop, we first record the current window if it is the best so far:
current_len = right - left + 1
if current_len < best_len:
best_len = current_len
best_left = leftThen we try to shrink the window from the left:
left_ch = s[left]
window[left_ch] -= 1If removing this character makes the count too small, the window becomes invalid:
if left_ch in need and window[left_ch] < need[left_ch]:
formed -= 1Then we move the left pointer:
left += 1At the end, if we never found a valid window, return an empty string:
if best_len == float("inf"):
return ""Otherwise, return the saved best substring:
return s[best_left:best_left + best_len]Testing
def run_tests():
s = Solution()
assert s.minWindow("ADOBECODEBANC", "ABC") == "BANC"
assert s.minWindow("a", "a") == "a"
assert s.minWindow("a", "aa") == ""
assert s.minWindow("AAABBC", "AABC") == "AABBC"
assert s.minWindow("ab", "b") == "b"
assert s.minWindow("bba", "ab") == "ba"
assert s.minWindow("abc", "d") == ""
assert s.minWindow("aaaaaaaaaaaabbbbbcdd", "abcdd") == "abbbbbcdd"
print("all tests passed")
run_tests()Test meaning:
| Test | Why |
|---|---|
"ADOBECODEBANC", "ABC" | Main example |
"a", "a" | Smallest valid input |
"a", "aa" | Duplicate requirement cannot be satisfied |
"AAABBC", "AABC" | Duplicate character requirement |
"ab", "b" | Answer at the end |
"bba", "ab" | Window found after skipping extra character |
"abc", "d" | No valid window |
| Long repeated string | Checks shrinking after many duplicate characters |