A clear explanation of Permutation in String using a fixed-size sliding window and character frequency counts.
Problem Restatement
We are given two strings s1 and s2.
Return true if s2 contains a permutation of s1 as a substring. Otherwise, return false.
A permutation means the same characters with the same frequencies, but possibly in a different order.
So the problem is asking:
Does s2 contain any contiguous substring with exactly the same character counts as s1?
The official constraints are 1 <= s1.length, s2.length <= 10^4, and both strings consist of lowercase English letters.
Input and Output
| Item | Meaning |
|---|---|
| Input | Two strings s1 and s2 |
| Output | Boolean |
Return true when | Some substring of s2 is a permutation of s1 |
Return false when | No such substring exists |
| Character set | Lowercase English letters |
Example function shape:
def checkInclusion(s1: str, s2: str) -> bool:
...Examples
Example 1:
s1 = "ab"
s2 = "eidbaooo"The substring:
"ba"appears in s2.
It has the same characters as "ab".
So the answer is:
TrueExample 2:
s1 = "ab"
s2 = "eidboaoo"The substrings of length 2 are:
"ei", "id", "db", "bo", "oa", "ao", "oo"None of them has one "a" and one "b".
So the answer is:
FalseFirst Thought: Check Every Substring
A direct solution is:
- Generate every substring of
s2with lengthlen(s1). - Sort that substring.
- Sort
s1. - Compare them.
If the sorted strings are equal, the substring is a permutation of s1.
This works, but sorting every window is wasteful.
If m = len(s1) and n = len(s2), there are about n - m + 1 windows, and sorting each window costs O(m log m).
We can do better by maintaining character counts while the window slides.
Key Insight
A substring is a permutation of s1 exactly when it has the same frequency count for every character.
Since the strings contain only lowercase English letters, we can use arrays of length 26.
For example:
s1 = "ab"has:
count['a'] = 1
count['b'] = 1The substring "ba" has the same counts, so it is a valid permutation.
We only need to check windows of length len(s1) in s2.
Sliding Window Idea
Let:
m = len(s1)We keep a window of exactly m characters in s2.
As the window moves one step to the right:
- Add the new character entering the window.
- Remove the old character leaving the window.
- Compare the window frequency with the frequency of
s1.
If the two frequency arrays are equal, we found a permutation.
Algorithm
- If
len(s1) > len(s2), returnFalse. - Build a frequency array for
s1. - Build a frequency array for the first
len(s1)characters ofs2. - If the two arrays are equal, return
True. - Slide the window through the rest of
s2:- Add the incoming character.
- Remove the outgoing character.
- If the arrays are equal, return
True.
- Return
False.
Correctness
A substring of s2 can be a permutation of s1 only if it has the same length as s1.
The algorithm checks every substring of s2 with exactly that length by sliding a fixed-size window from left to right.
For each window, the algorithm maintains a frequency array that records exactly how many times each lowercase letter appears in that window.
The frequency array for s1 records exactly how many times each lowercase letter must appear.
If the two arrays are equal, then every character appears the same number of times in the current window as in s1. Therefore, the current window is a permutation of s1, so returning True is correct.
If the algorithm finishes without finding equal frequency arrays, then no length-len(s1) substring of s2 has the same character counts as s1. Therefore, no permutation of s1 appears in s2, so returning False is correct.
Complexity
Let n = len(s2).
| Metric | Value | Why |
|---|---|---|
| Time | O(n) | The window slides across s2 once, and each comparison checks 26 counts |
| Space | O(1) | The frequency arrays always have size 26 |
The factor 26 is constant.
Implementation
class Solution:
def checkInclusion(self, s1: str, s2: str) -> bool:
m = len(s1)
n = len(s2)
if m > n:
return False
need = [0] * 26
window = [0] * 26
for i in range(m):
need[ord(s1[i]) - ord("a")] += 1
window[ord(s2[i]) - ord("a")] += 1
if need == window:
return True
for right in range(m, n):
incoming = ord(s2[right]) - ord("a")
outgoing = ord(s2[right - m]) - ord("a")
window[incoming] += 1
window[outgoing] -= 1
if need == window:
return True
return FalseCode Explanation
We first handle the impossible case:
if m > n:
return FalseA longer string cannot appear as a permutation inside a shorter string.
Then we create two count arrays:
need = [0] * 26
window = [0] * 26need stores the character counts of s1.
window stores the character counts of the current window in s2.
We initialize both using the first m characters:
for i in range(m):
need[ord(s1[i]) - ord("a")] += 1
window[ord(s2[i]) - ord("a")] += 1Then we compare the initial window:
if need == window:
return TrueAfter that, we slide the window one character at a time.
The new character enters:
window[incoming] += 1The old character leaves:
window[outgoing] -= 1After each move, we compare the two arrays again.
Optimized Match Counter Version
The previous solution compares two arrays of length 26 at every step. That is already O(n) because 26 is constant.
We can also keep a matches counter to avoid repeated full-array comparisons.
class Solution:
def checkInclusion(self, s1: str, s2: str) -> bool:
m = len(s1)
n = len(s2)
if m > n:
return False
need = [0] * 26
window = [0] * 26
for i in range(m):
need[ord(s1[i]) - ord("a")] += 1
window[ord(s2[i]) - ord("a")] += 1
matches = 0
for i in range(26):
if need[i] == window[i]:
matches += 1
if matches == 26:
return True
for right in range(m, n):
incoming = ord(s2[right]) - ord("a")
outgoing = ord(s2[right - m]) - ord("a")
if need[incoming] == window[incoming]:
matches -= 1
window[incoming] += 1
if need[incoming] == window[incoming]:
matches += 1
if need[outgoing] == window[outgoing]:
matches -= 1
window[outgoing] -= 1
if need[outgoing] == window[outgoing]:
matches += 1
if matches == 26:
return True
return FalseThis version uses the same idea but tracks how many character counts currently match.
Testing
def run_tests():
s = Solution()
assert s.checkInclusion("ab", "eidbaooo") is True
assert s.checkInclusion("ab", "eidboaoo") is False
assert s.checkInclusion("adc", "dcda") is True
assert s.checkInclusion("a", "a") is True
assert s.checkInclusion("a", "b") is False
assert s.checkInclusion("hello", "ooolleoooleh") is False
assert s.checkInclusion("abc", "bbbca") is True
print("all tests passed")
run_tests()| Test | Why |
|---|---|
"ab", "eidbaooo" | Official sample with a valid permutation |
"ab", "eidboaoo" | Official sample with no valid permutation |
"adc", "dcda" | Match appears near the end |
"a", "a" | Single-character match |
"a", "b" | Single-character mismatch |
"hello", "ooolleoooleh" | Similar letters but no full permutation |
"abc", "bbbca" | Permutation appears as "bca" |