A clear explanation of checking whether one string is a subsequence of another using two pointers.
Problem Restatement
We are given two strings:
s
tWe need to return True if s is a subsequence of t.
A subsequence is formed by deleting zero or more characters from a string without changing the order of the remaining characters.
For example:
s = "abc"
t = "ahbgdc"The characters a, b, and c appear in t in the same order.
So s is a subsequence of t.
But:
s = "axc"
t = "ahbgdc"The character x does not appear in t, so the answer is False.
Input and Output
| Item | Meaning |
|---|---|
| Input | Two strings s and t |
| Output | True if s is a subsequence of t, otherwise False |
| Constraint | 0 <= s.length <= 100 |
| Constraint | 0 <= t.length <= 10^4 |
| Constraint | Both strings contain only lowercase English letters |
Example function shape:
def isSubsequence(s: str, t: str) -> bool:
...Examples
Example 1:
s = "abc"
t = "ahbgdc"We can match:
a -> b -> cinside t in order.
So the answer is:
TrueExample 2:
s = "axc"
t = "ahbgdc"We can match a.
But there is no x after that.
So the answer is:
FalseExample 3:
s = ""
t = "abc"The empty string is a subsequence of every string.
So the answer is:
TrueFirst Thought: Search Each Character From the Start
A simple but inefficient approach is to search each character of s inside t.
For example, to check "abc" inside "ahbgdc", we search for a, then b, then c.
The problem is that each search must begin after the previous matched position. If we restart from the beginning every time, we may match characters in the wrong order.
So we need to remember where we are in both strings.
That leads directly to two pointers.
Key Insight
Use one pointer for s and one pointer for t.
The pointer for s tells us which character we are trying to match next.
The pointer for t scans through the larger string.
When the characters match, we move the s pointer forward.
The t pointer always moves forward.
If we finish all characters in s, then s is a subsequence of t.
Algorithm
Initialize:
i = 0where i points to the next character of s that we need to match.
Then scan every character ch in t.
For each ch:
- If
iis still insidesands[i] == ch, moveiforward. - If
i == len(s), all characters were matched, so returnTrue.
After scanning t, return whether all characters of s were matched.
Correctness
The algorithm scans t from left to right.
The pointer i always represents the first unmatched character in s.
When t contains a character equal to s[i], the algorithm uses that character as the next match and moves i forward. This preserves order because t is scanned only forward.
When the characters do not match, skipping the current character of t is safe, because it cannot help match the current needed character.
If i == len(s), every character in s has been matched in order inside t, so s is a subsequence.
If the scan finishes while i < len(s), then some character of s could not be matched after the previous matches. Therefore s is not a subsequence.
Complexity
Let m = len(s) and n = len(t).
| Metric | Value | Why |
|---|---|---|
| Time | O(n) | We scan t once |
| Space | O(1) | We use only one pointer |
Implementation
class Solution:
def isSubsequence(self, s: str, t: str) -> bool:
i = 0
for ch in t:
if i < len(s) and s[i] == ch:
i += 1
if i == len(s):
return True
return i == len(s)Code Explanation
We start with the first character of s:
i = 0Then scan t:
for ch in t:If the current character of t matches the next needed character of s, consume that character:
if i < len(s) and s[i] == ch:
i += 1When all characters of s are consumed:
if i == len(s):
return TrueIf the loop ends, return whether all characters were matched:
return i == len(s)This also handles the empty string case correctly, because i == len(s) when s is empty.
Follow-Up: Many Queries Against the Same t
The follow-up asks what to do if there are many strings s1, s2, ..., sk, and we need to check each one against the same t.
In that case, scanning all of t for every query can be expensive.
A better approach is to preprocess t.
Store the positions of each character.
Example:
t = "ahbgdc"Positions:
{
"a": [0],
"b": [2],
"c": [5],
"d": [4],
"g": [3],
"h": [1],
}For each query s, keep the last matched index in t.
For every character in s, use binary search to find the first occurrence after the last matched index.
If every character can be found, s is a subsequence.
This makes each query roughly:
O(len(s) * log n)after preprocessing.
Testing
def test_solution():
sol = Solution()
assert sol.isSubsequence("abc", "ahbgdc") is True
assert sol.isSubsequence("axc", "ahbgdc") is False
assert sol.isSubsequence("", "abc") is True
assert sol.isSubsequence("abc", "") is False
assert sol.isSubsequence("", "") is True
assert sol.isSubsequence("ace", "abcde") is True
assert sol.isSubsequence("aec", "abcde") is False
assert sol.isSubsequence("aaaa", "aa") is False
assert sol.isSubsequence("aaa", "aaaaa") is True
print("all tests passed")
test_solution()Test meaning:
| Test | Why |
|---|---|
"abc", "ahbgdc" | Basic positive case |
"axc", "ahbgdc" | Missing required character |
Empty s | Empty string is always a subsequence |
Empty t | Non-empty s cannot be matched |
| Both empty | Empty string matches empty string |
"ace" | Match with skipped characters |
"aec" | Same letters but wrong order |
| Repeated characters | Confirms enough copies are required |