A clear explanation of counting distinct subsequences using dynamic programming.
Problem Restatement
We are given two strings:
s
tWe need to count how many distinct subsequences of s equal t.
A subsequence is formed by deleting zero or more characters without changing the order of the remaining characters.
The official problem asks for the number of distinct subsequences of s equal to t. (leetcode.com)
For example:
s = "rabbbit"
t = "rabbit"The answer is:
3because there are three different ways to delete one 'b' from "rabbbit" to obtain "rabbit".
Input and Output
| Item | Meaning |
|---|---|
| Input | Two strings s and t |
| Output | Number of distinct subsequences of s equal to t |
| Subsequence rule | Character order must stay the same |
| Deletion | Characters may be removed |
| Reordering | Not allowed |
The function shape is:
class Solution:
def numDistinct(self, s: str, t: str) -> int:
...Examples
Consider:
s = "rabbbit"
t = "rabbit"We need to remove exactly one 'b'.
Possible choices:
ra[b]bbit
rab[b]bit
rabb[b]itEach produces:
rabbitSo the answer is:
3Now consider:
s = "babgbag"
t = "bag"Possible subsequences:
ba[g]
ba[g]bag
[b]abgag
...The answer is:
5For:
s = "abc"
t = "abcd"the answer is:
0because we cannot build a longer string from a shorter one.
First Thought: Try Every Subsequence
One brute force approach is:
- Generate every subsequence of
s. - Count how many equal
t.
But a string of length n has:
possible subsequences.
That becomes impossible for large inputs.
We need a more structured way to count possibilities.
Key Insight
At each character in s, we have two choices:
- Skip the character.
- Use the character if it matches the current character in
t.
Suppose:
s = "babgbag"
t = "bag"If we are comparing:
s[i] = 'b'
t[j] = 'b'then:
- We may use this
'b'to matcht[j]. - Or skip it and try another
'b'later.
This naturally creates overlapping subproblems.
Dynamic programming helps us reuse those results.
Dynamic Programming Definition
Define:
dp[i][j]as:
The number of ways to form
t[:j]usings[:i].
That means:
- First
icharacters ofs - First
jcharacters oft
Base Cases
An empty target string can always be formed:
dp[i][0] = 1for every i.
There is exactly one way:
delete everythingA non-empty target cannot be formed from an empty source:
dp[0][j] = 0for every positive j.
Transition
If:
s[i - 1] == t[j - 1]then we have two choices:
- Use this matching character.
- Skip this character.
So:
If the characters do not match:
s[i - 1] != t[j - 1]then we can only skip the character from s:
Algorithm
Let:
m = len(s)
n = len(t)Create a DP table of size:
(m + 1) x (n + 1)Initialize:
dp[i][0] = 1Then fill the table row by row.
For each pair (i, j):
- If characters match:
- Count ways using the character.
- Count ways skipping the character.
- Otherwise:
- Only skip the character.
The final answer is:
dp[m][n]Correctness
The DP state dp[i][j] represents the number of ways to form the first j characters of t using the first i characters of s.
If the current characters match:
s[i - 1] == t[j - 1]then every valid subsequence falls into one of two categories:
- Subsequences that use this matching character.
- Subsequences that skip it.
The number of subsequences that use the character is:
dp[i - 1][j - 1]because the earlier parts must form t[:j-1].
The number of subsequences that skip the character is:
dp[i - 1][j]because we still need to form the same target prefix.
Adding these counts gives the total number of valid subsequences.
If the characters do not match, the current character in s cannot help form t[j-1]. Therefore, the only option is to skip it, giving:
dp[i - 1][j]The base cases are also correct:
- An empty target has exactly one subsequence.
- A non-empty target cannot be formed from an empty source.
Thus, the DP table correctly counts all distinct subsequences.
Complexity
| Metric | Value | Why |
|---|---|---|
| Time | O(mn) | Every DP state is computed once |
| Space | O(mn) | The DP table stores all states |
Here:
m = len(s)n = len(t)
Implementation
class Solution:
def numDistinct(self, s: str, t: str) -> int:
m = len(s)
n = len(t)
dp = [[0] * (n + 1) for _ in range(m + 1)]
for i in range(m + 1):
dp[i][0] = 1
for i in range(1, m + 1):
for j in range(1, n + 1):
if s[i - 1] == t[j - 1]:
dp[i][j] = dp[i - 1][j - 1] + dp[i - 1][j]
else:
dp[i][j] = dp[i - 1][j]
return dp[m][n]Code Explanation
Get the string lengths:
m = len(s)
n = len(t)Create the DP table:
dp = [[0] * (n + 1) for _ in range(m + 1)]Initialize the empty-target base case:
for i in range(m + 1):
dp[i][0] = 1Now fill the table:
for i in range(1, m + 1):
for j in range(1, n + 1):If the characters match:
if s[i - 1] == t[j - 1]:then either use or skip the character:
dp[i][j] = dp[i - 1][j - 1] + dp[i - 1][j]Otherwise, skip the character:
dp[i][j] = dp[i - 1][j]Finally:
return dp[m][n]Testing
class Solution:
def numDistinct(self, s: str, t: str) -> int:
m = len(s)
n = len(t)
dp = [[0] * (n + 1) for _ in range(m + 1)]
for i in range(m + 1):
dp[i][0] = 1
for i in range(1, m + 1):
for j in range(1, n + 1):
if s[i - 1] == t[j - 1]:
dp[i][j] = dp[i - 1][j - 1] + dp[i - 1][j]
else:
dp[i][j] = dp[i - 1][j]
return dp[m][n]
def run_tests():
s = Solution()
assert s.numDistinct("rabbbit", "rabbit") == 3
assert s.numDistinct("babgbag", "bag") == 5
assert s.numDistinct("abc", "abcd") == 0
assert s.numDistinct("abc", "") == 1
assert s.numDistinct("", "a") == 0
assert s.numDistinct("aaaaa", "aa") == 10
print("all tests passed")
run_tests()Test meaning:
| Test | Why |
|---|---|
"rabbbit" -> "rabbit" | Standard example |
"babgbag" -> "bag" | Multiple overlapping subsequences |
| Short source, longer target | Impossible case |
| Empty target | Exactly one subsequence |
| Empty source, non-empty target | Impossible case |
| Repeated characters | Confirms combinational counting |