A clear explanation of counting unique substrings that appear in the infinite alphabet wraparound string using dynamic programming by ending character.
Problem Restatement
We are given a string s.
There is also an infinite wraparound base string:
"...zabcdefghijklmnopqrstuvwxyzabcdefghijklmnopqrstuvwxyzabcd..."We need to count how many different non-empty substrings of s also appear in this infinite base string. The official problem asks for the number of unique non-empty substrings of s that are present in base.
A substring appears in the base string if its characters are consecutive in alphabet order, with wraparound from z to a.
For example:
"abc" # valid
"zab" # valid
"za" # valid
"ac" # invalid
"ba" # invalidInput and Output
| Item | Meaning |
|---|---|
| Input | A lowercase English string s |
| Output | Number of unique non-empty substrings of s that appear in the wraparound base |
| Valid sequence | Consecutive alphabet characters |
| Wraparound | z can be followed by a |
Function shape:
def findSubstringInWraproundString(s: str) -> int:
...Examples
Example 1:
s = "a"The valid unique substrings are:
"a"Answer:
1Example 2:
s = "cac"Substrings are:
"c"
"a"
"c"
"ca"
"ac"
"cac"Only "c" and "a" appear in the wraparound base.
"ca" is invalid because a does not come after c.
"ac" is invalid because c does not come immediately after a.
Answer:
2Example 3:
s = "zab"Valid unique substrings are:
"z"
"a"
"b"
"za"
"ab"
"zab"Answer:
6First Thought: Generate All Substrings
A direct idea is to generate every substring of s.
For each substring, check whether it follows the wraparound rule.
Use a set to avoid counting duplicates.
class Solution:
def findSubstringInWraproundString(self, s: str) -> int:
seen = set()
n = len(s)
for left in range(n):
for right in range(left, n):
if right > left:
prev = s[right - 1]
curr = s[right]
if (ord(curr) - ord(prev)) % 26 != 1:
break
seen.add(s[left:right + 1])
return len(seen)This works, but it creates many substrings.
Problem With Brute Force
A string of length n has:
n * (n + 1) / 2substrings.
Creating and storing them can cost O(n^2) or worse because substring creation also takes time.
We need to count unique valid substrings without storing every substring.
Key Insight
Every valid substring in the wraparound base is determined by two things:
- Its ending character.
- Its length.
For example, valid substrings ending with c are:
"c"
"bc"
"abc"
"zabc"If the longest valid substring ending with c has length 4, then it automatically includes exactly 4 unique valid substrings ending with c.
So we store:
best[ch] = longest valid wraparound substring ending with chThen the answer is:
sum(best.values())Why this avoids duplicates:
If we have already seen a valid substring ending with c of length 3, and later see another valid substring ending with c of length 2, that shorter one is already included.
Only the maximum length for each ending character matters.
Consecutive Character Rule
Two adjacent characters continue a valid wraparound sequence if:
(ord(curr) - ord(prev)) % 26 == 1Examples:
"a" -> "b": (98 - 97) % 26 = 1
"b" -> "c": (99 - 98) % 26 = 1
"z" -> "a": (97 - 122) % 26 = 1The modulo handles the wraparound from z to a.
Algorithm
Initialize an array of length 26:
best = [0] * 26Also keep:
length = 0where length is the current length of the valid consecutive run ending at the current character.
Scan the string from left to right.
For each index i:
- If
i > 0ands[i]followss[i - 1]in wraparound order, incrementlength. - Otherwise, reset
length = 1. - Let
idxbe the alphabet index ofs[i]. - Update:
best[idx] = max(best[idx], length)Return:
sum(best)Walkthrough
Use:
s = "zab"Start:
best = [0] * 26
length = 0At z:
length = 1
best["z"] = 1This counts:
"z"At a:
"z" -> "a" is valid
length = 2
best["a"] = 2This counts substrings ending with a:
"a"
"za"At b:
"a" -> "b" is valid
length = 3
best["b"] = 3This counts substrings ending with b:
"b"
"ab"
"zab"Total:
best["z"] + best["a"] + best["b"] = 1 + 2 + 3 = 6Correctness
At each position, length is the length of the longest valid wraparound substring ending at that position.
If the current character follows the previous character in wraparound order, every valid substring ending at the previous character can be extended by the current character, so length increases by one.
If it does not follow, no longer valid run can continue through this position, so the longest valid substring ending here has length 1.
For a fixed ending character ch, if the longest valid substring ending with ch has length L, then all shorter valid suffixes ending with ch also exist. These are exactly L unique substrings ending with ch.
Keeping only the maximum L for each ending character counts each unique valid substring once and avoids duplicates.
Therefore, summing the maximum lengths over all ending characters gives the number of unique non-empty substrings of s that appear in the wraparound base.
Complexity
Let n = len(s).
| Metric | Value | Why |
|---|---|---|
| Time | O(n) | We scan the string once |
| Space | O(1) | The array has fixed size 26 |
Implementation
class Solution:
def findSubstringInWraproundString(self, s: str) -> int:
best = [0] * 26
length = 0
for i, ch in enumerate(s):
if i > 0 and (ord(ch) - ord(s[i - 1])) % 26 == 1:
length += 1
else:
length = 1
idx = ord(ch) - ord("a")
best[idx] = max(best[idx], length)
return sum(best)Code Explanation
We use:
best = [0] * 26best[0] is for substrings ending with a.
best[1] is for substrings ending with b.
And so on.
The variable:
length = 0tracks the current valid consecutive run length.
The condition:
(ord(ch) - ord(s[i - 1])) % 26 == 1checks whether the current character follows the previous character in the infinite wraparound alphabet.
If yes:
length += 1Otherwise:
length = 1because every single character is valid by itself.
Then:
best[idx] = max(best[idx], length)records the longest valid substring ending with this character.
Finally:
return sum(best)adds the number of unique valid substrings for all possible ending characters.
Testing
def run_tests():
s = Solution()
assert s.findSubstringInWraproundString("a") == 1
assert s.findSubstringInWraproundString("cac") == 2
assert s.findSubstringInWraproundString("zab") == 6
assert s.findSubstringInWraproundString("abcdefghijklmnopqrstuvwxyz") == 351
assert s.findSubstringInWraproundString("za") == 3
assert s.findSubstringInWraproundString("zzz") == 1
assert s.findSubstringInWraproundString("abcabc") == 6
print("all tests passed")
run_tests()Test meaning:
| Test | Why |
|---|---|
"a" | Single valid substring |
"cac" | Only single-character substrings count |
"zab" | Wraparound from z to a |
| Alphabet string | Long straight consecutive run |
"za" | Small wraparound case |
"zzz" | Duplicate single-character substring only |
"abcabc" | Repeated valid substrings should not be double-counted |