LeetCode 467: Unique Substrings in Wraparound String

Problem Restatement

We are given a string s.

There is also an infinite wraparound base string:

"...zabcdefghijklmnopqrstuvwxyzabcdefghijklmnopqrstuvwxyzabcd..."

We need to count how many different non-empty substrings of s also appear in this infinite base string. The official problem asks for the number of unique non-empty substrings of s that are present in base.

A substring appears in the base string if its characters are consecutive in alphabet order, with wraparound from z to a.

For example:

"abc"   # valid
"zab"   # valid
"za"    # valid
"ac"    # invalid
"ba"    # invalid

Input and Output

Item	Meaning
Input	A lowercase English string `s`
Output	Number of unique non-empty substrings of `s` that appear in the wraparound base
Valid sequence	Consecutive alphabet characters
Wraparound	`z` can be followed by `a`

Function shape:

def findSubstringInWraproundString(s: str) -> int:
    ...

Examples

Example 1:

s = "a"

The valid unique substrings are:

"a"

Answer:

Example 2:

s = "cac"

Substrings are:

"c"
"a"
"c"
"ca"
"ac"
"cac"

Only "c" and "a" appear in the wraparound base.

"ca" is invalid because a does not come after c.

"ac" is invalid because c does not come immediately after a.

Answer:

Example 3:

s = "zab"

Valid unique substrings are:

"z"
"a"
"b"
"za"
"ab"
"zab"

Answer:

First Thought: Generate All Substrings

A direct idea is to generate every substring of s.

For each substring, check whether it follows the wraparound rule.

Use a set to avoid counting duplicates.

class Solution:
    def findSubstringInWraproundString(self, s: str) -> int:
        seen = set()
        n = len(s)

        for left in range(n):
            for right in range(left, n):
                if right > left:
                    prev = s[right - 1]
                    curr = s[right]

                    if (ord(curr) - ord(prev)) % 26 != 1:
                        break

                seen.add(s[left:right + 1])

        return len(seen)

This works, but it creates many substrings.

Problem With Brute Force

A string of length n has:

n * (n + 1) / 2

substrings.

Creating and storing them can cost O(n^2) or worse because substring creation also takes time.

We need to count unique valid substrings without storing every substring.

Key Insight

Every valid substring in the wraparound base is determined by two things:

Its ending character.
Its length.

For example, valid substrings ending with c are:

"c"
"bc"
"abc"
"zabc"

If the longest valid substring ending with c has length 4, then it automatically includes exactly 4 unique valid substrings ending with c.

So we store:

best[ch] = longest valid wraparound substring ending with ch

Then the answer is:

sum(best.values())

Why this avoids duplicates:

If we have already seen a valid substring ending with c of length 3, and later see another valid substring ending with c of length 2, that shorter one is already included.

Only the maximum length for each ending character matters.

Consecutive Character Rule

Two adjacent characters continue a valid wraparound sequence if:

(ord(curr) - ord(prev)) % 26 == 1

Examples:

"a" -> "b": (98 - 97) % 26 = 1
"b" -> "c": (99 - 98) % 26 = 1
"z" -> "a": (97 - 122) % 26 = 1

The modulo handles the wraparound from z to a.

Algorithm

Initialize an array of length 26:

best = [0] * 26

Also keep:

length = 0

where length is the current length of the valid consecutive run ending at the current character.

Scan the string from left to right.

For each index i:

If i > 0 and s[i] follows s[i - 1] in wraparound order, increment length.
Otherwise, reset length = 1.
Let idx be the alphabet index of s[i].
Update:

best[idx] = max(best[idx], length)

Return:

sum(best)

Walkthrough

Use:

s = "zab"

Start:

best = [0] * 26
length = 0

At z:

length = 1
best["z"] = 1

This counts:

"z"

At a:

"z" -> "a" is valid
length = 2
best["a"] = 2

This counts substrings ending with a:

"a"
"za"

At b:

"a" -> "b" is valid
length = 3
best["b"] = 3

This counts substrings ending with b:

"b"
"ab"
"zab"

Total:

best["z"] + best["a"] + best["b"] = 1 + 2 + 3 = 6

Correctness

At each position, length is the length of the longest valid wraparound substring ending at that position.

If the current character follows the previous character in wraparound order, every valid substring ending at the previous character can be extended by the current character, so length increases by one.

If it does not follow, no longer valid run can continue through this position, so the longest valid substring ending here has length 1.

For a fixed ending character ch, if the longest valid substring ending with ch has length L, then all shorter valid suffixes ending with ch also exist. These are exactly L unique substrings ending with ch.

Keeping only the maximum L for each ending character counts each unique valid substring once and avoids duplicates.

Therefore, summing the maximum lengths over all ending characters gives the number of unique non-empty substrings of s that appear in the wraparound base.

Complexity

Let n = len(s).

Metric	Value	Why
Time	`O(n)`	We scan the string once
Space	`O(1)`	The array has fixed size `26`

Implementation

class Solution:
    def findSubstringInWraproundString(self, s: str) -> int:
        best = [0] * 26
        length = 0

        for i, ch in enumerate(s):
            if i > 0 and (ord(ch) - ord(s[i - 1])) % 26 == 1:
                length += 1
            else:
                length = 1

            idx = ord(ch) - ord("a")
            best[idx] = max(best[idx], length)

        return sum(best)

Code Explanation

We use:

best = [0] * 26

best[0] is for substrings ending with a.

best[1] is for substrings ending with b.

And so on.

The variable:

length = 0

tracks the current valid consecutive run length.

The condition:

(ord(ch) - ord(s[i - 1])) % 26 == 1

checks whether the current character follows the previous character in the infinite wraparound alphabet.

If yes:

length += 1

Otherwise:

length = 1

because every single character is valid by itself.

Then:

best[idx] = max(best[idx], length)

records the longest valid substring ending with this character.

Finally:

return sum(best)

adds the number of unique valid substrings for all possible ending characters.

Testing

def run_tests():
    s = Solution()

    assert s.findSubstringInWraproundString("a") == 1
    assert s.findSubstringInWraproundString("cac") == 2
    assert s.findSubstringInWraproundString("zab") == 6
    assert s.findSubstringInWraproundString("abcdefghijklmnopqrstuvwxyz") == 351
    assert s.findSubstringInWraproundString("za") == 3
    assert s.findSubstringInWraproundString("zzz") == 1
    assert s.findSubstringInWraproundString("abcabc") == 6

    print("all tests passed")

run_tests()

Test meaning:

Test	Why
`"a"`	Single valid substring
`"cac"`	Only single-character substrings count
`"zab"`	Wraparound from `z` to `a`
Alphabet string	Long straight consecutive run
`"za"`	Small wraparound case
`"zzz"`	Duplicate single-character substring only
`"abcabc"`	Repeated valid substrings should not be double-counted