A clear explanation of Count Unique Characters of All Substrings using contribution counting with previous and next occurrences.
Problem Restatement
We are given a string s.
For any string t, define countUniqueChars(t) as the number of characters that appear exactly once in t.
We need to return the sum of countUniqueChars(t) over every substring t of s. The answer is guaranteed to fit in a 32-bit integer.
Input and Output
| Item | Meaning |
|---|---|
| Input | A string s |
| Output | Sum of unique-character counts over all substrings |
| Unique character | A character that appears exactly once in a substring |
| Substring | A contiguous part of s |
Function shape:
class Solution:
def uniqueLetterString(self, s: str) -> int:
...Examples
Example 1:
s = "ABC"All substrings are:
| Substring | Unique characters | Count |
|---|---|---|
"A" | A | 1 |
"B" | B | 1 |
"C" | C | 1 |
"AB" | A, B | 2 |
"BC" | B, C | 2 |
"ABC" | A, B, C | 3 |
Total:
1 + 1 + 1 + 2 + 2 + 3 = 10So the answer is:
10Example 2:
s = "ABA"All substrings are:
| Substring | Unique characters | Count |
|---|---|---|
"A" | A | 1 |
"B" | B | 1 |
"A" | A | 1 |
"AB" | A, B | 2 |
"BA" | B, A | 2 |
"ABA" | B | 1 |
Total:
1 + 1 + 1 + 2 + 2 + 1 = 8So the answer is:
8First Thought: Brute Force
The direct approach is to enumerate every substring and count how many characters appear exactly once.
from collections import Counter
class Solution:
def uniqueLetterString(self, s: str) -> int:
n = len(s)
total = 0
for left in range(n):
for right in range(left, n):
freq = Counter(s[left:right + 1])
for count in freq.values():
if count == 1:
total += 1
return totalThis is correct, but it is far too slow.
There are O(n^2) substrings. Counting characters inside each substring can take O(n) time.
So the total time complexity is:
O(n^3)Key Insight
Instead of asking:
“How many unique characters does each substring have?”
ask:
“How many substrings count this specific character occurrence as unique?”
Take one occurrence of a character at index i.
This occurrence contributes 1 to a substring exactly when that substring contains s[i], but does not contain another copy of the same character.
So we need to know:
| Value | Meaning |
|---|---|
prev | Previous index with the same character |
next | Next index with the same character |
Then this occurrence is unique in every substring whose left boundary is after prev and at or before i, and whose right boundary is at or after i and before next.
Number of choices for the left boundary:
i - prevNumber of choices for the right boundary:
next - iSo the contribution of index i is:
(i - prev) * (next - i)Example of Contribution
Use:
s = "ABA"Index 0 has character "A".
There is no previous "A", so:
prev = -1The next "A" is at index 2, so:
next = 2Contribution:
(0 - (-1)) * (2 - 0) = 1 * 2 = 2This first "A" is unique in:
"A"
"AB"Index 1 has character "B".
There is no previous "B" and no next "B".
So:
prev = -1
next = 3Contribution:
(1 - (-1)) * (3 - 1) = 2 * 2 = 4This "B" is unique in:
"B"
"AB"
"BA"
"ABA"Index 2 has character "A".
Previous "A" is at index 0.
There is no next "A".
So:
prev = 0
next = 3Contribution:
(2 - 0) * (3 - 2) = 2 * 1 = 2Total:
2 + 4 + 2 = 8Algorithm
Group positions by character.
For each character, store every index where it appears.
Add two sentinels:
| Sentinel | Meaning |
|---|---|
-1 | Boundary before the string starts |
n | Boundary after the string ends |
Then for each real occurrence at position positions[i]:
prev = positions[i - 1]
curr = positions[i]
next = positions[i + 1]Add:
(curr - prev) * (next - curr)to the answer.
Correctness
Consider one occurrence of a character at index curr.
Let prev be the previous occurrence of the same character, or -1 if none exists.
Let next be the next occurrence of the same character, or n if none exists.
For this occurrence to be unique in a substring, the substring must include curr.
Its left boundary can be any index from prev + 1 to curr.
That gives:
curr - prevchoices.
Its right boundary can be any index from curr to next - 1.
That gives:
next - currchoices.
Each pair of valid boundaries creates one substring where this occurrence appears exactly once.
No other substring should count this occurrence, because extending left to include prev or extending right to include next would make the character appear at least twice.
So each occurrence contributes exactly:
(curr - prev) * (next - curr)The algorithm sums this exact contribution over all character occurrences. Therefore, it returns the total number of unique-character appearances across all substrings, which is exactly the required answer.
Complexity
| Metric | Value | Why |
|---|---|---|
| Time | O(n) | Each index is stored once and processed once |
| Space | O(n) | Position lists store all indices |
Implementation
from collections import defaultdict
class Solution:
def uniqueLetterString(self, s: str) -> int:
n = len(s)
positions = defaultdict(list)
for i, ch in enumerate(s):
positions[ch].append(i)
total = 0
for indexes in positions.values():
indexes = [-1] + indexes + [n]
for i in range(1, len(indexes) - 1):
prev_index = indexes[i - 1]
curr_index = indexes[i]
next_index = indexes[i + 1]
total += (curr_index - prev_index) * (next_index - curr_index)
return totalCode Explanation
We first collect all positions of each character:
positions = defaultdict(list)
for i, ch in enumerate(s):
positions[ch].append(i)For example:
s = "ABA"gives:
{
"A": [0, 2],
"B": [1],
}Then for each character’s position list, we add boundary sentinels:
indexes = [-1] + indexes + [n]For "A" in "ABA":
[-1, 0, 2, 3]For each real occurrence, we compute how many substrings count it as unique:
total += (curr_index - prev_index) * (next_index - curr_index)This avoids building substrings entirely.
Testing
def run_tests():
s = Solution()
assert s.uniqueLetterString("ABC") == 10
assert s.uniqueLetterString("ABA") == 8
assert s.uniqueLetterString("LEETCODE") == 92
assert s.uniqueLetterString("A") == 1
assert s.uniqueLetterString("AA") == 2
assert s.uniqueLetterString("AAA") == 3
print("all tests passed")
run_tests()Test meaning:
| Test | Why |
|---|---|
"ABC" | All characters are distinct |
"ABA" | Same character appears on both ends |
"LEETCODE" | Standard larger example |
"A" | Minimum non-empty input |
"AA" | Same character twice |
"AAA" | Repeated character with overlapping substrings |