A clear explanation of finding the first non-repeating character in a string using character frequency counting.
Problem Restatement
We are given a string s.
We need to find the first character that appears exactly once in the string and return its index.
If every character appears more than once, return -1.
The string contains only lowercase English letters, and its length can be up to 10^5.
Input and Output
| Item | Meaning |
|---|---|
| Input | A string s |
| Output | Index of the first non-repeating character |
Return -1 | When no unique character exists |
| Constraint | 1 <= s.length <= 10^5 |
| Constraint | s contains only lowercase English letters |
Example function shape:
def firstUniqChar(s: str) -> int:
...Examples
Example 1:
s = "leetcode"Character counts:
| Character | Count |
|---|---|
l | 1 |
e | 3 |
t | 1 |
c | 1 |
o | 1 |
d | 1 |
The first character with count 1 is l at index 0.
So the answer is:
0Example 2:
s = "loveleetcode"The first unique character is v.
It appears at index 2.
So the answer is:
2Example 3:
s = "aabb"Every character repeats.
So the answer is:
-1First Thought: Check Each Character Against the Whole String
A direct method is to examine each character and count how many times it appears.
For each index i, scan the whole string and count s[i].
If the count is 1, return i.
Code shape:
class Solution:
def firstUniqChar(self, s: str) -> int:
for i in range(len(s)):
count = 0
for ch in s:
if ch == s[i]:
count += 1
if count == 1:
return i
return -1This is correct, but it is too slow.
For a string of length n, this scans the string once for every character.
That gives O(n^2) time.
With n up to 10^5, this is not acceptable.
Key Insight
We need two pieces of information:
- How many times each character appears.
- Which unique character appears first in the original order.
Counting first solves the first part.
Scanning again from left to right solves the second part.
Because the string contains only lowercase English letters, we can use a fixed array of length 26.
Algorithm
Create an array:
count = [0] * 26First pass:
- Scan every character in
s. - Convert the character to an index from
0to25. - Increment its frequency.
Second pass:
- Scan
sfrom left to right. - For each character, check its frequency.
- Return the first index whose character count is
1.
If no such index exists, return -1.
Correctness
After the first pass, count stores the exact number of times each lowercase letter appears in s.
During the second pass, we inspect characters in increasing index order.
If count[idx] == 1, then the current character appears exactly once in the whole string. Since we scan from left to right, this is the first such character, so returning its index is correct.
If count[idx] > 1, the current character repeats, so it cannot be the answer.
If the second pass finishes without finding a character with count 1, then no character appears exactly once. Returning -1 is correct.
Complexity
Let n = len(s).
| Metric | Value | Why |
|---|---|---|
| Time | O(n) | We scan the string twice |
| Space | O(1) | The count array always has length 26 |
Implementation
class Solution:
def firstUniqChar(self, s: str) -> int:
count = [0] * 26
for ch in s:
idx = ord(ch) - ord("a")
count[idx] += 1
for i, ch in enumerate(s):
idx = ord(ch) - ord("a")
if count[idx] == 1:
return i
return -1Code Explanation
Create a fixed frequency table:
count = [0] * 26Count every character:
for ch in s:
idx = ord(ch) - ord("a")
count[idx] += 1The expression:
ord(ch) - ord("a")maps lowercase letters to array indices:
'a' -> 0
'b' -> 1
...
'z' -> 25Then scan the string again:
for i, ch in enumerate(s):The first time we find a character with count 1, return its index:
if count[idx] == 1:
return iIf there is no unique character:
return -1Testing
def test_solution():
s = Solution()
assert s.firstUniqChar("leetcode") == 0
assert s.firstUniqChar("loveleetcode") == 2
assert s.firstUniqChar("aabb") == -1
assert s.firstUniqChar("z") == 0
assert s.firstUniqChar("zz") == -1
assert s.firstUniqChar("aabbc") == 4
assert s.firstUniqChar("aabbcd") == 4
print("all tests passed")
test_solution()Test meaning:
| Test | Why |
|---|---|
"leetcode" | First character is unique |
"loveleetcode" | First two characters repeat, answer is later |
"aabb" | No unique character |
"z" | Single-character string |
"zz" | One repeated character |
"aabbc" | Unique character at the end |
"aabbcd" | Multiple unique characters, return the first one |