A clear explanation of validating a word abbreviation using two pointers and number parsing.
Problem Restatement
We are given a word and an abbreviation.
The abbreviation may contain lowercase letters and digits.
A digit sequence means we skip that many characters in the original word.
We must return whether the abbreviation correctly represents the word.
Numbers cannot have leading zeroes. For example, "01" is invalid.
The official problem describes abbreviations as replacing non-adjacent, non-empty substrings with their lengths. It gives examples such as "s10n" and "sub4u4" as valid abbreviations of "substitution", while "s010n" and "s0ubstitution" are invalid because of leading zero and zero-length replacement rules.
Input and Output
| Item | Meaning |
|---|---|
| Input | A string word and a string abbr |
| Output | True if abbr is valid for word, otherwise False |
| Letters | Must match the current character in word |
| Numbers | Skip that many characters in word |
| Leading zero | Invalid |
Example function shape:
def validWordAbbreviation(word: str, abbr: str) -> bool:
...Examples
Example 1:
word = "internationalization"
abbr = "i12iz4n"The answer is:
TrueExplanation:
i 12 iz 4 nmeans:
match "i"
skip 12 characters
match "i"
match "z"
skip 4 characters
match "n"This consumes the whole word correctly.
Example 2:
word = "apple"
abbr = "a2e"The answer is:
FalseExplanation:
a 2 emeans:
match "a"
skip 2 characters
match "e"After matching "a" and skipping "pp", the next character in the word is "l", not "e".
So the abbreviation is invalid.
First Thought: Expand the Abbreviation
One possible approach is to expand the abbreviation into a pattern and compare it with the word.
For example:
abbr = "i12iz4n"could mean:
i + 12 skipped characters + iz + 4 skipped characters + nBut we do not know the skipped characters from the abbreviation alone.
We only need to verify the abbreviation against the original word, so expansion is unnecessary.
Key Insight
Use two pointers.
One pointer walks through word.
The other pointer walks through abbr.
Let:
i = index in word
j = index in abbrWhen abbr[j] is a letter, it must match word[i].
When abbr[j] is a digit, parse the whole number and move i forward by that amount.
This directly simulates what the abbreviation means.
Algorithm
Initialize:
i = 0
j = 0Then process abbr from left to right.
While both pointers are within range:
- If
abbr[j]is a digit:- If it is
"0", returnFalse. - Parse the full number.
- Move
iforward by that number.
- If it is
- Otherwise:
- Check that
iis still insideword. - Check that
word[i] == abbr[j]. - Move both pointers forward.
- Check that
At the end, both strings must be fully consumed:
i == len(word) and j == len(abbr)Correctness
The algorithm maintains this invariant:
After processing abbr[:j], the pointer i is exactly the number of characters consumed from word.
If abbr[j] is a letter, a valid abbreviation must match that exact character in word. The algorithm checks this and advances both pointers by one.
If abbr[j] is a number, a valid abbreviation must skip exactly that many characters in word. The algorithm parses the complete number and advances i by that amount.
A leading zero is rejected immediately because abbreviation numbers cannot have leading zeroes.
When the loop ends, the abbreviation is valid only if both word and abbr are fully consumed. If either pointer has leftover characters, then the abbreviation used too few or too many characters.
Therefore, the algorithm returns True exactly when abbr is a valid abbreviation of word.
Complexity
| Metric | Value | Why |
|---|---|---|
| Time | O(m + n) | Each character in word and abbr is processed at most once |
| Space | O(1) | Only pointers and a number accumulator are used |
Here:
| Symbol | Meaning |
|---|---|
m | Length of word |
n | Length of abbr |
Implementation
class Solution:
def validWordAbbreviation(self, word: str, abbr: str) -> bool:
i = 0
j = 0
while i < len(word) and j < len(abbr):
if abbr[j].isdigit():
if abbr[j] == "0":
return False
value = 0
while j < len(abbr) and abbr[j].isdigit():
value = value * 10 + int(abbr[j])
j += 1
i += value
else:
if word[i] != abbr[j]:
return False
i += 1
j += 1
return i == len(word) and j == len(abbr)Code Explanation
We start with two pointers:
i = 0
j = 0The pointer i tracks where we are in word.
The pointer j tracks where we are in abbr.
The main loop runs while both strings still have characters available:
while i < len(word) and j < len(abbr):If the abbreviation character is a digit:
if abbr[j].isdigit():we first reject leading zero:
if abbr[j] == "0":
return FalseThen we parse the full number:
value = 0
while j < len(abbr) and abbr[j].isdigit():
value = value * 10 + int(abbr[j])
j += 1After parsing the number, we skip that many characters in word:
i += valueIf the abbreviation character is a letter, it must match the current word character:
if word[i] != abbr[j]:
return FalseThen both pointers move forward:
i += 1
j += 1At the end, both strings must be fully consumed:
return i == len(word) and j == len(abbr)Testing
def test_valid_word_abbreviation():
s = Solution()
assert s.validWordAbbreviation("internationalization", "i12iz4n") is True
assert s.validWordAbbreviation("apple", "a2e") is False
assert s.validWordAbbreviation("substitution", "s10n") is True
assert s.validWordAbbreviation("substitution", "sub4u4") is True
assert s.validWordAbbreviation("substitution", "12") is True
assert s.validWordAbbreviation("substitution", "substitution") is True
assert s.validWordAbbreviation("substitution", "s010n") is False
assert s.validWordAbbreviation("substitution", "s0ubstitution") is False
assert s.validWordAbbreviation("word", "4") is True
assert s.validWordAbbreviation("word", "5") is False
print("all tests passed")Test Notes
| Test | Why |
|---|---|
"internationalization", "i12iz4n" | Standard valid example |
"apple", "a2e" | Standard invalid example |
"substitution", "s10n" | Large middle skip |
"substitution", "sub4u4" | Multiple skips and letters |
"substitution", "12" | Whole word skipped |
"substitution", "substitution" | No abbreviation |
"s010n" | Leading zero is invalid |
"s0ubstitution" | Zero-length replacement is invalid |
"word", "4" | Exact full skip |
"word", "5" | Skip goes past the word |