LeetCode 408: Valid Word Abbreviation

Problem Restatement

We are given a word and an abbreviation.

The abbreviation may contain lowercase letters and digits.

A digit sequence means we skip that many characters in the original word.

We must return whether the abbreviation correctly represents the word.

Numbers cannot have leading zeroes. For example, "01" is invalid.

The official problem describes abbreviations as replacing non-adjacent, non-empty substrings with their lengths. It gives examples such as "s10n" and "sub4u4" as valid abbreviations of "substitution", while "s010n" and "s0ubstitution" are invalid because of leading zero and zero-length replacement rules.

Input and Output

Item	Meaning
Input	A string `word` and a string `abbr`
Output	`True` if `abbr` is valid for `word`, otherwise `False`
Letters	Must match the current character in `word`
Numbers	Skip that many characters in `word`
Leading zero	Invalid

Example function shape:

def validWordAbbreviation(word: str, abbr: str) -> bool:
    ...

Examples

Example 1:

word = "internationalization"
abbr = "i12iz4n"

The answer is:

True

Explanation:

i 12 iz 4 n

means:

match "i"
skip 12 characters
match "i"
match "z"
skip 4 characters
match "n"

This consumes the whole word correctly.

Example 2:

word = "apple"
abbr = "a2e"

The answer is:

False

Explanation:

a 2 e

means:

match "a"
skip 2 characters
match "e"

After matching "a" and skipping "pp", the next character in the word is "l", not "e".

So the abbreviation is invalid.

First Thought: Expand the Abbreviation

One possible approach is to expand the abbreviation into a pattern and compare it with the word.

For example:

abbr = "i12iz4n"

could mean:

i + 12 skipped characters + iz + 4 skipped characters + n

But we do not know the skipped characters from the abbreviation alone.

We only need to verify the abbreviation against the original word, so expansion is unnecessary.

Key Insight

Use two pointers.

One pointer walks through word.

The other pointer walks through abbr.

Let:

i = index in word
j = index in abbr

When abbr[j] is a letter, it must match word[i].

When abbr[j] is a digit, parse the whole number and move i forward by that amount.

This directly simulates what the abbreviation means.

Algorithm

Initialize:

i = 0
j = 0

Then process abbr from left to right.

While both pointers are within range:

If abbr[j] is a digit:
- If it is "0", return False.
- Parse the full number.
- Move i forward by that number.
Otherwise:
- Check that i is still inside word.
- Check that word[i] == abbr[j].
- Move both pointers forward.

At the end, both strings must be fully consumed:

i == len(word) and j == len(abbr)

Correctness

The algorithm maintains this invariant:

After processing abbr[:j], the pointer i is exactly the number of characters consumed from word.

If abbr[j] is a letter, a valid abbreviation must match that exact character in word. The algorithm checks this and advances both pointers by one.

If abbr[j] is a number, a valid abbreviation must skip exactly that many characters in word. The algorithm parses the complete number and advances i by that amount.

A leading zero is rejected immediately because abbreviation numbers cannot have leading zeroes.

When the loop ends, the abbreviation is valid only if both word and abbr are fully consumed. If either pointer has leftover characters, then the abbreviation used too few or too many characters.

Therefore, the algorithm returns True exactly when abbr is a valid abbreviation of word.

Complexity

Metric	Value	Why
Time	`O(m + n)`	Each character in `word` and `abbr` is processed at most once
Space	`O(1)`	Only pointers and a number accumulator are used

Here:

Symbol	Meaning
`m`	Length of `word`
`n`	Length of `abbr`

Implementation

class Solution:
    def validWordAbbreviation(self, word: str, abbr: str) -> bool:
        i = 0
        j = 0

        while i < len(word) and j < len(abbr):
            if abbr[j].isdigit():
                if abbr[j] == "0":
                    return False

                value = 0

                while j < len(abbr) and abbr[j].isdigit():
                    value = value * 10 + int(abbr[j])
                    j += 1

                i += value
            else:
                if word[i] != abbr[j]:
                    return False

                i += 1
                j += 1

        return i == len(word) and j == len(abbr)

Code Explanation

We start with two pointers:

i = 0
j = 0

The pointer i tracks where we are in word.

The pointer j tracks where we are in abbr.

The main loop runs while both strings still have characters available:

while i < len(word) and j < len(abbr):

If the abbreviation character is a digit:

if abbr[j].isdigit():

we first reject leading zero:

if abbr[j] == "0":
    return False

Then we parse the full number:

value = 0

while j < len(abbr) and abbr[j].isdigit():
    value = value * 10 + int(abbr[j])
    j += 1

After parsing the number, we skip that many characters in word:

i += value

If the abbreviation character is a letter, it must match the current word character:

if word[i] != abbr[j]:
    return False

Then both pointers move forward:

i += 1
j += 1

At the end, both strings must be fully consumed:

return i == len(word) and j == len(abbr)

Testing

def test_valid_word_abbreviation():
    s = Solution()

    assert s.validWordAbbreviation("internationalization", "i12iz4n") is True
    assert s.validWordAbbreviation("apple", "a2e") is False
    assert s.validWordAbbreviation("substitution", "s10n") is True
    assert s.validWordAbbreviation("substitution", "sub4u4") is True
    assert s.validWordAbbreviation("substitution", "12") is True
    assert s.validWordAbbreviation("substitution", "substitution") is True
    assert s.validWordAbbreviation("substitution", "s010n") is False
    assert s.validWordAbbreviation("substitution", "s0ubstitution") is False
    assert s.validWordAbbreviation("word", "4") is True
    assert s.validWordAbbreviation("word", "5") is False

    print("all tests passed")

Test Notes

Test	Why
`"internationalization"`, `"i12iz4n"`	Standard valid example
`"apple"`, `"a2e"`	Standard invalid example
`"substitution"`, `"s10n"`	Large middle skip
`"substitution"`, `"sub4u4"`	Multiple skips and letters
`"substitution"`, `"12"`	Whole word skipped
`"substitution"`, `"substitution"`	No abbreviation
`"s010n"`	Leading zero is invalid
`"s0ubstitution"`	Zero-length replacement is invalid
`"word"`, `"4"`	Exact full skip
`"word"`, `"5"`	Skip goes past the word