A hash map design for checking whether a word's abbreviation is unique in a dictionary.
Problem Restatement
We are given a dictionary of words.
We need to design a class that can answer this query:
isUnique(word)It returns whether word has a unique abbreviation in the dictionary.
The abbreviation rule is:
If the word length is less than 3, the abbreviation is the word itself.
Otherwise:
first letter + number of middle letters + last letterFor example:
"dog" -> "d1g"
"internationalization" -> "i18n"
"it" -> "it"A word is unique if either:
- No dictionary word has the same abbreviation.
- The only dictionary words with that abbreviation are exactly the same as
word.
The source constraints include dictionary size up to 3 * 10^4, word length up to 20, and at most 5000 calls to isUnique.
Input and Output
| Item | Meaning |
|---|---|
| Constructor input | A list of dictionary words |
| Query input | A word |
| Output | True if the word abbreviation is unique, otherwise False |
| Abbreviation | word itself if length < 3, otherwise first + middle_count + last |
Class shape:
class ValidWordAbbr:
def __init__(self, dictionary: list[str]):
...
def isUnique(self, word: str) -> bool:
...Examples
For:
dictionary = ["deer", "door", "cake", "card"]The abbreviations are:
| Word | Abbreviation |
|---|---|
"deer" | "d2r" |
"door" | "d2r" |
"cake" | "c2e" |
"card" | "c2d" |
Now:
isUnique("dear") -> Falsebecause:
"dear" -> "d2r"and "deer" and "door" already have abbreviation "d2r".
isUnique("cart") -> Truebecause:
"cart" -> "c2t"and no dictionary word has abbreviation "c2t".
isUnique("cane") -> Falsebecause:
"cane" -> "c2e"
"cake" -> "c2e"They are different words with the same abbreviation.
isUnique("make") -> Truebecause no dictionary word has abbreviation "m2e".
First Thought: Scan the Dictionary Every Time
A direct solution stores the dictionary as a list.
For every isUnique(word) query:
- Compute the abbreviation of
word. - Scan every dictionary word.
- If another word has the same abbreviation, return
False. - Otherwise return
True.
Code:
class ValidWordAbbr:
def __init__(self, dictionary: list[str]):
self.dictionary = dictionary
def isUnique(self, word: str) -> bool:
target = self.abbr(word)
for other in self.dictionary:
if other != word and self.abbr(other) == target:
return False
return True
def abbr(self, word: str) -> str:
if len(word) < 3:
return word
return word[0] + str(len(word) - 2) + word[-1]This is correct, but each query scans the whole dictionary.
With many calls to isUnique, this repeats the same abbreviation work again and again.
Key Insight
Precompute a hash map from abbreviation to the set of dictionary words that produce it.
Example:
dictionary = ["deer", "door", "cake", "card"]Build:
{
"d2r": {"deer", "door"},
"c2e": {"cake"},
"c2d": {"card"},
}Then for a query word:
- Compute its abbreviation.
- Look up the set of dictionary words with that abbreviation.
- The word is unique if:
- the abbreviation does not exist, or
- the set contains only that exact word.
This also handles duplicate dictionary entries.
For example:
dictionary = ["door", "door"]The set for "d2r" is:
{"door"}So:
isUnique("door") -> Truebecause there is no different dictionary word with the same abbreviation.
Algorithm
During initialization:
- Create a hash map:
abbr_to_words = {}- For each word in the dictionary:
- Compute its abbreviation.
- Add the word to the set for that abbreviation.
For isUnique(word):
- Compute
key = abbr(word). - If
keydoes not exist in the map, returnTrue. - Otherwise, return
Trueonly if the set is exactly{word}.
Correctness
For each abbreviation, the hash map stores exactly the set of dictionary words that produce that abbreviation.
When isUnique(word) is called, any possible conflict must come from dictionary words stored under the same abbreviation as word.
If the abbreviation is absent from the map, no dictionary word has that abbreviation, so word is unique.
If the abbreviation exists and its set contains only word, then every dictionary word with that abbreviation is the same as word. So there is no other word from the dictionary with the same abbreviation.
If the set contains any word different from word, then another dictionary word shares the abbreviation, so word is not unique.
Therefore the algorithm returns exactly the required answer.
Complexity
Let:
N = number of words in dictionary
L = maximum word length| Operation | Time | Space | Why |
|---|---|---|---|
| Constructor | O(N * L) | O(N * L) | Builds abbreviations and stores distinct words |
isUnique | O(L) | O(1) | Computes one abbreviation and checks one set |
Since LeetCode word length is at most 20, isUnique is effectively constant time.
Implementation
from collections import defaultdict
class ValidWordAbbr:
def __init__(self, dictionary: list[str]):
self.abbr_to_words = defaultdict(set)
for word in dictionary:
key = self.abbr(word)
self.abbr_to_words[key].add(word)
def isUnique(self, word: str) -> bool:
key = self.abbr(word)
if key not in self.abbr_to_words:
return True
words = self.abbr_to_words[key]
return len(words) == 1 and word in words
def abbr(self, word: str) -> str:
if len(word) < 3:
return word
return word[0] + str(len(word) - 2) + word[-1]Code Explanation
We store abbreviation groups here:
self.abbr_to_words = defaultdict(set)Each abbreviation maps to a set, not a list.
This matters because duplicate dictionary words should not create false conflicts.
For each dictionary word:
key = self.abbr(word)
self.abbr_to_words[key].add(word)we compute its abbreviation and place it in the correct group.
For each query:
key = self.abbr(word)we compute the abbreviation of the query word.
If the abbreviation was never seen:
if key not in self.abbr_to_words:
return Truethen the word is unique.
Otherwise, we inspect the group:
words = self.abbr_to_words[key]The query is unique only when this group contains exactly one distinct word, and that word equals the query:
return len(words) == 1 and word in wordsThe abbreviation helper follows the problem rule:
if len(word) < 3:
return wordOtherwise:
return word[0] + str(len(word) - 2) + word[-1]Testing
def test_valid_word_abbr():
v = ValidWordAbbr(["deer", "door", "cake", "card"])
assert v.isUnique("dear") is False
assert v.isUnique("cart") is True
assert v.isUnique("cane") is False
assert v.isUnique("make") is True
v = ValidWordAbbr(["door", "door"])
assert v.isUnique("door") is True
assert v.isUnique("dear") is False
v = ValidWordAbbr(["it", "is"])
assert v.isUnique("it") is True
assert v.isUnique("if") is True
assert v.isUnique("is") is True
print("all tests passed")
test_valid_word_abbr()Test meaning:
| Test | Why |
|---|---|
"dear" | Same abbreviation as different dictionary words |
"cart" | Abbreviation absent from dictionary |
"cane" | Same abbreviation as "cake" |
"make" | No conflict |
Duplicate "door" entries | Confirms duplicates do not create false conflict |
| Short words | Confirms words of length less than 3 abbreviate to themselves |