A clear explanation of finding the longest buildable word using sorting and a hash set.
Problem Restatement
We are given an array of strings:
wordsWe need to find the longest word that can be built one character at a time by other words in the array.
A word is valid if every prefix formed by removing the last character also exists in the array.
For example:
"world"is valid only if all these words exist:
"w"
"wo"
"wor"
"worl"
"world"If there are multiple answers with the same length, return the lexicographically smallest one.
If no valid answer exists, return the empty string.
The official statement defines the valid word condition exactly this way and asks for the longest such word. (leetcode.com)
Input and Output
| Item | Meaning |
|---|---|
| Input | Array of strings words |
| Output | Longest buildable word |
| Valid condition | Every prefix must also exist |
| Tie rule | Return lexicographically smallest |
| No valid word | Return "" |
The function shape is:
class Solution:
def longestWord(self, words: list[str]) -> str:
...Examples
Example 1:
words = ["w","wo","wor","worl","world"]The word:
"world"is valid because all prefixes exist.
Output:
"world"Example 2:
words = ["a","banana","app","appl","ap","apply","apple"]Both:
"apple"
"apply"are valid.
They have the same length.
Lexicographically:
"apple" < "apply"So the answer is:
"apple"First Thought: Check Every Prefix
For every word:
- Generate all prefixes.
- Check whether each prefix exists in the dictionary.
For example:
"apple"requires checking:
"a"
"ap"
"app"
"appl"This works.
The main requirement is fast prefix lookup.
Key Insight
Use a hash set for constant-time word lookup.
If we store all words in:
word_setthen checking whether a prefix exists becomes:
prefix in word_setWe can directly test every word.
While scanning:
- Prefer longer valid words.
- For equal lengths, prefer lexicographically smaller words.
Algorithm
- Convert
wordsinto a set. - Initialize:
answer = "" - For every word:
- Check whether all prefixes exist.
- If valid:
- Update the answer if:
- The word is longer, or
- The same length but lexicographically smaller.
- Update the answer if:
- Return the answer.
To check validity:
For every prefix length from 1 to len(word) - 1:
word[:i]must exist in the set.
Correctness
The algorithm checks every word in the input.
A word is accepted only if every shorter prefix exists in the set. Therefore every accepted word satisfies the required buildability condition.
If a word is missing even one prefix, the algorithm rejects it. Therefore no invalid word is accepted.
Among all valid words, the algorithm keeps the best candidate according to the problem rules. It replaces the current answer when it finds:
- A longer valid word.
- A lexicographically smaller valid word of the same length.
Thus after processing all words, the stored answer is exactly the longest valid word, with lexicographical tie-breaking handled correctly.
Complexity
Let:
n = number of words
m = maximum word length| Metric | Value | Why |
|---|---|---|
| Time | O(n * m^2) | Each prefix slice may cost O(m) |
| Space | O(nm) | Store all words in the hash set |
The practical runtime is fast because word lengths are small in the problem constraints.
Implementation
class Solution:
def longestWord(self, words: list[str]) -> str:
word_set = set(words)
answer = ""
for word in words:
valid = True
for i in range(1, len(word)):
if word[:i] not in word_set:
valid = False
break
if valid:
if (
len(word) > len(answer)
or (
len(word) == len(answer)
and word < answer
)
):
answer = word
return answerCode Explanation
Store all words for fast lookup:
word_set = set(words)Track the current best answer:
answer = ""For each word, assume it is valid:
valid = TrueThen check all shorter prefixes:
for i in range(1, len(word)):If a prefix is missing:
if word[:i] not in word_set:the word cannot be built step by step:
valid = False
breakIf the word is valid, update the answer if it is better:
if (
len(word) > len(answer)
or (
len(word) == len(answer)
and word < answer
)
):
answer = wordFinally:
return answerAlternative: Sort and Build Incrementally
Another elegant approach:
- Sort words lexicographically.
- Maintain a set of buildable words.
- A word is buildable if:
- Its length is
1, or - Its prefix without the last character is already buildable.
- Its length is
class Solution:
def longestWord(self, words: list[str]) -> str:
words.sort()
buildable = set()
answer = ""
for word in words:
if len(word) == 1 or word[:-1] in buildable:
buildable.add(word)
if len(word) > len(answer):
answer = word
return answerThis works because sorting lexicographically automatically handles tie-breaking.
Alternative Complexity
| Metric | Value |
|---|---|
| Time | O(n log n + nm) |
| Space | O(nm) |
This version is often considered cleaner.
Example Walkthrough
Use:
words = ["a","banana","app","appl","ap","apply","apple"]Check:
"banana"Missing prefixes:
"b"
"ba"
...So it is invalid.
Check:
"apple"Prefixes:
"a"
"ap"
"app"
"appl"All exist.
So "apple" is valid.
Check:
"apply"Prefixes:
"a"
"ap"
"app"
"appl"All exist.
Both "apple" and "apply" have length 5.
Lexicographically:
"apple" < "apply"So the answer remains:
"apple"Trie Perspective
This problem is also naturally solved with a trie.
In a trie:
- Each node represents a prefix.
- A word is valid only if every traversed node is marked as a complete word.
Trie solutions are useful when:
- Many prefix operations are needed.
- The dictionary is large.
- We want efficient incremental prefix traversal.
For this problem, the hash set solution is shorter and simpler.
Testing
def test_longest_word():
s = Solution()
assert s.longestWord(
["w","wo","wor","worl","world"]
) == "world"
assert s.longestWord(
["a","banana","app","appl","ap","apply","apple"]
) == "apple"
assert s.longestWord(
["b","br","bre","brea","break","breakf","breakfa"]
) == "breakfa"
assert s.longestWord(
["a","b","ab","abc"]
) == "abc"
assert s.longestWord(
["banana"]
) == ""
assert s.longestWord(
["a","ar","art","arts","arty"]
) == "arts"
print("all tests passed")
test_longest_word()Test coverage:
| Test | Why |
|---|---|
| Standard chain | Confirms step-by-step buildability |
| Lexicographical tie | Confirms tie-breaking rule |
| Long valid chain | Confirms repeated prefix checking |
| Multiple starting letters | Confirms independent chains |
| No valid long word | Confirms empty result |
| Same-length candidates | Confirms smallest lexicographical answer |