A trie-based design for returning the top three historical sentences for a typed prefix.
Problem Restatement
We need to design an autocomplete system.
The system starts with historical sentences and their usage counts.
For every typed character except #, return up to three historical sentences that start with the current prefix.
The result must be sorted by:
| Priority | Rule |
|---|---|
| First | Higher frequency first |
| Second | ASCII order if frequencies are equal |
When the user types #, the current sentence is finished. The system stores it, increases its frequency by 1, resets the current input, and returns an empty list.
Input and Output
Class shape:
```python
class AutocompleteSystem:
    def __init__(self, sentences: list[str], times: list[int]):
        ...

    def input(self, c: str) -> list[str]:
        ...
```

Rules:
| Item | Meaning |
|---|---|
| `sentences[i]` | Historical sentence |
| `times[i]` | How many times that sentence was typed |
| `input(c)` | Process one typed character |
| `#` | End of current sentence |
| Return value | Top 3 matching sentences, or `[]` for `#` |
Example
Initialization:
```python
sentences = ["i love you", "island", "iroman", "i love leetcode"]
times = [5, 3, 2, 2]
system = AutocompleteSystem(sentences, times)
```

Input:

```python
system.input("i")
```

The current prefix is `"i"`. Matching sentences:
| Sentence | Frequency |
|---|---|
| `"i love you"` | 5 |
| `"island"` | 3 |
| `"iroman"` | 2 |
| `"i love leetcode"` | 2 |
The top 3 are:
```python
["i love you", "island", "i love leetcode"]
```

`"i love leetcode"` comes before `"iroman"` because both have frequency 2, and ASCII order puts the space before `r`.
Next:
```python
system.input(" ")
```

The current prefix is `"i "`. Matching sentences:

```python
["i love you", "i love leetcode"]
```

Next:

```python
system.input("a")
```

The current prefix is `"i a"`. No historical sentence starts with `"i a"`, so the result is:

```python
[]
```

Next:

```python
system.input("#")
```

The sentence `"i a"` is stored with frequency 1, and the result is:
```python
[]
```

First Thought: Scan Every Sentence
A simple approach is to keep a hash map:
```
sentence -> frequency
```

For each input character, scan every sentence and check whether it starts with the current prefix. Then sort all matches by:

```
(-frequency, sentence)
```

This is easy to implement, but each query may scan all sentences. A trie gives a cleaner prefix-based design.
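To make the baseline concrete, here is a minimal sketch of the scan-everything approach. The class name `BruteForceAutocomplete` is illustrative, not part of the final solution; only the hash map and a linear `startswith` scan are assumed.

```python
class BruteForceAutocomplete:
    """Baseline sketch: scan every stored sentence on each keystroke."""

    def __init__(self, sentences: list[str], times: list[int]) -> None:
        self.freq = dict(zip(sentences, times))  # sentence -> frequency
        self.current = ""

    def input(self, c: str) -> list[str]:
        if c == "#":
            # Finish the sentence: record it and reset the input state.
            self.freq[self.current] = self.freq.get(self.current, 0) + 1
            self.current = ""
            return []
        self.current += c
        # Linear scan over every historical sentence per keystroke.
        matches = [s for s in self.freq if s.startswith(self.current)]
        matches.sort(key=lambda s: (-self.freq[s], s))
        return matches[:3]
```

This already satisfies the ordering rules; the trie design below removes the per-keystroke scan over all sentences.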
Key Insight
A trie stores strings by prefix.
Each node represents one prefix. For example, after typing:
```
"i "
```

we can walk from the trie root through `'i'`, then through `' '`. The node we reach represents all sentences that start with `"i "`.

To make retrieval fast and simple, store at each trie node a map of all sentences that share that prefix:

```
sentence -> frequency
```

Then `input(c)` only needs to:
- Extend the current prefix.
- Move to the trie node for that prefix.
- Sort that node’s sentence map by frequency and ASCII order.
- Return the first three sentences.
When a completed sentence is added, we insert it into the trie and increase its count at every prefix node.
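The same per-prefix bookkeeping can be sketched without an explicit trie: map every prefix string directly to its sentence counts. This hypothetical `PrefixMap` is not the article's solution, but it shows exactly what each trie node's `counts` map will contain.

```python
from collections import defaultdict


class PrefixMap:
    """Flat equivalent of the trie idea: prefix string -> {sentence: count}."""

    def __init__(self) -> None:
        self.by_prefix = defaultdict(dict)

    def insert(self, sentence: str, count: int) -> None:
        # Every prefix of the sentence gets the sentence added to its bucket.
        for end in range(1, len(sentence) + 1):
            bucket = self.by_prefix[sentence[:end]]
            bucket[sentence] = bucket.get(sentence, 0) + count

    def top3(self, prefix: str) -> list[str]:
        bucket = self.by_prefix.get(prefix, {})
        ordered = sorted(bucket.items(), key=lambda kv: (-kv[1], kv[0]))
        return [s for s, _ in ordered[:3]]
```

A trie stores the same buckets, but shares prefix structure between sentences instead of materializing every prefix as a dictionary key.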
Data Structures
Each trie node stores:
```
children
counts
```

| Field | Meaning |
|---|---|
| `children` | Character to next trie node |
| `counts` | Sentences under this prefix and their frequencies |
The system stores:
| Field | Meaning |
|---|---|
| `root` | Root trie node |
| `current` | Current typed prefix |
| `current_node` | Trie node for current prefix, or `None` if no match exists |
Using current_node avoids restarting from the root for every typed character.
Algorithm
Constructor:
- Create the trie root.
- For each `(sentence, time)` pair, insert the sentence with count `time`.
- Set the current prefix to empty.
- Set the current trie node to the root.
Insert helper:
- Start at the root.
- For each character in the sentence:
  - Create the child node if missing.
  - Move to the child.
  - Add the sentence count to this node's `counts`.
`input(c)`:

- If `c == "#"`:
  - Insert the current sentence with count 1.
  - Reset the current prefix.
  - Reset the current node to the root.
  - Return `[]`.
- Otherwise:
  - Append `c` to the current prefix.
  - If `current_node` is `None`, return `[]`.
  - If `c` is not a child of `current_node`, set `current_node = None` and return `[]`.
  - Move to the child node.
  - Sort candidates by `(-count, sentence)`.
  - Return the first three sentences.
Implementation
```python
class TrieNode:
    def __init__(self):
        self.children = {}
        self.counts = {}


class AutocompleteSystem:
    def __init__(self, sentences: list[str], times: list[int]):
        self.root = TrieNode()
        self.current = ""
        self.current_node = self.root
        for sentence, count in zip(sentences, times):
            self._insert(sentence, count)

    def input(self, c: str) -> list[str]:
        if c == "#":
            self._insert(self.current, 1)
            self.current = ""
            self.current_node = self.root
            return []
        self.current += c
        if self.current_node is None:
            return []
        if c not in self.current_node.children:
            self.current_node = None
            return []
        self.current_node = self.current_node.children[c]
        candidates = self.current_node.counts.items()
        ordered = sorted(candidates, key=lambda item: (-item[1], item[0]))
        return [sentence for sentence, count in ordered[:3]]

    def _insert(self, sentence: str, count: int) -> None:
        node = self.root
        for ch in sentence:
            if ch not in node.children:
                node.children[ch] = TrieNode()
            node = node.children[ch]
            node.counts[sentence] = node.counts.get(sentence, 0) + count
```

Code Explanation
Each trie node has:
```python
self.children = {}
self.counts = {}
```

The `children` dictionary stores outgoing edges by character. The `counts` dictionary stores every historical sentence that has this node's prefix, with its frequency.
The constructor inserts all initial sentences:
```python
for sentence, count in zip(sentences, times):
    self._insert(sentence, count)
```

When inserting a sentence, each prefix node gets its count updated:

```python
node.counts[sentence] = node.counts.get(sentence, 0) + count
```

For example, inserting "island" updates the nodes for:
```
i
is
isl
isla
islan
island
```
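This claim can be verified with a self-contained snippet that re-declares a minimal node and insert helper (equivalent in behavior to `_insert` above):

```python
class TrieNode:
    def __init__(self):
        self.children = {}
        self.counts = {}


def insert(root: TrieNode, sentence: str, count: int) -> None:
    node = root
    for ch in sentence:
        node = node.children.setdefault(ch, TrieNode())
        # Every prefix node along the path records the sentence.
        node.counts[sentence] = node.counts.get(sentence, 0) + count


root = TrieNode()
insert(root, "island", 3)
insert(root, "iroman", 2)

# Walk to the node for the prefix "is": only "island" lives under it.
node = root
for ch in "is":
    node = node.children[ch]
assert node.counts == {"island": 3}

# The node for "i" holds both sentences.
assert root.children["i"].counts == {"island": 3, "iroman": 2}
```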
When the user types a normal character, the system extends the current prefix:

```python
self.current += c
```

If the current trie path no longer exists, no sentence can match this prefix:

```python
if c not in self.current_node.children:
    self.current_node = None
    return []
```

If the node exists, we sort its candidates:

```python
ordered = sorted(candidates, key=lambda item: (-item[1], item[0]))
```

This sorts by higher frequency first and then by smaller ASCII order.
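The composite key can be checked in isolation, using the frequencies from the example above:

```python
candidates = {
    "i love you": 5,
    "island": 3,
    "iroman": 2,
    "i love leetcode": 2,
}
# Negating the count sorts descending by frequency; the sentence itself
# breaks ties in ascending ASCII order (space 0x20 sorts before 'r').
ordered = sorted(candidates.items(), key=lambda item: (-item[1], item[0]))
top3 = [sentence for sentence, _ in ordered[:3]]
assert top3 == ["i love you", "island", "i love leetcode"]
```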
When `#` is typed, the current sentence is added:

```python
self._insert(self.current, 1)
```

Then the input state resets.
Correctness
Every inserted sentence is added to each trie node corresponding to one of its prefixes. Therefore, for any prefix represented by a trie node, that node’s counts dictionary contains exactly the historical sentences that start with that prefix, along with their correct frequencies.
During input, current_node tracks the trie node for the current typed prefix. If no such node exists, then no historical sentence starts with that prefix, so returning an empty list is correct.
If the node exists, all matching sentences are in current_node.counts. Sorting these candidates by negative frequency and then by sentence gives exactly the required ordering: hotter sentences first, and ASCII order for ties. Returning the first three gives the required top three suggestions.
When # is typed, the current sentence is completed. Inserting it with count 1 updates every prefix node for that sentence, so future queries include it with the correct frequency. Resetting the current prefix and current node starts a new input session.
Therefore, all operations satisfy the required autocomplete behavior.
Complexity
Let:
- `L` = length of the inserted sentence
- `M` = number of matching sentences for the current prefix

| Operation | Time | Space |
|---|---|---|
| Constructor | O(total inserted characters) | Trie plus per-node candidate maps |
| `_insert` | O(L) | Stores the sentence in each prefix node |
| `input(c)`, normal character | O(M log M) | Sorting matching candidates |
| `input("#")` | O(L) | Inserts the completed sentence |
The trie may store the same sentence reference in many prefix nodes. This uses more memory, but it makes lookup logic simple.
Alternative: DFS From Prefix Node
Instead of storing all candidate sentences at every node, each node can store only children and an optional terminal sentence count.
Then for every input prefix:
- Move to the prefix node.
- Run DFS below it.
- Collect all completed sentences.
- Sort and return the top three.
This uses less prefix-map memory, but each query may traverse a large subtree.
For this problem, storing prefix candidate maps is usually simpler and fast enough.
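A minimal sketch of the DFS variant follows; the names (`Node`, `end_count`, `top3`) are hypothetical, and only terminal counts are stored at the nodes:

```python
class Node:
    def __init__(self):
        self.children = {}
        self.end_count = 0  # > 0 only at a sentence's final character


def insert(root: Node, sentence: str, count: int) -> None:
    node = root
    for ch in sentence:
        node = node.children.setdefault(ch, Node())
    node.end_count += count


def top3(root: Node, prefix: str) -> list[str]:
    # Walk to the prefix node; a missing edge means no matches.
    node = root
    for ch in prefix:
        if ch not in node.children:
            return []
        node = node.children[ch]
    # DFS below the prefix node, rebuilding full sentences as we go.
    found = []
    stack = [(node, prefix)]
    while stack:
        cur, text = stack.pop()
        if cur.end_count:
            found.append((text, cur.end_count))
        for ch, child in cur.children.items():
            stack.append((child, text + ch))
    found.sort(key=lambda item: (-item[1], item[0]))
    return [s for s, _ in found[:3]]
```

Each node now stores O(1) data, but `top3` must visit the whole subtree under the prefix on every keystroke, which is why the prefix-map design is preferred here.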
Testing
```python
def run_tests():
    system = AutocompleteSystem(
        ["i love you", "island", "iroman", "i love leetcode"],
        [5, 3, 2, 2],
    )
    assert system.input("i") == ["i love you", "island", "i love leetcode"]
    assert system.input(" ") == ["i love you", "i love leetcode"]
    assert system.input("a") == []
    assert system.input("#") == []
    assert system.input("i") == ["i love you", "island", "i love leetcode"]
    assert system.input(" ") == ["i love you", "i love leetcode", "i a"]
    assert system.input("a") == ["i a"]
    assert system.input("#") == []

    system = AutocompleteSystem(["abc", "abb", "abd"], [2, 2, 2])
    assert system.input("a") == ["abb", "abc", "abd"]

    print("all tests passed")


run_tests()
```

Test meaning:
| Test | Why |
|---|---|
| Official-style input sequence | Checks prefix suggestions and # insertion |
| Re-query after insertion | New sentence appears in future results |
| Tie frequency | ASCII order decides ranking |
| Missing prefix | Returns empty list |