
LeetCode 642: Design Search Autocomplete System

A trie-based design for returning the top three historical sentences for a typed prefix.

Problem Restatement

We need to design an autocomplete system.

The system starts with historical sentences and their usage counts.

For every typed character except #, return up to three historical sentences that start with the current prefix.

The result must be sorted by:

  1. Higher frequency first.
  2. ASCII order when frequencies are equal.

When the user types #, the current sentence is finished. The system stores it, increases its frequency by 1, resets the current input, and returns an empty list.

Input and Output

Class shape:

class AutocompleteSystem:

    def __init__(self, sentences: list[str], times: list[int]):
        ...

    def input(self, c: str) -> list[str]:
        ...

Rules:

sentences[i]: a historical sentence
times[i]: how many times that sentence was typed
input(c): process one typed character
#: ends the current sentence
Return value: the top 3 matching sentences, or [] for #

Example

Initialization:

sentences = ["i love you", "island", "iroman", "i love leetcode"]
times = [5, 3, 2, 2]
system = AutocompleteSystem(sentences, times)

Input:

system.input("i")

Current prefix:

"i"

Matching sentences:

"i love you": frequency 5
"island": frequency 3
"iroman": frequency 2
"i love leetcode": frequency 2

The top 3 are:

["i love you", "island", "i love leetcode"]

"i love leetcode" comes before "iroman" because both have frequency 2, and ASCII order puts the space character (code 32) before 'r' (code 114).
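The tie-break can be checked directly in Python, where string comparison is plain lexicographic ASCII comparison:

```python
# The space character (ASCII 32) sorts before 'r' (ASCII 114),
# so "i love leetcode" compares less than "iroman".
assert ord(" ") < ord("r")
assert "i love leetcode" < "iroman"
assert sorted(["iroman", "i love leetcode"]) == ["i love leetcode", "iroman"]
```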

Next:

system.input(" ")

Current prefix:

"i "

Matching sentences:

["i love you", "i love leetcode"]

Next:

system.input("a")

Current prefix:

"i a"

No historical sentence starts with "i a".

So the result is:

[]

Next:

system.input("#")

The sentence "i a" is stored with frequency 1, and the result is:

[]

First Thought: Scan Every Sentence

A simple approach is to keep a hash map:

sentence -> frequency

For each input character, scan every sentence and check whether it starts with the current prefix.

Then sort all matches by:

(-frequency, sentence)

This is easy to implement, but every keystroke scans all N sentences and checks each against the current prefix, roughly O(N · P) work per query for prefix length P. A trie gives a cleaner prefix-based design.
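The brute-force version can be sketched as follows (a hypothetical SimpleAutocomplete class, not the trie design developed below):

```python
# Brute-force sketch: keep a sentence -> frequency map and rescan it
# on every keystroke.
class SimpleAutocomplete:
    def __init__(self, sentences, times):
        self.freq = dict(zip(sentences, times))
        self.current = ""

    def input(self, c):
        if c == "#":
            # Finish the current sentence and bump its frequency.
            self.freq[self.current] = self.freq.get(self.current, 0) + 1
            self.current = ""
            return []
        self.current += c
        # Scan every historical sentence for the current prefix.
        matches = [s for s in self.freq if s.startswith(self.current)]
        matches.sort(key=lambda s: (-self.freq[s], s))
        return matches[:3]
```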

Key Insight

A trie stores strings by prefix.

Each node represents one prefix. For example, after typing:

"i "

we can walk from the trie root through 'i', then through ' '.

The node we reach represents all sentences that start with "i ".

To make retrieval fast and simple, store at each trie node a map of all sentences that share that prefix:

sentence -> frequency

Then input(c) only needs to:

  1. Extend the current prefix.
  2. Move to the trie node for that prefix.
  3. Sort that node’s sentence map by frequency and ASCII order.
  4. Return the first three sentences.

When a completed sentence is added, we insert it into the trie and increase its count at every prefix node.

Data Structures

Each trie node stores:

children: maps a character to the next trie node
counts: maps each sentence under this prefix to its frequency

The system stores:

root: the root trie node
current: the current typed prefix
current_node: the trie node for the current prefix, or None if no match exists

Using current_node avoids restarting from the root for every typed character.

Algorithm

Constructor:

  1. Create the trie root.
  2. For each (sentence, time) pair, insert the sentence with count time.
  3. Set the current prefix to empty.
  4. Set the current trie node to root.

Insert helper:

  1. Start at the root.
  2. For each character in the sentence:
    1. Create the child node if missing.
    2. Move to the child.
    3. Add the sentence count to this node’s counts.

input(c):

  1. If c == "#":
    1. Insert the current sentence with count 1.
    2. Reset current prefix.
    3. Reset current node to root.
    4. Return [].
  2. Otherwise:
    1. Append c to the current prefix.
    2. If current_node is None, return [].
    3. If c is not a child of current_node, set current_node = None and return [].
    4. Move to the child node.
    5. Sort candidates by (-count, sentence).
    6. Return the first three sentences.

Implementation

class TrieNode:
    def __init__(self):
        self.children = {}
        self.counts = {}

class AutocompleteSystem:

    def __init__(self, sentences: list[str], times: list[int]):
        self.root = TrieNode()
        self.current = ""
        self.current_node = self.root

        for sentence, count in zip(sentences, times):
            self._insert(sentence, count)

    def input(self, c: str) -> list[str]:
        if c == "#":
            self._insert(self.current, 1)
            self.current = ""
            self.current_node = self.root
            return []

        self.current += c

        if self.current_node is None:
            return []

        if c not in self.current_node.children:
            self.current_node = None
            return []

        self.current_node = self.current_node.children[c]

        candidates = self.current_node.counts.items()
        ordered = sorted(candidates, key=lambda item: (-item[1], item[0]))

        return [sentence for sentence, count in ordered[:3]]

    def _insert(self, sentence: str, count: int) -> None:
        node = self.root

        for ch in sentence:
            if ch not in node.children:
                node.children[ch] = TrieNode()

            node = node.children[ch]
            node.counts[sentence] = node.counts.get(sentence, 0) + count

Code Explanation

Each trie node has:

self.children = {}
self.counts = {}

The children dictionary stores outgoing edges by character.

The counts dictionary stores every historical sentence that has this node’s prefix.

The constructor inserts all initial sentences:

for sentence, count in zip(sentences, times):
    self._insert(sentence, count)

When inserting a sentence, each prefix node gets its count updated:

node.counts[sentence] = node.counts.get(sentence, 0) + count

For example, inserting "island" updates the nodes for:

i
is
isl
isla
islan
island
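This prefix-by-prefix update can be traced in a standalone snippet (the TrieNode class is repeated here so the snippet runs on its own):

```python
# Walk-through of the insert step for "island" with count 3:
# every prefix node's counts map records the sentence.
class TrieNode:
    def __init__(self):
        self.children = {}
        self.counts = {}

root = TrieNode()
sentence, count = "island", 3

node = root
for ch in sentence:
    node = node.children.setdefault(ch, TrieNode())
    node.counts[sentence] = node.counts.get(sentence, 0) + count

# The node for the prefix "isl" now lists "island" with its frequency.
node = root
for ch in "isl":
    node = node.children[ch]
assert node.counts == {"island": 3}
```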

When the user types a normal character, the system extends the current prefix:

self.current += c

If the current trie path no longer exists, no sentence can match this prefix:

if c not in self.current_node.children:
    self.current_node = None
    return []

If the node exists, we sort its candidates:

ordered = sorted(candidates, key=lambda item: (-item[1], item[0]))

This sorts by higher frequency first, breaking ties in ascending ASCII (lexicographic) order.
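The sort key can be tried in isolation on the candidate map for prefix "i": negating the count gives descending frequency, and Python's tuple comparison then falls through to ascending string order for ties.

```python
# Candidates for prefix "i" from the worked example.
candidates = {"i love you": 5, "island": 3, "iroman": 2, "i love leetcode": 2}

# Sort by (-frequency, sentence): frequency descending, ties in ASCII order.
ordered = sorted(candidates.items(), key=lambda item: (-item[1], item[0]))
top3 = [sentence for sentence, _ in ordered[:3]]
# top3 == ["i love you", "island", "i love leetcode"]
```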

When # is typed, the current sentence is added:

self._insert(self.current, 1)

Then the input state resets.

Correctness

Every inserted sentence is added to each trie node corresponding to one of its prefixes. Therefore, for any prefix represented by a trie node, that node’s counts dictionary contains exactly the historical sentences that start with that prefix, along with their correct frequencies.

During input, current_node tracks the trie node for the current typed prefix. If no such node exists, then no historical sentence starts with that prefix, so returning an empty list is correct.

If the node exists, all matching sentences are in current_node.counts. Sorting these candidates by negated frequency and then by sentence text gives exactly the required ordering: more frequent sentences first, and ASCII order for ties. Returning the first three gives the required top-three suggestions.

When # is typed, the current sentence is completed. Inserting it with count 1 updates every prefix node for that sentence, so future queries include it with the correct frequency. Resetting the current prefix and current node starts a new input session.

Therefore, all operations satisfy the required autocomplete behavior.

Complexity

Let:

L = length of the inserted sentence
M = number of matching sentences for the current prefix
Constructor: O(total inserted characters) time plus the prefix-map updates; builds the trie
_insert: O(L) time; stores the sentence in each of its prefix nodes
input(c) for a normal character: O(M log M) time; sorts the matching candidates
input("#"): O(L) time; inserts the completed sentence

The trie may store the same sentence reference in many prefix nodes. This uses more memory, but it makes lookup logic simple.
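The memory overhead is easy to estimate: a sentence of length L appears in the counts map of each of its L prefix nodes, so the total number of map entries is the sum of the sentence lengths.

```python
# Memory estimate (sketch): total counts entries across all prefix nodes
# equals the sum of sentence lengths.
sentences = ["i love you", "island", "iroman", "i love leetcode"]
total_entries = sum(len(s) for s in sentences)
assert total_entries == 37  # 10 + 6 + 6 + 15
```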

Alternative: DFS From Prefix Node

Instead of storing all candidate sentences at every node, each node can store only children and an optional terminal sentence count.

Then for every input prefix:

  1. Move to the prefix node.
  2. Run DFS below it.
  3. Collect all completed sentences.
  4. Sort and return the top three.

This uses less prefix-map memory, but each query may traverse a large subtree.

For this problem, storing prefix candidate maps is usually simpler and fast enough.
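This alternative can be sketched as follows (the DFSNode, insert, and top3 names are illustrative, not part of the problem statement):

```python
# DFS alternative: nodes keep only children and a terminal count, and
# matches are collected by traversing the subtree below the prefix node.
class DFSNode:
    def __init__(self):
        self.children = {}
        self.count = 0  # > 0 only where a stored sentence ends

def insert(root, sentence, count):
    node = root
    for ch in sentence:
        node = node.children.setdefault(ch, DFSNode())
    node.count += count

def top3(root, prefix):
    node = root
    for ch in prefix:  # walk to the prefix node, if it exists
        if ch not in node.children:
            return []
        node = node.children[ch]

    matches = []

    def dfs(n, path):
        if n.count:
            matches.append((path, n.count))
        for ch, child in n.children.items():
            dfs(child, path + ch)

    dfs(node, prefix)  # collect every completed sentence below the prefix
    matches.sort(key=lambda item: (-item[1], item[0]))
    return [sentence for sentence, _ in matches[:3]]
```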

Testing

def run_tests():
    system = AutocompleteSystem(
        ["i love you", "island", "iroman", "i love leetcode"],
        [5, 3, 2, 2],
    )

    assert system.input("i") == ["i love you", "island", "i love leetcode"]
    assert system.input(" ") == ["i love you", "i love leetcode"]
    assert system.input("a") == []
    assert system.input("#") == []

    assert system.input("i") == ["i love you", "island", "i love leetcode"]
    assert system.input(" ") == ["i love you", "i love leetcode", "i a"]
    assert system.input("a") == ["i a"]
    assert system.input("#") == []

    system = AutocompleteSystem(["abc", "abb", "abd"], [2, 2, 2])
    assert system.input("a") == ["abb", "abc", "abd"]

    print("all tests passed")

run_tests()

Test meaning:

Official-style input sequence: checks prefix suggestions and # insertion
Re-query after insertion: the new sentence appears in later results
Tie frequency: ASCII order decides the ranking
Missing prefix: returns an empty list