8.12 Tries

Many search problems involve prefixes.

8.12 Tries

Many search problems involve prefixes.

Examples include:

  • Autocomplete systems
  • Spell checkers
  • Search suggestions
  • Dictionary lookups
  • URL routing
  • IP routing tables
  • DNA sequence matching

A straightforward solution stores strings in a hash table and performs exact lookups. This works well when the entire key is known.

However, prefix-based operations are different.

Suppose a dictionary contains:

apple
application
apply
apt
banana

Given the prefix:

app

we want:

apple
application
apply

Hash tables do not naturally support this operation.

A Trie is a tree specifically designed for prefix queries.

Instead of storing complete strings in a single location, a Trie stores characters along paths. Strings with common prefixes share nodes, making prefix operations efficient and intuitive.

Problem

You need fast prefix lookups, prefix counting, or autocomplete operations on large collections of strings.

Solution

Represent each character as an edge in a tree.

Consider these words:

cat
car
care
dog

The Trie looks like:

root
├── c
│   └── a
│       ├── t
│       └── r
│           └── e
└── d
    └── o
        └── g

The prefix:

ca

appears only once.

The strings:

cat
car
care

share the same initial path.

A simple implementation:

package main

type TrieNode struct {
    Children map[rune]*TrieNode
    End      bool
}

func NewNode() *TrieNode {
    return &TrieNode{
        Children: make(map[rune]*TrieNode),
    }
}

The root node:

root := NewNode()

contains no character itself.

Characters live on edges between nodes.

Discussion

A Trie can be viewed as a tree representation of a set of strings.

Instead of:

"apple"

being stored as one object, it becomes:

a → p → p → l → e

Every node represents a prefix.

For example:

app

corresponds to an actual node inside the structure.

This property makes prefix operations efficient.

Inserting Words

Insertion walks the path and creates missing nodes.

func Insert(
    root *TrieNode,
    word string,
) {
    current := root

    for _, ch := range word {
        child, exists :=
            current.Children[ch]

        if !exists {
            child = NewNode()
            current.Children[ch] = child
        }

        current = child
    }

    current.End = true
}

Usage:

Insert(root, "cat")
Insert(root, "car")
Insert(root, "care")

The shared prefix:

ca

is stored once.

Only the diverging suffixes require additional nodes.

Searching follows the same path.

func Search(
    root *TrieNode,
    word string,
) bool {
    current := root

    for _, ch := range word {
        child, exists :=
            current.Children[ch]

        if !exists {
            return false
        }

        current = child
    }

    return current.End
}

Examples:

Search(root, "cat")

returns:

true

while:

Search(root, "cab")

returns:

false

The traversal follows characters directly.

No subtree exploration is required.

A major advantage of Tries is prefix lookup.

func HasPrefix(
    root *TrieNode,
    prefix string,
) bool {
    current := root

    for _, ch := range prefix {
        child, exists :=
            current.Children[ch]

        if !exists {
            return false
        }

        current = child
    }

    return true
}

Example:

HasPrefix(root, "ca")

returns:

true

because:

cat
car
care

all begin with:

ca

This operation requires only traversal of the prefix itself.

Autocomplete

Once the prefix node is located, every descendant represents a valid completion.

Suppose:

app

points to:

app
├── l
│   ├── e
│   └── y
└── lication

The completions are:

apple
apply
application

Autocomplete therefore becomes:

  1. Find prefix node.
  2. DFS through descendants.
  3. Collect words.

The tree structure makes this process natural.

Counting Prefixes

Many applications need counts.

For example:

How many words begin with "app"?

Store a counter in every node.

type TrieNode struct {
    Children map[rune]*TrieNode
    End      bool
    Count    int
}

During insertion:

current.Count++

Now:

prefix count

becomes:

O(length of prefix)

instead of scanning all words.

Fixed Alphabet Tries

Maps are flexible but have overhead.

For lowercase English letters:

type TrieNode struct {
    Children [26]*TrieNode
    End      bool
}

Index calculation:

index := ch - 'a'

Advantages:

  • Faster access
  • Better cache locality
  • Less allocation overhead

Disadvantages:

  • Wastes memory when branching is sparse

This tradeoff appears frequently in systems programming.

Memory Considerations

Tries trade memory for speed.

Consider:

apple
banana
orange

A hash table stores:

3 strings

A Trie stores:

many nodes

Each node contains child references.

Large dictionaries may require millions of nodes.

The memory cost is often the primary limitation of Trie-based systems.

Compression techniques exist to reduce this overhead.

Compressed Tries

A standard Trie may contain long chains.

Example:

root
 |
 a
 |
 p
 |
 p
 |
 l
 |
 e

Many nodes have only one child.

A compressed Trie merges these chains.

root
 |
apple

This significantly reduces memory usage.

Compressed Tries appear in:

  • Search engines
  • Routing tables
  • Large dictionaries

The idea remains the same.

Only the representation changes.

Tries and Routing

Internet routers perform longest-prefix matching.

Example:

192.168.0.0/16
192.168.1.0/24
192.168.1.128/25

A binary Trie stores bits instead of characters.

0
1
0
1
...

The longest matching path determines routing behavior.

This application demonstrates that Tries are not limited to text.

Any sequence can be stored.

Tries and Dynamic Programming

Many string algorithms combine Tries with DFS or dynamic programming.

Examples:

  • Word Break
  • Dictionary segmentation
  • Multi-pattern matching
  • Text correction

The Trie provides fast prefix discovery.

Dynamic programming handles combinatorial choices.

Together they form powerful text-processing systems.

Comparing Tries and Hash Tables

Operation Hash Table Trie
Exact search O(L) O(L)
Insert O(L) O(L)
Prefix search O(N) O(L)
Autocomplete Poor Excellent
Memory usage Lower Higher

Here:

  • L = string length
  • N = number of stored words

Hash tables excel at exact lookup.

Tries excel at prefix operations.

Complexity Analysis

Let:

  • L = string length

Insertion:

Operation Complexity
Insert O(L)
Search O(L)
Prefix lookup O(L)

Autocomplete:

Operation Complexity
Locate prefix O(L)
Enumerate results O(output size)

Memory:

Structure Complexity
Trie O(total characters)

The exact memory cost depends on branching representation.

Common Mistakes

A common mistake is forgetting the distinction between:

Path exists

and:

Word exists

Suppose only:

care

is inserted.

The path:

car

exists.

But:

Search("car")

should return:

false

unless the terminal marker is set.

Another mistake is using maps when performance is critical and the alphabet is fixed. Arrays often provide substantially better throughput.

A third mistake is storing very large alphabets directly in arrays. Unicode contains many possible characters, making sparse representations preferable.

Finally, some implementations recursively enumerate autocomplete results without limiting output size. A prefix with millions of descendants can generate unexpectedly large responses.

Takeaway

A Trie stores strings as paths in a tree. Shared prefixes become shared branches, making prefix operations efficient and natural. While Tries consume more memory than hash tables, they provide powerful capabilities that hash tables cannot easily support, including autocomplete, prefix counting, longest-prefix matching, and efficient dictionary traversal. The structure appears throughout search systems, networking, language processing, and large-scale text indexing.