Skip to content

2.20 Tries

A trie is a tree structure for storing a set of strings so that common prefixes are shared.

A trie is a tree structure for storing a set of strings so that common prefixes are shared. Each edge corresponds to a symbol, and each node represents a prefix. Tries support fast insertion, lookup, prefix queries, and lexicographic traversal.

Problem

Maintain a dynamic set of strings and support:

Insert(s)
Contains(s)
HasPrefix(p)
List all strings with prefix p

A naive approach compares strings directly and costs O(length) per comparison. A trie also costs O(length) per operation but can answer prefix queries efficiently and avoid repeated prefix comparisons across many strings.

Node Structure

For lowercase ASCII letters, each node can have up to 26 children.

type Node struct {
	next [26]*Node
	end  bool
}

type Trie struct {
	root *Node
}

func NewTrie() *Trie {
	return &Trie{root: &Node{}}
}

end marks that a complete word ends at this node.

Insert

func (t *Trie) Insert(s string) {
	cur := t.root

	for i := 0; i < len(s); i++ {
		idx := s[i] - 'a'

		if cur.next[idx] == nil {
			cur.next[idx] = &Node{}
		}

		cur = cur.next[idx]
	}

	cur.end = true
}

Invariant:

After processing s[0:i], cur represents that prefix in the trie

Search Exact String

func (t *Trie) Contains(s string) bool {
	cur := t.root

	for i := 0; i < len(s); i++ {
		idx := s[i] - 'a'

		if cur.next[idx] == nil {
			return false
		}

		cur = cur.next[idx]
	}

	return cur.end
}

The end flag ensures that "app" is distinguished from "apple".

Prefix Query

func (t *Trie) HasPrefix(p string) bool {
	cur := t.root

	for i := 0; i < len(p); i++ {
		idx := p[i] - 'a'

		if cur.next[idx] == nil {
			return false
		}

		cur = cur.next[idx]
	}

	return true
}

Invariant:

If traversal reaches depth i, prefix p[0:i] exists in the trie

Collect Words with a Prefix

First reach the node representing the prefix, then perform DFS.

func (t *Trie) WordsWithPrefix(p string) []string {
	cur := t.root

	for i := 0; i < len(p); i++ {
		idx := p[i] - 'a'

		if cur.next[idx] == nil {
			return nil
		}

		cur = cur.next[idx]
	}

	var out []string
	var dfs func(*Node, []byte)

	dfs = func(n *Node, path []byte) {
		if n.end {
			out = append(out, string(path))
		}

		for i := 0; i < 26; i++ {
			if n.next[i] != nil {
				dfs(n.next[i], append(path, byte('a'+i)))
			}
		}
	}

	dfs(cur, []byte(p))
	return out
}

Invariant:

path always represents the string corresponding to the current node

Complexity

Let L be the length of the string.

Insert:      O(L)
Contains:    O(L)
HasPrefix:   O(L)
Enumeration: O(total output size)
Space:       O(total characters inserted)

Each node corresponds to a unique prefix.

Space Considerations

A dense [26]*Node array is fast but can waste memory when many children are unused.

Alternative: use a map.

type Node struct {
	next map[byte]*Node
	end  bool
}

Trade-off:

array: faster, predictable memory
map:   smaller memory for sparse data

Deletion

To delete a string, clear the end flag and optionally prune nodes.

func (t *Trie) Delete(s string) {
	t.delete(t.root, s, 0)
}

func (t *Trie) delete(n *Node, s string, i int) bool {
	if i == len(s) {
		if !n.end {
			return false
		}
		n.end = false
		return isEmpty(n)
	}

	idx := s[i] - 'a'
	child := n.next[idx]

	if child == nil {
		return false
	}

	shouldDelete := t.delete(child, s, i+1)

	if shouldDelete {
		n.next[idx] = nil
		return !n.end && isEmpty(n)
	}

	return false
}

func isEmpty(n *Node) bool {
	for i := 0; i < 26; i++ {
		if n.next[i] != nil {
			return false
		}
	}
	return true
}

Deletion removes nodes that are no longer needed.

Compressed Trie (Radix Tree)

Instead of one character per edge, store strings on edges.

This reduces height and memory usage when many nodes have a single child.

Example:

"apple"
"app"

can share a single path "app".

Radix trees are more complex but more memory-efficient.

Trie for Integers (Bitwise Trie)

Tries can store integers by bits.

Each level corresponds to a bit position.

Used in problems like:

maximum XOR pair
type BitNode struct {
	next [2]*BitNode
}

Traverse from most significant bit to least significant bit.

Common Pitfalls

Not marking word termination leads to prefix confusion.

Assuming fixed alphabet without validating input can cause index errors.

Using recursion for DFS without considering stack depth for large tries.

Memory usage grows with total inserted characters. Tries are not suitable for very large vocabularies without compression.

Mixing byte-based indexing with Unicode strings produces incorrect results for non-ASCII input.

Design Pattern

Trie operations follow:

  1. Traverse by characters
  2. Create nodes on demand
  3. Use end to mark complete entries
  4. Use DFS for enumeration

The invariant is always that a path from root to a node represents a prefix.

Takeaway

A trie organizes strings by shared prefixes. It provides efficient prefix queries and predictable O(L) operations. Use array-backed nodes for small alphabets and maps for sparse ones. For large datasets, consider compressed tries or alternative structures such as hash tables or suffix arrays.