7.12 Binary Search Trees
Before studying specific balancing algorithms such as AVL trees and red-black trees, it is worth understanding the structure they are trying to preserve.
7.12 Binary Search Trees
Before studying specific balancing algorithms such as AVL trees and red-black trees, it is worth understanding the structure they are trying to preserve.
The binary search tree (BST) is one of the most influential data structures in computer science. Many advanced structures are refinements of the BST idea rather than completely new inventions. AVL trees, red-black trees, treaps, splay trees, B-trees, and database indexes all build upon the same fundamental ordering principle.
A BST combines two concepts:
- Binary tree organization
- Ordered key relationships
This combination enables efficient searching, insertion, deletion, predecessor queries, successor queries, and range operations.
Problem
You need a dynamic ordered collection.
The collection should support:
- Fast lookup
- Fast insertion
- Fast deletion
- Ordered traversal
A simple array provides ordered traversal but expensive updates.
A linked list provides efficient local updates but slow searches.
The binary search tree attempts to provide both ordering and efficient navigation.
Understanding the BST Property
A binary search tree satisfies a simple rule.
For every node:
All values in the left subtree
are smaller.
All values in the right subtree
are larger.
Example:
50
/ \\n 25 75
/ \ / \\n 10 40 60 90
Observe:
10 < 25 < 40
and:
60 < 75 < 90
and:
everything left of 50 < 50
everything right of 50 > 50
The property applies recursively throughout the tree.
This recursive ordering is the foundation of every operation.
Tree Nodes
A typical BST node contains:
type Node struct {
Key int
Left *Node
Right *Node
}
Each node acts as a decision point.
Searching compares the target key with the current node and chooses one of two possible directions.
This divide-and-conquer behavior is what gives BSTs their efficiency.
Searching
Consider:
50
/ \\n 25 75
/ \ / \\n 10 40 60 90
Search for:
60
Start at:
50
Since:
60 > 50
move right.
75
Since:
60 < 75
move left.
60
Target found.
Path:
50 → 75 → 60
Only three nodes were examined.
Large portions of the tree were ignored.
Search Implementation
func Search(
root *Node,
key int,
) *Node {
for root != nil {
if key == root.Key {
return root
}
if key < root.Key {
root = root.Left
} else {
root = root.Right
}
}
return nil
}
The algorithm repeatedly discards half of the remaining search space.
Conceptually, this resembles binary search on an array.
The difference is that the ordering is represented through links rather than positions.
Insertion
Suppose the tree contains:
50
/ \\n 25 75
Insert:
60
Search path:
50 → 75
Since:
60 < 75
insert as the left child.
Result:
50
/ \\n 25 75
/
60
Insertion follows exactly the same decision process as search.
The difference is that the algorithm continues until it reaches an empty position.
Recursive Insertion
func Insert(
root *Node,
key int,
) *Node {
if root == nil {
return &Node{Key: key}
}
if key < root.Key {
root.Left =
Insert(root.Left, key)
} else if key > root.Key {
root.Right =
Insert(root.Right, key)
}
return root
}
The recursive structure mirrors the recursive definition of the tree itself.
This makes BST operations particularly elegant.
In-Order Traversal
One remarkable property of BSTs is that an in-order traversal automatically produces sorted output.
Traversal order:
Left
Node
Right
Example:
50
/ \\n 25 75
/ \ / \\n 10 40 60 90
Traversal:
10
25
40
50
60
75
90
Notice:
sorted order
No additional sorting is required.
This property explains why BSTs are useful for ordered collections.
Traversal Implementation
func InOrder(root *Node) {
if root == nil {
return
}
InOrder(root.Left)
fmt.Println(root.Key)
InOrder(root.Right)
}
The output appears in ascending order because of the BST property.
Finding the Minimum
The smallest value always lies at the leftmost node.
Example:
50
/ \\n 25 75
/ \\n 10 40
Move left repeatedly:
50
25
10
Result:
10
Implementation:
func Min(root *Node) *Node {
for root.Left != nil {
root = root.Left
}
return root
}
Finding the Maximum
The largest value lies at the rightmost node.
Implementation:
func Max(root *Node) *Node {
for root.Right != nil {
root = root.Right
}
return root
}
The logic mirrors minimum search.
Predecessors and Successors
Suppose:
10 20 30 40 50 60 70
Predecessor of:
50
is:
40
Successor:
60
BSTs support these operations naturally because ordering information is explicitly represented.
This capability is essential for schedulers, event systems, and database indexes.
Deletion
Deletion is the most complex BST operation.
Three cases exist.
Case 1: Leaf
50
/
25
Delete:
25
Simply remove the node.
Case 2: One Child
50
/
25
\\n 40
Delete:
25
Promote:
40
Case 3: Two Children
50
/ \\n 25 75
Delete:
50
Replace with either:
largest value in left subtree
or:
smallest value in right subtree
The replacement preserves ordering.
Why BSTs Work
The BST property partitions the search space.
At every node:
key < current
or:
key > current
One entire subtree can therefore be ignored.
This repeated elimination is identical in spirit to binary search.
The structure stores ordering information directly inside the topology of the tree.
The Fundamental Weakness
Consider inserting:
10
20
30
40
50
60
70
The tree becomes:
10
\\n 20
\\n 30
\\n 40
\\n 50
\\n 60
\\n 70
Height:
7
Search complexity:
O(n)
The binary search tree has effectively become a linked list.
This problem motivates balancing algorithms.
BSTs Versus Sorted Arrays
| Feature | Sorted Array | BST |
|---|---|---|
| Search | O(log n) | O(log n) average |
| Insert | O(n) | O(log n) average |
| Delete | O(n) | O(log n) average |
| Ordered traversal | O(n) | O(n) |
| Memory locality | Excellent | Moderate |
BSTs trade memory locality for dynamic updates.
When updates are frequent, this trade-off is often worthwhile.
Complexity Analysis
For a balanced tree:
| Operation | Complexity |
|---|---|
| Search | O(log n) |
| Insert | O(log n) |
| Delete | O(log n) |
| Min | O(log n) |
| Max | O(log n) |
For a degenerate tree:
| Operation | Complexity |
|---|---|
| Search | O(n) |
| Insert | O(n) |
| Delete | O(n) |
Performance depends entirely on height.
Common Mistakes
A common misconception is that all binary search trees automatically provide logarithmic performance.
Only well-shaped trees provide logarithmic behavior.
Another mistake is forgetting that insertion order affects structure. The same set of values can produce dramatically different trees depending on insertion sequence.
A third mistake is confusing tree traversal order with insertion order. In-order traversal produces sorted output regardless of insertion sequence.
Finally, many developers underestimate the complexity of deletion. Handling the two-child case correctly requires careful attention to ordering invariants.
Takeaway
The binary search tree is the foundational ordered tree structure. It stores ordering information directly in the tree topology, enabling efficient search, insertion, deletion, and ordered traversal. Its power comes from recursively partitioning the search space at every node. Its weakness is sensitivity to shape. When the tree becomes unbalanced, logarithmic performance disappears. The next sections introduce self-balancing variants that preserve the elegance of BST operations while guaranteeing logarithmic height.