Search in a B+ tree where all records are stored in leaves and internal nodes guide traversal.
B+ tree search operates on a multiway balanced tree where internal nodes store only routing keys, and all actual records reside in leaf nodes.
Leaves are linked in sorted order. This design improves range queries and sequential access while keeping search depth low.
Problem
Given a B+ tree root root and a target key x, locate the record associated with x.
If the key does not exist, return null.
Structure
Internal nodes contain sorted separator keys:
and child pointers:
Leaf nodes contain all data entries and are linked left to right.
The key ranges follow:
| child | key range |
|---|---|
c0 | keys smaller than k0 |
ci | keys between k{i-1} and ki |
cm | keys greater than k{m-1} |
Algorithm
Traverse internal nodes to locate the correct leaf. Then search inside the leaf.
bplus_tree_search(node, x):
while node.is_leaf == false:
i = upper_bound(node.keys, x)
node = node.children[i]
i = lower_bound(node.keys, x)
if i < length(node.keys) and node.keys[i] == x:
return node.values[i]
return nullExample
[20 | 40]
/ | \
[5 10] [20 25 30] [40 50 60]Search for 25:
| step | node | action |
|---|---|---|
| 1 | [20, 40] | move to middle child |
| 2 | leaf [20, 25, 30] | find 25 |
Correctness
Internal nodes guide the search by partitioning key ranges. At each level, exactly one child can contain the target key.
All actual keys are stored in leaves. Once the correct leaf is reached, searching that node determines existence.
If the key is found, the returned value matches the target. If it is not found in the leaf, no other node can contain it.
Complexity
Let be the branching factor.
| measure | cost |
|---|---|
| height | |
| node accesses |
Search inside a node:
| method | cost |
|---|---|
| binary search | |
| linear scan |
Total time:
Space complexity:
| version | space |
|---|---|
| iterative | |
| recursive |
When to Use
B+ tree search is widely used in database systems and storage engines.
It is appropriate when:
- range queries and scans are frequent
- data resides on disk or SSD
- high fanout reduces tree height
- sequential leaf traversal is needed
Compared to B trees, B+ trees store all data in leaves and link leaves together, which improves range performance and cache locality.
Implementation
from bisect import bisect_left, bisect_right
class BPlusNode:
def __init__(self, keys=None, children=None, values=None, is_leaf=True):
self.keys = keys or []
self.children = children or []
self.values = values or []
self.is_leaf = is_leaf
def bplus_tree_search(root, x):
node = root
while not node.is_leaf:
i = bisect_right(node.keys, x)
node = node.children[i]
i = bisect_left(node.keys, x)
if i < len(node.keys) and node.keys[i] == x:
return node.values[i]
return Nonetype BPlusNode struct {
Keys []int
Children []*BPlusNode
Values []int
IsLeaf bool
}
func lowerBound(keys []int, x int) int {
lo, hi := 0, len(keys)
for lo < hi {
mid := lo + (hi-lo)/2
if keys[mid] < x {
lo = mid + 1
} else {
hi = mid
}
}
return lo
}
func upperBound(keys []int, x int) int {
lo, hi := 0, len(keys)
for lo < hi {
mid := lo + (hi-lo)/2
if keys[mid] <= x {
lo = mid + 1
} else {
hi = mid
}
}
return lo
}
func BPlusTreeSearch(root *BPlusNode, x int) (int, bool) {
node := root
for !node.IsLeaf {
i := upperBound(node.Keys, x)
node = node.Children[i]
}
i := lowerBound(node.Keys, x)
if i < len(node.Keys) && node.Keys[i] == x {
return node.Values[i], true
}
return 0, false
}