7.17 B-Trees

Binary search trees work well in memory, where following a pointer from one node to another is relatively cheap.

7.17 B-Trees

Binary search trees work well in memory, where following a pointer from one node to another is relatively cheap. Storage systems have a different cost model. Reading one value at a time from disk or flash is expensive compared with reading a block of adjacent values.

B-trees solve this problem by making each node hold many keys and many children. Instead of branching two ways at each node, a B-tree branches many ways. This reduces height dramatically and turns searches into a small number of block reads.

That design explains why B-trees and their variants dominate database indexes, filesystems, embedded key-value stores, and storage engines.

Problem

Suppose you have a billion records stored on disk.

A balanced binary search tree has height:

O(log₂ n)

For one billion keys, that is roughly thirty levels.

In memory, thirty pointer hops may be acceptable.

On disk, thirty random reads are expensive.

We need a search tree that keeps height much smaller by packing many keys into each node.

Solution

Use a B-tree.

A B-tree node stores multiple sorted keys.

Example:

[10 20 30 40]

Those keys divide the search space into multiple child ranges:

<10
10..20
20..30
30..40
>40

A node with m keys can have up to m + 1 children.

This high fanout makes the tree shallow.

A small B-tree might look like this:

              [30 60]
             /   |   \\n     [10 20]  [40 50]  [70 80 90]

Searching still follows ordered comparisons, but each node narrows the range much more aggressively than a binary node.

Node Structure

A conceptual Go representation:

type BTreeNode struct {
	Keys     []int
	Children []*BTreeNode
	Leaf     bool
}

For a map, each key usually has an associated value or pointer to a record.

type Entry struct {
	Key   int
	Value string
}

Production systems also store metadata such as page identifiers, record offsets, sibling links, checksums, and concurrency-control fields.

Searching Inside a Node

Within a node, keys are sorted.

[10 20 30 40]

To search for 25, locate the first key greater than or equal to 25.

10 < 25
20 < 25
30 > 25

The search moves to the child between 20 and 30.

This internal search can be linear for small nodes or binary for larger nodes.

func LowerBound(keys []int, target int) int {
	lo, hi := 0, len(keys)

	for lo < hi {
		mid := lo + (hi-lo)/2

		if keys[mid] >= target {
			hi = mid
		} else {
			lo = mid + 1
		}
	}

	return lo
}

B-tree search is therefore binary search inside nodes plus tree descent between nodes.

Search Algorithm

func Search(node *BTreeNode, key int) bool {
	i := LowerBound(node.Keys, key)

	if i < len(node.Keys) && node.Keys[i] == key {
		return true
	}

	if node.Leaf {
		return false
	}

	return Search(node.Children[i], key)
}

The index i identifies either the matching key or the child range where the key must reside.

This is the same boundary logic used throughout this chapter.

Why B-Trees Are Shallow

Suppose each node stores up to 255 keys.

Each internal node may have up to 256 children.

A height-4 tree can address roughly:

256⁴

leaf regions.

That is more than four billion leaf positions.

The exact capacity depends on page size, key size, pointer size, and fill factor, but the principle is consistent: high fanout reduces height.

For storage systems, reducing height means reducing random I/O.

Minimum Degree

Many textbook B-tree definitions use a minimum degree t.

A B-tree with minimum degree t has these properties:

  • Every node except the root has at least t - 1 keys.
  • Every node has at most 2t - 1 keys.
  • Every internal node with k keys has k + 1 children.
  • All leaves appear at the same depth.

The root is allowed to have fewer keys.

These rules keep the tree balanced and prevent nodes from becoming too empty or too full.

Insertion Overview

Insertion begins by descending to the leaf where the key belongs.

If the leaf has space, insert the key into sorted position.

Example:

Before:
[10 20 40]

Insert:
30

After:
[10 20 30 40]

If the node is full, split it.

Splitting a Full Node

Suppose a node can hold at most five keys.

[10 20 30 40 50]

Insert:

60

Temporary overflow:

[10 20 30 40 50 60]

Split around the median:

Left:   [10 20]
Middle: 30
Right:  [40 50 60]

Promote the middle key to the parent.

        [30]
       /    \\n [10 20]  [40 50 60]

This keeps nodes within size limits and preserves sorted order.

Root Splitting

If the root becomes full and splits, the tree height increases by one.

Example:

[10 20 30 40 50]

Split:

      [30]
     /    \\n[10 20] [40 50]

This is the only way a B-tree grows taller.

Because splitting happens upward and height increases only at the root, all leaves remain at the same depth.

Deletion Overview

Deletion is more involved than insertion.

When removing a key, the tree must maintain minimum occupancy.

If a node becomes too small, the algorithm may:

  • Borrow a key from a sibling
  • Merge with a sibling
  • Pull a separator key down from the parent

These repairs preserve the B-tree invariants.

Database implementations often use variants that delay merging or use different fill-factor policies to reduce write amplification.

Range Scans

B-trees are excellent for range queries.

Query:

WHERE created_at BETWEEN '09:00' AND '10:00'

The tree locates the lower bound for 09:00.

Then it scans forward through adjacent keys.

Many B-tree variants link leaf pages together so range scans can proceed sequentially.

[09:00 09:10] -> [09:20 09:30] -> [09:40 09:50]

Sequential reads are much cheaper than random reads in storage systems.

This is one reason B-trees dominate database indexing.

B-Tree Versus Binary Search Tree

Feature BST B-Tree
Keys per node 1 Many
Children per node 2 Many
Height Higher Much lower
Memory locality Poor to moderate Excellent per node
Disk friendliness Poor Excellent
Range scans Good Excellent
Implementation complexity Moderate Higher

A BST minimizes comparisons conceptually.

A B-tree minimizes expensive node accesses.

The correct structure depends on the hardware cost model.

B-Tree Versus Sorted Array

A sorted array gives excellent cache locality and fast binary search.

Insertion is expensive because shifting may require moving many values.

A B-tree keeps values sorted while localizing updates to a small number of nodes.

Operation Sorted Array B-Tree
Search O(log n) O(log n)
Insert O(n) O(log n)
Delete O(n) O(log n)
Range scan Excellent Excellent
Large dynamic data Poor Excellent

B-trees are preferable when the collection is large and frequently updated.

B+ Trees

Many databases use B+ trees rather than plain B-trees.

In a B+ tree:

  • Internal nodes store separator keys.
  • Actual records or record pointers live in leaves.
  • Leaf nodes are linked for fast sequential scans.

Conceptually:

          [30 60]
         /   |   \\n      leaf leaf leaf
       |    |    |
       v    v    v

This layout improves range-query performance and keeps internal nodes small.

A smaller internal node means higher fanout and lower tree height.

Storage Page Perspective

B-tree nodes often map directly to storage pages.

Example:

Page size: 4096 bytes
Key size: 16 bytes
Pointer size: 8 bytes

A single page can hold many keys and child pointers.

Searching one page reads many keys at once.

That is the central storage insight behind B-trees: use each expensive I/O operation to eliminate a large portion of the search space.

Complexity Analysis

Let n be the number of keys and b be the branching factor.

Tree height is:

O(log_b n)

Search cost:

O(log_b n)

node visits.

Inside each node, searching may cost:

O(log b)

comparisons with binary search.

Total comparison cost is still logarithmic, but practical performance is dominated by node visits and cache behavior.

Operation Complexity
Search O(log n)
Insert O(log n)
Delete O(log n)
Range scan O(log n + k)

where k is the number of returned keys.

Common Mistakes

A common mistake is treating B-trees as binary search trees with larger nodes. The cost model is different. B-trees are designed to reduce page reads and improve locality.

Another mistake is ignoring minimum occupancy. Without split, borrow, and merge rules, the tree can lose its height and space guarantees.

A third mistake is overlooking range scans. B-trees are not only point-lookup structures. Their ordered leaves make them powerful range-query engines.

Finally, many implementations confuse B-trees and B+ trees. In most database discussions, the practical structure is often closer to a B+ tree.

Takeaway

B-trees generalize binary search trees for storage and cache efficiency. By storing many keys per node and branching many ways, they keep height small and minimize expensive node accesses. Their support for ordered lookup, insertion, deletion, and range scanning makes them the dominant index structure in databases, filesystems, and large key-value stores. Where binary search trees optimize pointer-based memory navigation, B-trees optimize block-based access.