7.24 Search Structures in Databases
Most discussions of search algorithms focus on in-memory collections.
7.24 Search Structures in Databases
Most discussions of search algorithms focus on in-memory collections. Arrays, trees, heaps, and hash tables are useful abstractions, but real-world data often resides on storage devices, across networks, or in distributed systems.
Database search structures are designed around a different set of constraints:
- Disk and SSD access latency
- Cache efficiency
- Concurrency
- Durability
- Range scans
- Large datasets
As a result, database indexes often look different from textbook search structures, even though many of the underlying principles remain the same.
Understanding these structures provides valuable insight into how large-scale search systems operate.
Problem
Suppose a table contains:
1,000,000,000 rows
Query:
SELECT *
FROM orders
WHERE order_id = 123456
A linear scan requires examining every row.
Complexity:
O(n)
Even on modern hardware, this is prohibitively expensive.
The database needs a structure that can locate records quickly without reading the entire table.
Solution
Build an index.
Conceptually:
Key
↓
Location
Instead of searching table rows directly, the query first searches the index.
The index identifies where the matching record resides.
Most database performance depends on the quality of this search structure.
The Simplest Index
Suppose records are ordered by:
order_id
Example:
100
120
150
200
240
300
Binary search becomes possible.
O(log n)
However, maintaining a sorted array on disk is difficult because insertions require moving large amounts of data.
Databases therefore use more sophisticated structures.
Why B-Trees Dominate
The most common database index is the B-tree or B+ tree.
Example:
[150 240]
/ | \\n Page1 Page2 Page3
Each node stores many keys.
Each node typically corresponds to a storage page.
A single disk read retrieves many keys simultaneously.
This dramatically reduces tree height.
Instead of:
30
levels for a binary tree, a B-tree may require only:
3
or:
4
page accesses.
This difference is enormous when storage latency dominates execution time.
Clustered Indexes
Some databases organize table data directly according to the primary index.
Example:
CustomerID
ordered physically.
This is called a clustered index.
Conceptually:
Index
=
Table
The leaf pages contain actual rows.
Example:
1001 -> Alice
1002 -> Bob
1003 -> Charlie
Searching the index immediately finds the row.
No secondary lookup is required.
Secondary Indexes
A secondary index stores references to rows.
Example:
Email
index:
[email protected]
↓
Row Location
[email protected]
↓
Row Location
The index search identifies the row location.
A second lookup retrieves the actual row.
This process is often called:
Index Lookup
or:
Bookmark Lookup
depending on the database system.
Range Queries
One of the biggest advantages of ordered indexes is efficient range scanning.
Query:
SELECT *
FROM orders
WHERE order_date
BETWEEN '2025-01-01'
AND '2025-01-31'
The index performs:
LowerBound(start)
followed by:
Sequential Scan
until:
end
This is exactly the interval-search pattern discussed earlier.
B-trees excel at this workload.
Why Hash Indexes Are Different
Hash indexes use:
hash(key)
to locate records.
Example:
hash(123)
→ bucket 42
Lookup is extremely fast.
However:
WHERE id BETWEEN 100 AND 200
cannot use a hash index efficiently.
Hashes destroy ordering.
The database would need to examine many buckets.
For this reason:
Hash Index
is excellent for:
=
queries.
Poor for:
<
>
BETWEEN
ORDER BY
queries.
B-Tree versus Hash Index
| Feature | B-Tree | Hash |
|---|---|---|
| Equality search | Excellent | Excellent |
| Range search | Excellent | Poor |
| Ordered traversal | Yes | No |
| Prefix search | Yes | No |
| Min/Max lookup | Efficient | Inefficient |
| Sorting support | Yes | No |
This explains why B-trees remain the default choice in most relational databases.
Covering Indexes
Consider:
SELECT name
FROM users
WHERE email = ?
Index:
(email)
The database finds matching rows but must still access the table.
Now consider:
(email, name)
stored inside the index.
The query can be answered entirely from index pages.
No table lookup is needed.
This is called a:
Covering Index
Covering indexes reduce I/O substantially.
Composite Indexes
Indexes may contain multiple columns.
Example:
(country, city)
Sorted order:
USA, Austin
USA, Boston
USA, Seattle
Canada, Calgary
Canada, Toronto
Queries using:
country
benefit.
Queries using:
country + city
benefit.
Queries using:
city
alone often do not.
This is known as the:
Leftmost Prefix Rule
Understanding index ordering is critical for query optimization.
Index Selectivity
Not all indexes are useful.
Consider:
Gender
with values:
Male
Female
An index provides little filtering power.
The database may decide that a table scan is cheaper.
By contrast:
Email Address
is highly selective.
Most searches return one row.
Highly selective columns generally produce more valuable indexes.
Index Cardinality
Cardinality refers to the number of distinct values.
Examples:
UserID
High cardinality.
Country
Lower cardinality.
Boolean Flag
Very low cardinality.
Higher cardinality often leads to better index performance because fewer rows match a given value.
Full Table Scans
Indexes are not always beneficial.
Suppose:
SELECT *
FROM orders
The query needs every row.
An index adds unnecessary work.
The optimizer chooses:
Table Scan
instead.
This surprises many developers.
Indexes help when they eliminate work.
If most rows are needed anyway, scanning may be faster.
Query Optimizers
Modern databases contain query optimizers.
The optimizer estimates:
Rows Returned
Index Selectivity
I/O Cost
CPU Cost
and chooses an execution plan.
Possible plans:
Index Seek
Index Scan
Table Scan
Hash Join
Merge Join
Nested Loop
The optimizer is essentially a search algorithm operating on possible execution strategies.
Bitmap Indexes
Analytical databases sometimes use bitmap indexes.
Example:
Country = USA
Bit vector:
101101001...
Country = Canada
Bit vector:
010010110...
Logical operations:
AND
OR
NOT
combine filters efficiently.
Bitmap indexes work particularly well for low-cardinality columns and analytical workloads.
LSM Trees
Many modern databases use Log-Structured Merge Trees.
Examples include:
- entity["software","RocksDB","embedded key-value store"]
- entity["software","LevelDB","embedded key-value store"]
- entity["software","Apache Cassandra","distributed database"]
Instead of updating B-tree pages directly:
- Write to memory.
- Flush sorted files.
- Merge files later.
Searches may consult:
Memory
Level 0
Level 1
Level 2
...
This design optimizes write throughput.
The trade-off is more complex search logic.
Search in LSM Trees
Lookup process:
Check memory table
Check newest file
Check older files
Continue until found
Bloom filters often accelerate this process by ruling out files that cannot contain the key.
Without Bloom filters, many unnecessary searches would occur.
Bloom Filters in Databases
A Bloom filter answers:
Definitely not present
or:
Possibly present
Before opening a storage file, the database checks its Bloom filter.
If the filter says:
No
the file is skipped entirely.
This saves substantial I/O.
Bloom filters are a common companion to LSM-tree systems.
Search Structures in Distributed Databases
Distributed systems add another layer.
Example:
Hash(key)
determines:
Shard
Then:
Local Index Search
occurs within that shard.
Search becomes:
Locate machine
Locate partition
Locate record
The principles remain familiar, but the search space now spans multiple servers.
Search Cost Hierarchy
Not all operations have equal cost.
Typical order:
CPU cache access
↓
RAM access
↓
SSD access
↓
Network access
Database search structures are designed to minimize the most expensive operations.
This explains many design decisions that appear unusual from a purely algorithmic perspective.
Why Search Structures Differ from Textbooks
Textbook search structures often optimize:
Comparison count
Databases optimize:
I/O operations
Cache misses
Network round trips
A theoretically slower algorithm may outperform a theoretically faster one if it reduces storage access.
Practical performance depends heavily on the hardware environment.
Real-World Examples
Common search structures:
| System | Primary Search Structure |
|---|---|
| PostgreSQL | B+ Tree |
| MySQL InnoDB | B+ Tree |
| SQLite | B-Tree |
| Oracle Database | B-Tree |
| SQL Server | B+ Tree |
| RocksDB | LSM Tree |
| LevelDB | LSM Tree |
| Cassandra | LSM Tree |
| Redis Sorted Sets | Skip List |
| Elasticsearch | Inverted Index |
Different workloads favor different search structures.
Complexity Analysis
For B-tree indexes:
| Operation | Complexity |
|---|---|
| Lookup | O(log n) |
| Insert | O(log n) |
| Delete | O(log n) |
| Range Scan | O(log n + k) |
For hash indexes:
| Operation | Complexity |
|---|---|
| Equality Search | O(1) average |
| Range Search | O(n) |
For LSM trees:
| Operation | Complexity |
|---|---|
| Lookup | O(log n) expected |
| Insert | Very efficient |
| Range Scan | Good |
Actual performance depends more on I/O patterns than asymptotic complexity.
Common Mistakes
A common mistake is assuming every query should use an index. For large result sets, table scans may be faster.
Another mistake is creating indexes on low-cardinality columns that provide little filtering power.
A third mistake is choosing hash indexes when range queries are important.
Finally, many developers focus exclusively on algorithmic complexity while ignoring storage behavior. In database systems, reducing I/O often matters more than reducing comparisons.
Takeaway
Database search structures adapt classical search techniques to the realities of storage systems. B-trees dominate transactional databases because they combine logarithmic search with efficient range scans and excellent page utilization. Hash indexes optimize exact lookups. LSM trees optimize write-heavy workloads. Bloom filters eliminate unnecessary searches, while query optimizers choose among competing access paths. Although the structures differ from textbook examples, the underlying principles remain familiar: partition the search space, maintain useful summaries, and eliminate as much work as possible before touching the underlying data.