7.22 Sparse Tables
Many range-query structures support updates.
7.22 Sparse Tables
Many range-query structures support updates. Segment trees allow values to change. Fenwick trees support dynamic sums. Balanced trees continuously adjust their structure.
Sometimes updates never occur.
The data is loaded once, queried thousands or millions of times, and never modified.
Examples include:
- Historical datasets
- Compiled lookup tables
- Immutable logs
- Static geographic data
- Read-only analytics snapshots
- Competitive programming query sets
When updates disappear from the problem, we can precompute far more information than a dynamic structure would allow.
Sparse tables exploit this fact.
They provide extremely fast range queries on immutable data by precomputing answers for intervals whose lengths are powers of two.
Problem
Given an array:
[7, 2, 5, 3, 6, 1, 8]
support queries such as:
Minimum from index 1 to index 5
Maximum from index 2 to index 6
GCD from index 0 to index 4
The array never changes.
A linear scan requires:
O(n)
per query.
For millions of queries, this becomes expensive.
We need a structure optimized specifically for static data.
Solution
Precompute answers for ranges of length:
1
2
4
8
16
...
These intervals can be combined to answer larger queries efficiently.
For minimum queries, the combination operation is:
min(a, b)
Because minimum is associative and idempotent, sparse tables can answer queries in constant time.
This is one of the fastest range-query techniques available for immutable data.
Core Idea
Suppose:
nums = [7, 2, 5, 3, 6, 1, 8]
Store:
Length 1 intervals
Length 2 intervals
Length 4 intervals
Length 8 intervals
...
Example:
Length 1
[7]
[2]
[5]
[3]
[6]
[1]
[8]
Length 2:
min(7,2) = 2
min(2,5) = 2
min(5,3) = 3
min(3,6) = 3
min(6,1) = 1
min(1,8) = 1
Length 4:
min(2,3) = 2
min(2,3) = 2
min(3,1) = 1
min(1,1) = 1
The table stores answers for increasingly larger intervals.
Sparse Table Structure
A typical implementation:
type SparseTable struct {
table [][]int
log []int
}
The table dimensions:
Level 0 -> length 1
Level 1 -> length 2
Level 2 -> length 4
Level 3 -> length 8
...
Each level doubles the interval size.
Precomputing Logarithms
Sparse-table queries need:
floor(log2(length))
Precompute:
func BuildLogs(n int) []int {
log := make([]int, n+1)
for i := 2; i <= n; i++ {
log[i] = log[i/2] + 1
}
return log
}
Example:
log[1] = 0
log[2] = 1
log[4] = 2
log[8] = 3
This allows constant-time level selection during queries.
Building the Table
Level zero stores the original values.
func NewSparseTable(
nums []int,
) *SparseTable {
n := len(nums)
log := BuildLogs(n)
k := log[n] + 1
table := make([][]int, k)
for i := range table {
table[i] = make([]int, n)
}
copy(table[0], nums)
Now compute larger intervals.
for level := 1;
level < k;
level++ {
length := 1 << level
for i := 0;
i+length <= n;
i++ {
table[level][i] =
min(
table[level-1][i],
table[level-1][
i+(length>>1),
],
)
}
}
return &SparseTable{
table: table,
log: log,
}
}
Each interval is built from two half-sized intervals.
Understanding Construction
Length:
4
interval:
[i, i+3]
splits into:
[i, i+1]
and:
[i+2, i+3]
The precomputed answers for those halves already exist.
Therefore:
min(length 4)
=
min(
left half,
right half
)
Construction proceeds level by level.
Example Table
Array:
[7, 2, 5, 3]
Level 0:
7 2 5 3
Level 1:
2 2 3
Level 2:
2
Visualized:
Level 0 (1):
7 2 5 3
Level 1 (2):
2 2 3
Level 2 (4):
2
The structure stores answers for every power-of-two interval.
Querying
Suppose:
Query:
[1,5]
Length:
5
Largest power of two:
4
Use:
k = floor(log2(5))
which equals:
2
Two intervals of length four cover the query:
[1,4]
and:
[2,5]
Answer:
min(
table[2][1],
table[2][2]
)
This works because minimum is idempotent.
Overlapping regions do not matter.
Query Implementation
func (st *SparseTable) Query(
left int,
right int,
) int {
length := right - left + 1
k := st.log[length]
return min(
st.table[k][left],
st.table[k][
right-(1<<k)+1,
],
)
}
No recursion.
No tree traversal.
No loops.
Just a few array accesses.
Why Queries Are O(1)
Consider:
Length = 13
Largest power of two:
8
Take:
First interval:
8 elements
and:
Last interval:
8 elements
These two intervals cover the query.
The minimum of those precomputed answers is the result.
Regardless of array size:
2 lookups
1 comparison
Constant time.
Range Maximum Query
Replace:
min(a, b)
with:
max(a, b)
Construction remains identical.
Queries remain:
O(1)
This makes sparse tables ideal for range maximum queries as well.
Range GCD Query
The greatest common divisor is associative and idempotent.
Replace:
min(a, b)
with:
gcd(a, b)
Queries remain:
O(1)
This capability appears frequently in number-theory problems.
Why Sum Queries Are Different
Minimum works because:
min(x, x) = x
Duplicate coverage is harmless.
Sum does not have this property.
sum(x, x) = 2x
Overlapping intervals would double-count.
For sums, sparse tables typically answer queries in:
O(log n)
by decomposing the range into non-overlapping power-of-two intervals.
They are still useful, but they lose their constant-time advantage.
Sparse Tables Versus Segment Trees
| Feature | Sparse Table | Segment Tree |
|---|---|---|
| Updates | No | Yes |
| Build | O(n log n) | O(n) |
| Query (RMQ) | O(1) | O(log n) |
| Query (sum) | O(log n) | O(log n) |
| Space | O(n log n) | O(n) |
Sparse tables sacrifice updates for faster queries.
Segment trees remain dynamic.
Sparse Tables Versus Prefix Sums
| Feature | Prefix Sum | Sparse Table |
|---|---|---|
| Range Sum | O(1) | O(log n) |
| Range Min | Poor | O(1) |
| Updates | Poor | None |
| Space | O(n) | O(n log n) |
Prefix sums specialize in sums.
Sparse tables specialize in immutable associative queries.
Lowest Common Ancestor
One of the most important applications of sparse tables is the Lowest Common Ancestor (LCA) problem.
After an Euler tour of a tree, LCA reduces to a range minimum query.
Sparse tables then provide:
O(1)
LCA queries after preprocessing.
This technique appears frequently in graph algorithms.
Real-World Applications
Sparse tables are useful when:
- Data is immutable
- Queries greatly outnumber construction costs
- Fast minimum or maximum queries matter
Examples include:
- Geographic elevation datasets
- Read-only analytics cubes
- Compiled dictionaries
- Scientific simulation outputs
- Competitive programming problems
The structure is less common in continuously updated systems because updates require rebuilding.
Complexity Analysis
For:
n elements
Build:
O(n log n)
Space:
O(n log n)
Range minimum query:
O(1)
Range maximum query:
O(1)
Range GCD query:
O(1)
The build cost is higher than a segment tree, but query speed is optimal.
Common Mistakes
A common mistake is using sparse tables for mutable data. Any update requires rebuilding affected entries, making the structure unsuitable for dynamic workloads.
Another mistake is assuming all operations support constant-time queries. The operation must be associative and typically idempotent.
A third mistake is computing logarithms repeatedly during queries. Precomputed logarithm tables are essential.
Finally, many implementations allocate too little storage. Sparse tables require approximately:
n log₂(n)
entries, which can become significant for large datasets.
Takeaway
Sparse tables trade update capability for extremely fast query performance. By precomputing answers for power-of-two intervals, they answer range minimum, maximum, and GCD queries in constant time after an O(n log n) preprocessing phase. When data is immutable and query volume is high, sparse tables are often the fastest practical solution for range-query problems. Their design illustrates a common algorithmic theme: when updates disappear, aggressive precomputation can transform logarithmic queries into constant-time lookups.