Array blocking divides an array into contiguous chunks and processes one chunk at a time. This can improve cache locality because nearby elements are reused before moving to the next block.
You use it when large arrays exceed cache capacity or when work should be processed in batches.
Problem
Given an array of length and a block size , process the array in ranges:
until all elements are processed.
Algorithm
Scan the array by blocks.
block_scan(A, b):
if b <= 0:
error "block size must be positive"
n = length(A)
for start from 0 to n - 1 step b:
end = min(start + b, n)
for i from start to end - 1:
process A[i]Example
Let
and
| block | range | values |
|---|---|---|
| 1 | [8, 3, 5] | |
| 2 | [1, 9, 4] | |
| 3 | [2] |
Each block is processed before moving to the next one.
Correctness
The outer loop chooses block starts at multiples of . For each start, the end index is capped at , so the last block handles the remaining elements safely.
The inner loop visits every index in the current block. Since the blocks are contiguous, non-overlapping, and cover the range , every element is processed exactly once.
Complexity
| operation | time |
|---|---|
| block scan |
Space usage:
for simple block scanning.
When to Use
Array blocking is appropriate when:
- arrays are large
- cache locality matters
- processing can be batched
- each block can be handled independently
It is less suitable when:
- the algorithm needs global random access at each step
- the best block size is hard to estimate
- overhead from nested loops dominates small inputs
Implementation
def block_scan(a, block_size, process):
if block_size <= 0:
raise ValueError("block size must be positive")
n = len(a)
for start in range(0, n, block_size):
end = min(start + block_size, n)
for i in range(start, end):
process(a[i])func BlockScan[T any](a []T, blockSize int, process func(T)) bool {
if blockSize <= 0 {
return false
}
for start := 0; start < len(a); start += blockSize {
end := start + blockSize
if end > len(a) {
end = len(a)
}
for i := start; i < end; i++ {
process(a[i])
}
}
return true
}