Array tiling divides a multidimensional array into small rectangular regions called tiles. Each tile is processed before moving to the next tile.
You use it when matrix or grid operations touch nearby elements repeatedly and cache locality affects performance.
Problem
Given a matrix with rows and columns, process all elements in tiles of size:
where is the tile height and is the tile width.
Algorithm
Scan a matrix by tiles.
tile_scan(A, rows, cols, tile_rows, tile_cols):
if tile_rows <= 0 or tile_cols <= 0:
error "tile size must be positive"
for row_start from 0 to rows - 1 step tile_rows:
row_end = min(row_start + tile_rows, rows)
for col_start from 0 to cols - 1 step tile_cols:
col_end = min(col_start + tile_cols, cols)
for i from row_start to row_end - 1:
for j from col_start to col_end - 1:
process A[i][j]Example
Let a matrix have rows and columns. Use tiles of size .
| tile | row range | column range |
|---|---|---|
| 1 | ||
| 2 | ||
| 3 | ||
| 4 |
Each tile is processed as a small local region.
Correctness
The outer row loop partitions the row range into non-overlapping intervals. The outer column loop partitions the column range in the same way. Their Cartesian product forms non-overlapping tiles that cover the whole matrix.
Inside each tile, the nested loops visit every element in that tile exactly once. Therefore, every matrix element is processed exactly once.
Complexity
| operation | time |
|---|---|
| tile scan |
Space usage:
for direct tile scanning.
If an algorithm copies each tile into a temporary buffer, extra space is:
When to Use
Array tiling is appropriate when:
- processing matrices or grids
- repeated access to nearby cells matters
- cache locality affects runtime
- tiles can be processed independently
It is less suitable when:
- the array is small
- each element is touched only once
- access patterns are already contiguous and cache friendly
Implementation
def tile_scan(matrix, tile_rows, tile_cols, process):
if tile_rows <= 0 or tile_cols <= 0:
raise ValueError("tile size must be positive")
rows = len(matrix)
cols = len(matrix[0]) if rows else 0
for row_start in range(0, rows, tile_rows):
row_end = min(row_start + tile_rows, rows)
for col_start in range(0, cols, tile_cols):
col_end = min(col_start + tile_cols, cols)
for i in range(row_start, row_end):
for j in range(col_start, col_end):
process(matrix[i][j])func TileScan[T any](matrix [][]T, tileRows, tileCols int, process func(T)) bool {
if tileRows <= 0 || tileCols <= 0 {
return false
}
rows := len(matrix)
if rows == 0 {
return true
}
cols := len(matrix[0])
for rowStart := 0; rowStart < rows; rowStart += tileRows {
rowEnd := rowStart + tileRows
if rowEnd > rows {
rowEnd = rows
}
for colStart := 0; colStart < cols; colStart += tileCols {
colEnd := colStart + tileCols
if colEnd > cols {
colEnd = cols
}
for i := rowStart; i < rowEnd; i++ {
for j := colStart; j < colEnd; j++ {
process(matrix[i][j])
}
}
}
}
return true
}