# Chapter 95. Sparse Matrices

A sparse matrix is a matrix in which most entries are zero. Sparse matrices occur when a system has many variables but only a small number of direct interactions between them.

For example, a finite difference model on a grid usually couples each grid point only to its nearest neighbors. A graph adjacency matrix usually stores only existing edges. A finite element stiffness matrix records local interactions between basis functions. In all of these cases, storing every zero entry wastes memory and computation.

Sparse linear algebra studies how to store, multiply, factor, reorder, and solve with such matrices efficiently. Specialized sparse representations store only nonzero entries, and sparse algorithms are designed to avoid unnecessary arithmetic with zeros.

## 95.1 Dense and Sparse Matrices

A dense matrix is stored as a full rectangular array. An \(m\times n\) dense matrix stores

$$
mn
$$

entries.

A sparse matrix stores mainly its nonzero entries. If a matrix has

$$
\operatorname{nnz}(A)
$$

nonzero entries, then sparse storage aims to use memory proportional to

$$
\operatorname{nnz}(A),
$$

plus indexing overhead.

The symbol

$$
\operatorname{nnz}(A)
$$

means the number of nonzero entries of \(A\).

It is useful to treat a matrix as sparse when

$$
\operatorname{nnz}(A) \ll mn.
$$

The exact boundary is practical, not purely mathematical. It depends on matrix size, storage format, hardware, and the operations to be performed.

## 95.2 Examples

A diagonal matrix is sparse:

$$
D=
\begin{bmatrix}
d_1 & 0 & 0 & \cdots & 0\\
0 & d_2 & 0 & \cdots & 0\\
0 & 0 & d_3 & \cdots & 0\\
\vdots & \vdots & \vdots & \ddots & \vdots\\
0 & 0 & 0 & \cdots & d_n
\end{bmatrix}.
$$

It has only \(n\) nonzero entries out of \(n^2\).

A tridiagonal matrix is also sparse:

$$
T=
\begin{bmatrix}
d_1 & u_1 & 0 & \cdots & 0\\
\ell_1 & d_2 & u_2 & \cdots & 0\\
0 & \ell_2 & d_3 & \cdots & 0\\
\vdots & \vdots & \vdots & \ddots & u_{n-1}\\
0 & 0 & 0 & \ell_{n-1} & d_n
\end{bmatrix}.
$$

It has at most

$$
3n-2
$$

nonzero entries.

For large \(n\), this is far smaller than \(n^2\).

## 95.3 Sparsity Pattern

The sparsity pattern of a matrix records which entries are structurally nonzero.

For a matrix \(A\), the sparsity pattern is the set

$$
\mathcal{S}(A) =
\{(i,j): a_{ij}\ne 0\}.
$$

The numerical values may change during a computation while the pattern stays fixed.

This distinction is important. Many sparse algorithms first analyze the pattern, then perform numerical computation. For example, sparse direct solvers often separate symbolic factorization from numerical factorization.

| Phase | Purpose |
|---|---|
| Symbolic phase | analyze sparsity, ordering, fill-in, memory |
| Numerical phase | compute actual numerical factors |

The symbolic phase depends mostly on the zero-nonzero structure. The numerical phase depends on the values.

## 95.4 Density and Sparsity

The density of a matrix is

$$
\frac{\operatorname{nnz}(A)}{mn}.
$$

The sparsity is often described as

$$
1-\frac{\operatorname{nnz}(A)}{mn}.
$$

For example, if a \(100000\times 100000\) matrix has \(500000\) nonzero entries, then its density is

$$
\frac{500000}{10^{10}}=5\times 10^{-5}.
$$

This matrix is extremely sparse.

A dense storage format would store \(10^{10}\) entries. Sparse storage would store roughly \(500000\) values plus indices.

## 95.5 Why Sparse Matrices Matter

Sparse matrices matter because dense methods often become impossible for large problems.

A dense \(n\times n\) matrix needs memory proportional to

$$
n^2.
$$

Dense factorization usually costs proportional to

$$
n^3.
$$

For \(n=10^6\), dense storage is not practical.

Sparse methods exploit structure. If each row has only a fixed number of nonzero entries, then

$$
\operatorname{nnz}(A)=O(n).
$$

A matrix-vector product may then cost

$$
O(n)
$$

instead of

$$
O(n^2).
$$

This difference is decisive in large-scale computation.

## 95.6 Common Sources of Sparse Matrices

Sparse matrices arise naturally in many areas.

| Source | Matrix |
|---|---|
| Graph theory | adjacency matrix, Laplacian matrix |
| Finite differences | discretized differential operators |
| Finite elements | stiffness and mass matrices |
| Optimization | sparse Hessians and Jacobians |
| Machine learning | sparse feature matrices |
| Networks | incidence and transition matrices |
| Recommender systems | user-item interaction matrices |
| Text analysis | document-term matrices |

The common feature is locality or limited interaction.

Most variables interact with only a few other variables.

## 95.7 Graph Interpretation

A sparse matrix can be interpreted as a graph.

For a square matrix \(A\), define a directed graph with vertices

$$
1,2,\ldots,n.
$$

There is an edge

$$
i\to j
$$

when

$$
a_{ij}\ne 0.
$$

For a symmetric matrix, the graph is undirected because

$$
a_{ij}\ne 0
$$

implies

$$
a_{ji}\ne 0.
$$

This graph viewpoint is useful for reordering, fill-in analysis, graph Laplacians, and sparse factorization.

## 95.8 Coordinate Format

The coordinate format, or COO format, stores a sparse matrix as triples:

$$
(i,j,a_{ij}).
$$

For example,

$$
A=
\begin{bmatrix}
10 & 0 & 0\\
0 & 0 & 5\\
2 & 0 & 3
\end{bmatrix}
$$

can be stored as:

| Row | Column | Value |
|---:|---:|---:|
| 1 | 1 | 10 |
| 2 | 3 | 5 |
| 3 | 1 | 2 |
| 3 | 3 | 3 |

COO is simple and convenient for construction.

It is less efficient for repeated arithmetic because entries must often be sorted or compressed before fast operations.
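As a concrete illustration, the matrix above can be built in COO form with SciPy (a sketch assuming SciPy is available; `scipy.sparse` uses zero-based indices, so the rows and columns below are shifted down by one relative to the table):

```python
import numpy as np
from scipy.sparse import coo_matrix

# The 3x3 example above, as (row, column, value) triples, zero-based.
rows = np.array([0, 1, 2, 2])
cols = np.array([0, 2, 0, 2])
vals = np.array([10.0, 5.0, 2.0, 3.0])

A = coo_matrix((vals, (rows, cols)), shape=(3, 3))
print(A.toarray())
# [[10.  0.  0.]
#  [ 0.  0.  5.]
#  [ 2.  0.  3.]]
```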

## 95.9 Compressed Sparse Row Format

Compressed sparse row format, or CSR, stores a sparse matrix row by row using three arrays:

| Array | Meaning |
|---|---|
| `values` | nonzero values |
| `col_index` | column index for each value |
| `row_ptr` | starting position of each row |

CSR is one of the most common sparse formats. It supports fast row access and efficient matrix-vector multiplication.

For the matrix

$$
A=
\begin{bmatrix}
10 & 0 & 20 & 0\\
0 & 30 & 0 & 40\\
0 & 0 & 50 & 0
\end{bmatrix},
$$

CSR storage is:

$$
\text{values}=[10,20,30,40,50],
$$

$$
\text{col\_index}=[1,3,2,4,3],
$$

$$
\text{row\_ptr}=[1,3,5,6].
$$

Using zero-based indexing, the same data would be

$$
\text{col\_index}=[0,2,1,3,2],
$$

$$
\text{row\_ptr}=[0,2,4,5].
$$

The row pointer tells where each row begins and ends in the values array.
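SciPy stores CSR with zero-based indices under the attribute names `data`, `indices`, and `indptr`, corresponding to `values`, `col_index`, and `row_ptr`. A sketch for the matrix above, assuming SciPy is available:

```python
import numpy as np
from scipy.sparse import csr_matrix

A = csr_matrix(np.array([[10, 0, 20, 0],
                         [0, 30, 0, 40],
                         [0, 0, 50, 0]], dtype=float))

print(A.data)     # values:    [10. 20. 30. 40. 50.]
print(A.indices)  # col_index: [0 2 1 3 2]   (zero-based)
print(A.indptr)   # row_ptr:   [0 2 4 5]
```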

## 95.10 Reading a CSR Row

In zero-based indexing, row \(i\) occupies entries

$$
\text{row\_ptr}[i],
\text{row\_ptr}[i]+1,
\ldots,
\text{row\_ptr}[i+1]-1.
$$

Thus the nonzeros in row \(i\) are stored in the slice

$$
\text{values}[\text{row\_ptr}[i] : \text{row\_ptr}[i+1]].
$$

The corresponding columns are stored in

$$
\text{col\_index}[\text{row\_ptr}[i] : \text{row\_ptr}[i+1]].
$$

This compact row access is why CSR is well suited to matrix-vector multiplication.
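A minimal sketch of this row access in plain Python, using the zero-based CSR arrays from the previous section:

```python
def csr_row(values, col_index, row_ptr, i):
    """Return the (column, value) pairs stored for row i of a CSR matrix."""
    start, end = row_ptr[i], row_ptr[i + 1]
    return list(zip(col_index[start:end], values[start:end]))

# CSR data for the 3x4 example above (zero-based indexing)
values    = [10.0, 20.0, 30.0, 40.0, 50.0]
col_index = [0, 2, 1, 3, 2]
row_ptr   = [0, 2, 4, 5]

print(csr_row(values, col_index, row_ptr, 1))  # [(1, 30.0), (3, 40.0)]
```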

## 95.11 Compressed Sparse Column Format

Compressed sparse column format, or CSC, is the column-oriented analogue of CSR.

It stores:

| Array | Meaning |
|---|---|
| `values` | nonzero values |
| `row_index` | row index for each value |
| `col_ptr` | starting position of each column |

CSC is efficient for column access.

It is commonly used in sparse direct solvers because column-oriented operations are natural in many factorizations. MATLAB's sparse format is traditionally column-oriented.

CSR and CSC store the same mathematical matrix, but they favor different operations.

| Format | Efficient access |
|---|---|
| CSR | rows |
| CSC | columns |

## 95.12 Other Sparse Formats

Several formats are used for different purposes.

| Format | Use |
|---|---|
| COO | easy construction from triples |
| CSR | row operations and matrix-vector products |
| CSC | column operations and direct solvers |
| DOK | incremental insertion by key |
| LIL | incremental row-wise construction |
| DIA | diagonal or banded matrices |
| ELL | regular row lengths, useful on some parallel hardware |
| BSR | block sparse matrices |

Formats such as DOK and LIL are convenient for building sparse matrices, while CSR and CSC are usually better for arithmetic.

## 95.13 Sparse Matrix-Vector Multiplication

The most important sparse operation is

$$
y=Ax.
$$

In CSR format, this is computed row by row:

```python
def spmv_csr(values, col_index, row_ptr, x):
    """Sparse matrix-vector product y = A x, with A stored in CSR (zero-based)."""
    m = len(row_ptr) - 1          # number of rows
    y = [0.0] * m

    for i in range(m):
        s = 0.0
        # entries of row i occupy positions row_ptr[i] .. row_ptr[i + 1] - 1
        for p in range(row_ptr[i], row_ptr[i + 1]):
            j = col_index[p]      # column of this stored entry
            s += values[p] * x[j]
        y[i] = s

    return y
```

The cost is proportional to the number of stored nonzeros:

$$
O(\operatorname{nnz}(A)).
$$

This is the core operation in many iterative methods, including conjugate gradient, GMRES, Lanczos iteration, and power iteration.

## 95.14 Sparse Matrix Addition

To compute

$$
C=A+B,
$$

the nonzero patterns of \(A\) and \(B\) must be merged.

If both matrices are stored row-wise, then addition can be performed row by row.

For each row, merge the column indices from both matrices. When the same column occurs in both rows, add the values. When an entry appears in only one matrix, copy it.

The resulting pattern is contained in

$$
\mathcal{S}(A)\cup \mathcal{S}(B).
$$

Numerical cancellation may produce entries that are exactly zero; these are often removed explicitly so the stored pattern stays tight.
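A minimal sketch of the row merge, assuming each row stores its column indices in increasing order:

```python
def add_sparse_rows(cols_a, vals_a, cols_b, vals_b):
    """Merge one row of A and one row of B, each given as sorted column/value lists."""
    cols, vals = [], []
    ia, ib = 0, 0
    while ia < len(cols_a) and ib < len(cols_b):
        if cols_a[ia] < cols_b[ib]:              # entry only in A
            cols.append(cols_a[ia])
            vals.append(vals_a[ia])
            ia += 1
        elif cols_b[ib] < cols_a[ia]:            # entry only in B
            cols.append(cols_b[ib])
            vals.append(vals_b[ib])
            ib += 1
        else:                                    # same column in both: add the values
            s = vals_a[ia] + vals_b[ib]
            if s != 0.0:                         # drop entries cancelled to exact zero
                cols.append(cols_a[ia])
                vals.append(s)
            ia += 1
            ib += 1
    cols += cols_a[ia:] + cols_b[ib:]            # copy whatever remains in either row
    vals += vals_a[ia:] + vals_b[ib:]
    return cols, vals

# Rows with entries at columns {0, 3} and {3, 5}; the column-3 entries cancel
print(add_sparse_rows([0, 3], [1.0, 2.0], [3, 5], [-2.0, 4.0]))
# ([0, 5], [1.0, 4.0])
```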

## 95.15 Sparse Matrix-Matrix Multiplication

Sparse matrix-matrix multiplication computes

$$
C=AB.
$$

The sparsity pattern of \(C\) may be much larger than the patterns of \(A\) and \(B\).

Even if \(A\) and \(B\) are sparse, \(C\) may be dense.

This is one reason sparse matrix multiplication is more complicated than sparse matrix-vector multiplication.

The pattern is determined by paths of length two in the associated graph:

$$
c_{ij}\ne 0
$$

may occur when there exists \(k\) such that

$$
a_{ik}\ne 0
\qquad
\text{and}
\qquad
b_{kj}\ne 0.
$$

Efficient sparse multiplication must handle both symbolic pattern construction and numerical accumulation.
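A row-wise (Gustavson-style) sketch of this idea, with each sparse row represented as a dictionary from column index to value; this illustrates the accumulation pattern rather than an optimized implementation:

```python
def spgemm_rows(A_rows, B_rows):
    """Multiply sparse matrices given as lists of {column: value} dictionaries per row."""
    C_rows = []
    for a_row in A_rows:
        acc = {}                               # accumulator for one row of C
        for k, a_ik in a_row.items():          # a_ik != 0
            for j, b_kj in B_rows[k].items():  # b_kj != 0
                acc[j] = acc.get(j, 0.0) + a_ik * b_kj
        C_rows.append(acc)
    return C_rows

A_rows = [{0: 1.0, 2: 2.0}, {1: 3.0}]          # 2x3 matrix
B_rows = [{1: 4.0}, {0: 5.0}, {2: 6.0}]        # 3x3 matrix
print(spgemm_rows(A_rows, B_rows))
# [{1: 4.0, 2: 12.0}, {0: 15.0}]
```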

## 95.16 Fill-In

Fill-in refers to new nonzero entries that appear during factorization.

For example, a zero entry of \(A\) may become nonzero in an LU or Cholesky factor.

Sparse direct methods try to reduce fill-in because fill-in increases both memory and arithmetic cost. Reordering rows and columns is a common way to reduce fill-in.

Fill-in explains why a sparse matrix may still be difficult to factor.

The original matrix may be sparse, but its factors may be much denser.
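A small illustration of fill-in uses an "arrow" matrix whose first row and first column are fully nonzero. The sketch below, assuming NumPy and SciPy are available, factors it densely just to count nonzeros in the factors:

```python
import numpy as np
from scipy.linalg import lu

n = 8
A = np.eye(n) * 4.0
A[0, :] = 1.0               # dense first row
A[:, 0] = 1.0               # dense first column
A[0, 0] = 4.0               # "arrow" matrix: sparse, apex in the first position

P, L, U = lu(A)
print(np.count_nonzero(A))                        # 3n - 2 = 22 nonzeros in A
print(np.count_nonzero(L) + np.count_nonzero(U))  # factors are much denser than A
```

Reordering so that the dense row and column come last avoids this fill-in entirely, which is one reason ordering matters so much.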

## 95.17 Sparse LU Factorization

Sparse LU factorization writes

$$
PAQ=LU
$$

or

$$
PA=LU,
$$

depending on whether both row and column permutations are used.

Here:

| Symbol | Meaning |
|---|---|
| \(P\) | row permutation |
| \(Q\) | column permutation |
| \(L\) | lower triangular factor |
| \(U\) | upper triangular factor |

Sparse LU must handle two competing goals:

| Goal | Reason |
|---|---|
| numerical stability | requires pivoting |
| sparsity preservation | requires good ordering |

Pivoting may introduce fill-in. Avoiding fill-in too aggressively may harm stability.

## 95.18 Sparse Cholesky Factorization

If \(A\) is symmetric positive definite, sparse Cholesky factorization writes

$$
A=LL^T.
$$

This is often more efficient and stable than sparse LU because pivoting is usually not required for positive definite matrices.

Sparse Cholesky is common in finite element methods, Gaussian Markov random fields, optimization, and graph-based computations.

Its efficiency depends strongly on ordering.

A poor ordering can produce substantial fill-in.

## 95.19 Ordering and Permutation

A permutation reorders rows and columns.

For symmetric matrices, one often forms

$$
PAP^T.
$$

This changes the numbering of variables but not the underlying mathematical problem.

The purpose is to reduce fill-in and improve locality.

Common ordering strategies include:

| Ordering | Idea |
|---|---|
| Reverse Cuthill-McKee | reduce bandwidth |
| Minimum degree | eliminate low-degree vertices first |
| Approximate minimum degree | efficient minimum-degree heuristic |
| Nested dissection | split graph by separators |

Ordering is a major part of sparse direct solver performance.
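SciPy exposes a reverse Cuthill-McKee ordering through `scipy.sparse.csgraph`; a sketch on a small symmetric pattern, assuming SciPy is available:

```python
import numpy as np
from scipy.sparse import csr_matrix
from scipy.sparse.csgraph import reverse_cuthill_mckee

A = csr_matrix(np.array([[4, 0, 0, 1],
                         [0, 4, 1, 0],
                         [0, 1, 4, 0],
                         [1, 0, 0, 4]], dtype=float))

perm = reverse_cuthill_mckee(A, symmetric_mode=True)
B = A[perm, :][:, perm]        # form P A P^T by permuting rows and columns
print(perm)                    # new ordering of the vertices
print(B.toarray())             # nonzeros pulled toward the diagonal
```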

## 95.20 Bandwidth

The bandwidth of a matrix measures how far nonzero entries lie from the diagonal.

The upper bandwidth is the largest value of

$$
j-i
$$

for a nonzero \(a_{ij}\).

The lower bandwidth is the largest value of

$$
i-j
$$

for a nonzero \(a_{ij}\).

A tridiagonal matrix has lower and upper bandwidth \(1\).

Banded matrices are sparse matrices with nonzeros concentrated near the diagonal. Specialized band solvers can be much faster than general sparse solvers.
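A minimal sketch that computes both bandwidths from the coordinates of the stored nonzero entries (assuming SciPy is available for the COO conversion):

```python
from scipy.sparse import coo_matrix, diags

def bandwidths(A):
    """Return (lower, upper) bandwidth of a sparse matrix A."""
    C = coo_matrix(A)
    lower = int((C.row - C.col).max())   # largest i - j over nonzeros
    upper = int((C.col - C.row).max())   # largest j - i over nonzeros
    return lower, upper

# Tridiagonal example: lower and upper bandwidth are both 1
T = diags([[1.0] * 4, [2.0] * 5, [3.0] * 4], offsets=[-1, 0, 1])
print(bandwidths(T))   # (1, 1)
```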

## 95.21 Sparse Triangular Solves

Sparse direct methods produce triangular systems such as

$$
Lx=b
$$

or

$$
Ux=b.
$$

Solving these systems requires following dependency structure.

If \(L\) is sparse lower triangular, then \(x_i\) depends only on previously computed variables \(x_j\) for which

$$
\ell_{ij}\ne 0,
\qquad
j<i.
$$

Sparse triangular solves are less parallel than sparse matrix-vector multiplication because dependencies impose an order.

They are often performance bottlenecks in sparse direct methods and preconditioners.
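A minimal sketch of forward substitution for a sparse lower triangular matrix in CSR form, assuming each row stores its diagonal entry:

```python
def sparse_forward_solve(values, col_index, row_ptr, b):
    """Solve L x = b where L is lower triangular, stored in CSR (zero-based)."""
    n = len(b)
    x = [0.0] * n
    for i in range(n):
        s = b[i]
        diag = None
        for p in range(row_ptr[i], row_ptr[i + 1]):
            j = col_index[p]
            if j == i:
                diag = values[p]          # diagonal entry l_ii
            else:
                s -= values[p] * x[j]     # subtract l_ij * x_j for j < i
        x[i] = s / diag
    return x

# L = [[2, 0], [1, 4]], b = [2, 9]  ->  x = [1, 2]
print(sparse_forward_solve([2.0, 1.0, 4.0], [0, 0, 1], [0, 1, 3], [2.0, 9.0]))
```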

## 95.22 Iterative Methods with Sparse Matrices

Sparse matrices are especially important for iterative methods.

A Krylov method typically needs:

| Operation | Sparse cost |
|---|---:|
| Matrix-vector product \(Ax\) | \(O(\operatorname{nnz}(A))\) |
| Vector updates | \(O(n)\) |
| Dot products | \(O(n)\) |
| Preconditioner application | depends on preconditioner |

If \(\operatorname{nnz}(A)=O(n)\), then each iteration may cost \(O(n)\).

This makes iterative methods suitable for very large sparse systems.
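As an illustration, conjugate gradient applied to a sparse SPD tridiagonal system (a sketch assuming SciPy is available):

```python
import numpy as np
from scipy.sparse import diags
from scipy.sparse.linalg import cg

n = 100
# SPD tridiagonal matrix (1D second-difference operator) in CSR format
A = diags([[-1.0] * (n - 1), [2.0] * n, [-1.0] * (n - 1)],
          offsets=[-1, 0, 1], format="csr")
b = np.ones(n)

x, info = cg(A, b)        # each iteration costs O(nnz(A)) for the product A @ p
print(info)               # 0 means the solver reported convergence
print(np.linalg.norm(A @ x - b))
```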

## 95.23 Sparse Preconditioners

Preconditioners are often sparse approximations to \(A\) or \(A^{-1}\).

Common sparse preconditioners include:

| Preconditioner | Description |
|---|---|
| Jacobi | diagonal of \(A\) |
| block Jacobi | independent diagonal blocks |
| incomplete LU | approximate LU with controlled fill |
| incomplete Cholesky | approximate Cholesky for SPD matrices |
| SSOR | relaxation-based triangular approximation |
| algebraic multigrid | hierarchy built from sparse structure |

A good preconditioner reduces iteration count while remaining cheap to apply.
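A minimal sketch of the simplest case, a Jacobi (diagonal) preconditioner passed to SciPy's conjugate gradient through the `LinearOperator` interface (assuming SciPy is available; `M` approximates \(A^{-1}\) and is applied once per iteration):

```python
import numpy as np
from scipy.sparse import diags
from scipy.sparse.linalg import cg, LinearOperator

n = 100
A = diags([[-1.0] * (n - 1), [2.0] * n, [-1.0] * (n - 1)],
          offsets=[-1, 0, 1], format="csr")
b = np.ones(n)

d = A.diagonal()                          # diagonal of A
M = LinearOperator((n, n), matvec=lambda r: r / d, dtype=float)  # apply D^{-1} r

x, info = cg(A, b, M=M)
print(info)   # 0 indicates the solver reported convergence
```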

## 95.24 Incomplete Factorization

Incomplete factorization computes approximate factors while limiting fill-in.

For example, incomplete LU seeks

$$
A\approx LU,
$$

but permits nonzeros only in selected positions.

The simplest form, ILU(0), keeps the original sparsity pattern.

More advanced variants allow controlled fill-in or drop small entries.

Incomplete Cholesky applies the same idea to symmetric positive definite matrices.

These methods are widely used as preconditioners rather than exact solvers.

## 95.25 Sparse Direct Versus Iterative Solvers

Sparse systems can be solved by direct or iterative methods.

| Method type | Strength | Weakness |
|---|---|---|
| Sparse direct | robust, accurate, good for multiple right-hand sides | fill-in may be expensive |
| Iterative | memory-light, matrix-vector based | convergence depends on conditioning and preconditioning |
| Hybrid | uses sparse factorization as preconditioner | more complex |

The best choice depends on problem size, matrix structure, accuracy requirements, and number of right-hand sides.

## 95.26 Matrix-Free Sparse Operators

Sometimes a sparse matrix is not stored explicitly.

Instead, the computation provides a function that applies the operator:

$$
x\mapsto Ax.
$$

This matrix-free form is common when the matrix comes from a differential operator or a simulation.

Matrix-free methods save memory and may avoid assembly cost.

They are especially natural for Krylov methods, which only need matrix-vector products.
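A sketch of a matrix-free operator for the one-dimensional second-difference stencil, wrapped as a SciPy `LinearOperator` so a Krylov solver can use it without assembling a matrix (assuming SciPy is available):

```python
import numpy as np
from scipy.sparse.linalg import LinearOperator, cg

n = 100

def apply_laplacian(x):
    """Apply the 1D second-difference operator without storing a matrix."""
    y = 2.0 * x
    y[:-1] -= x[1:]
    y[1:] -= x[:-1]
    return y

A = LinearOperator((n, n), matvec=apply_laplacian, dtype=float)
x, info = cg(A, np.ones(n))
print(info)
```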

## 95.27 Block Sparse Matrices

Many sparse matrices have block structure.

For example, each node in a physical simulation may have several unknowns, such as velocity components or displacement coordinates.

A block sparse matrix stores small dense blocks instead of individual scalar entries.

Block sparse row format stores:

| Object | Meaning |
|---|---|
| dense blocks | nonzero matrix blocks |
| block column indices | column block positions |
| block row pointers | start of each block row |

Block storage reduces index overhead and can improve cache behavior.

## 95.28 Structural Symmetry

A matrix is structurally symmetric if

$$
a_{ij}\ne 0
$$

exactly when

$$
a_{ji}\ne 0.
$$

This is weaker than numerical symmetry.

Numerical symmetry requires

$$
a_{ij}=a_{ji}.
$$

Structural symmetry matters for graph algorithms and sparse storage.

A nonsymmetric matrix may still have a symmetric sparsity pattern.

## 95.29 Storage Tradeoffs

Sparse storage saves memory only when the savings from omitted zeros exceed the cost of indices.

For CSR with floating point values and integer column indices, each nonzero stores at least:

| Field | Typical size |
|---|---:|
| value | 8 bytes |
| column index | 4 or 8 bytes |

The row pointer stores \(m+1\) additional integers.

Thus sparse storage has overhead.

For small or moderately dense matrices, dense storage may be faster.

Sparse storage is most useful when the nonzero count is much smaller than the total number of entries.

## 95.30 Performance Considerations

Sparse computation is often limited by memory bandwidth.

Sparse matrix-vector multiplication performs few arithmetic operations per memory access. It must load values, indices, and vector entries.

Performance depends on:

| Factor | Effect |
|---|---|
| row length distribution | load balance |
| column index locality | cache behavior |
| format choice | access pattern |
| ordering | locality and fill-in |
| parallel partitioning | communication cost |
| block structure | better arithmetic intensity |

Sparse algorithms require attention to data layout as much as mathematical operation count.

## 95.31 Parallel Sparse Computation

Sparse matrix-vector multiplication is often parallelized by assigning rows to processors or threads.

However, parallel sparse computation faces several difficulties:

| Issue | Explanation |
|---|---|
| irregular row lengths | uneven work |
| indirect memory access | poor cache locality |
| communication | distributed vector entries needed |
| synchronization | reductions and dependencies |
| triangular solves | sequential dependencies |

Graph partitioning is often used in distributed sparse computation to reduce communication.

## 95.32 Sparse Matrices in Graphs

For a graph with adjacency matrix \(A\),

$$
a_{ij}\ne 0
$$

means there is an edge from vertex \(i\) to vertex \(j\).

The graph Laplacian is

$$
L=D-A,
$$

where \(D\) is the degree matrix.

Graph Laplacians are sparse when the graph has relatively few edges compared with \(n^2\).

They appear in spectral graph theory, network analysis, ranking, clustering, electrical networks, and numerical PDEs.

## 95.33 Sparse Matrices in PDEs

Discretized partial differential equations often produce sparse matrices.

For example, a one-dimensional second-difference operator gives a tridiagonal matrix.

A two-dimensional grid Laplacian gives a sparse matrix with a small stencil, often five nonzeros per row.

A three-dimensional grid Laplacian gives a seven-point stencil.

The number of nonzeros grows linearly with the number of grid points, while dense storage would grow quadratically.
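A sketch that assembles the standard five-point grid Laplacian from the one-dimensional operator with Kronecker products, assuming SciPy is available:

```python
import scipy.sparse as sp

def laplacian_2d(n):
    """Five-point Laplacian on an n x n grid, assembled from the 1D operator."""
    T = sp.diags([[-1.0] * (n - 1), [2.0] * n, [-1.0] * (n - 1)],
                 offsets=[-1, 0, 1], format="csr")
    I = sp.identity(n, format="csr")
    return sp.kron(I, T) + sp.kron(T, I)   # size n^2 x n^2, about 5 nonzeros per row

A = laplacian_2d(50)
print(A.shape, A.nnz)   # (2500, 2500); nnz grows linearly with grid points
```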

## 95.34 Practical Checklist

When working with a sparse matrix, ask:

| Question | Why it matters |
|---|---|
| What is \(\operatorname{nnz}(A)\)? | determines storage and multiplication cost |
| Is the matrix symmetric? | determines solver choices |
| Is it positive definite? | enables Cholesky or CG |
| Is the pattern structured? | enables special formats or orderings |
| Are many right-hand sides needed? | may favor direct factorization |
| Is a good preconditioner available? | controls iterative convergence |
| Is construction incremental? | use COO, DOK, or LIL first |
| Is arithmetic repeated? | convert to CSR or CSC |

The format and algorithm should match the operation.

## 95.35 Summary

A sparse matrix contains mostly zero entries and is handled by storage formats and algorithms that avoid unnecessary work with those zeros.

The central quantity is

$$
\operatorname{nnz}(A),
$$

the number of nonzero entries.

Sparse storage formats include:

| Format | Main use |
|---|---|
| COO | construction from triples |
| CSR | row access and matrix-vector products |
| CSC | column access and direct solvers |
| DOK, LIL | incremental construction |
| DIA | banded and diagonal structure |
| BSR | block sparse structure |

Sparse computation is governed by sparsity patterns, fill-in, ordering, and memory access. Sparse matrix-vector multiplication is the core operation in iterative methods. Sparse direct methods rely on factorization while controlling fill-in. Iterative methods rely on matrix-vector products and preconditioning.

Sparse matrices make large-scale linear algebra possible when dense storage and dense factorization are infeasible.
