# Chapter 96. Structured Matrices

A structured matrix is a matrix whose entries satisfy a special pattern or rule. The structure may be algebraic, geometric, combinatorial, or computational.

Sparse matrices are one kind of structured matrix. Their structure is the location of zero and nonzero entries. But structure is broader than sparsity. A matrix may be dense and still structured. Toeplitz, Hankel, circulant, Vandermonde, diagonal, triangular, banded, symmetric, orthogonal, and low-rank matrices all have special forms that can be exploited.

Structured linear algebra studies how these forms affect storage, computation, conditioning, and algorithms. The purpose is to avoid treating every matrix as an arbitrary dense array when more information is available.

## 96.1 What Structure Means

A general \(m\times n\) matrix has

$$
mn
$$

independent entries.

A structured matrix has fewer independent degrees of freedom.

For example, a diagonal \(n\times n\) matrix has only \(n\) independent entries. A symmetric \(n\times n\) matrix has

$$
\frac{n(n+1)}{2}
$$

independent entries. A Toeplitz \(n\times n\) matrix has only

$$
2n-1
$$

independent entries because each descending diagonal is constant. The defining rule is \(a_{ij}=c_{i-j}\): each entry depends only on the difference \(i-j\).

Structure reduces freedom. Reduced freedom often gives faster algorithms.
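For concreteness, a Toeplitz matrix can be assembled from its \(2n-1\) defining values. A minimal sketch using SciPy's `scipy.linalg.toeplitz` (the numerical values are arbitrary):

```python
import numpy as np
from scipy.linalg import toeplitz

n = 4
col = np.array([1.0, 2.0, 3.0, 4.0])   # c_0, c_1, ..., c_{n-1}: first column
row = np.array([1.0, 5.0, 6.0, 7.0])   # c_0, c_{-1}, ..., c_{-(n-1)}: first row
T = toeplitz(col, row)                 # n x n matrix from only 2n - 1 values
print(T)
```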

## 96.2 Examples of Structured Matrices

Common structured matrices include:

| Structure | Defining property |
|---|---|
| Diagonal | \(a_{ij}=0\) for \(i\ne j\) |
| Triangular | entries vanish above or below diagonal |
| Banded | nonzeros confined near diagonal |
| Symmetric | \(A^T=A\) |
| Skew-symmetric | \(A^T=-A\) |
| Orthogonal | \(Q^TQ=I\) |
| Toeplitz | \(a_{ij}=c_{i-j}\) |
| Hankel | \(a_{ij}=c_{i+j}\) |
| Circulant | each row is a cyclic shift of the previous row |
| Vandermonde | \(a_{ij}=x_i^{j-1}\) |
| Cauchy | \(a_{ij}=1/(x_i-y_j)\) |
| Low-rank | rank much smaller than dimensions |
| Kronecker product | built from tensor products of smaller matrices |

A matrix may have several structures at once. A symmetric tridiagonal matrix is simultaneously symmetric, sparse, and banded.

## 96.3 Why Structure Matters

Structure matters for four main reasons.

| Reason | Effect |
|---|---|
| Storage | fewer values must be stored |
| Speed | specialized algorithms reduce work |
| Stability | structure may improve or damage conditioning |
| Interpretation | structure reflects the model that produced the matrix |

For example, a dense \(n\times n\) matrix needs \(n^2\) stored values. A Toeplitz matrix needs only \(2n-1\). A circulant matrix needs only \(n\). A diagonal matrix needs only \(n\).

The computational difference can be large. Multiplying a vector by a dense matrix costs \(O(n^2)\). Multiplying by a diagonal matrix costs \(O(n)\). Multiplying by a circulant matrix costs \(O(n\log n)\) through the fast Fourier transform.

## 96.4 Diagonal Matrices

A diagonal matrix has the form

$$
D=
\begin{bmatrix}
d_1 & 0 & \cdots & 0\\
0 & d_2 & \cdots & 0\\
\vdots & \vdots & \ddots & \vdots\\
0 & 0 & \cdots & d_n
\end{bmatrix}.
$$

Multiplication is componentwise:

$$
(Dx)_i=d_ix_i.
$$

Solving

$$
Dx=b
$$

is also componentwise:

$$
x_i=\frac{b_i}{d_i},
$$

provided \(d_i\ne 0\).
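Both operations are one line in NumPy; a minimal sketch:

```python
import numpy as np

d = np.array([2.0, 0.5, 4.0])   # diagonal entries d_1, ..., d_n
x = np.array([1.0, 2.0, 3.0])

b = d * x                       # (Dx)_i = d_i x_i, O(n) work
x_back = b / d                  # solves Dx = b componentwise, assuming every d_i != 0
```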

Diagonal matrices are the simplest structured matrices. They represent independent scaling in coordinate directions.

## 96.5 Triangular Matrices

A lower triangular matrix has entries

$$
a_{ij}=0
\qquad
\text{for } j>i.
$$

An upper triangular matrix has entries

$$
a_{ij}=0
\qquad
\text{for } i>j.
$$

Triangular systems are solved by substitution.

For lower triangular \(Lx=b\), forward substitution computes

$$
x_i=
\frac{1}{\ell_{ii}}
\left(
b_i-\sum_{j<i}\ell_{ij}x_j
\right).
$$
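A direct transcription of this recurrence (a sketch for illustration; production codes call LAPACK's triangular solvers):

```python
import numpy as np

def forward_substitution(L, b):
    """Solve Lx = b for lower triangular L with nonzero diagonal entries."""
    n = len(b)
    x = np.zeros(n)
    for i in range(n):
        # x_i = (b_i - sum_{j<i} l_ij x_j) / l_ii
        x[i] = (b[i] - L[i, :i] @ x[:i]) / L[i, i]
    return x
```

Back substitution for upper triangular systems is the mirror image, running from the last row upward.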

Triangular matrices are central because many factorizations reduce a problem to triangular solves:

| Factorization | Triangular part |
|---|---|
| LU | \(L\) and \(U\) |
| Cholesky | \(L\) and \(L^T\) |
| QR | \(R\) |

## 96.6 Banded Matrices

A banded matrix has nonzero entries only near the main diagonal.

A matrix has lower bandwidth \(p\) and upper bandwidth \(q\) if

$$
a_{ij}=0
\qquad
\text{whenever } i-j>p
$$

or

$$
j-i>q.
$$

A tridiagonal matrix has

$$
p=q=1.
$$

Banded matrices arise from local interactions, especially finite difference discretizations of differential equations.

Specialized algorithms for banded systems store only the bands and avoid arithmetic outside them.
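In SciPy this is `scipy.linalg.solve_banded`, which takes the bands in LAPACK band storage; a sketch for a tridiagonal system (\(p=q=1\)) with arbitrary example values:

```python
import numpy as np
from scipy.linalg import solve_banded

# Band storage: ab[u + i - j, j] = a[i, j]; each row of ab is one diagonal.
ab = np.array([
    [0.0, 1.0, 1.0, 1.0],   # superdiagonal (first entry unused)
    [4.0, 4.0, 4.0, 4.0],   # main diagonal
    [1.0, 1.0, 1.0, 0.0],   # subdiagonal (last entry unused)
])
b = np.ones(4)
x = solve_banded((1, 1), ab, b)   # (p, q) = (1, 1); only the bands are touched
```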

## 96.7 Tridiagonal Matrices

A tridiagonal matrix has the form

$$
T=
\begin{bmatrix}
d_1 & u_1 & 0 & \cdots & 0\\
\ell_1 & d_2 & u_2 & \cdots & 0\\
0 & \ell_2 & d_3 & \cdots & 0\\
\vdots & \vdots & \vdots & \ddots & u_{n-1}\\
0 & 0 & 0 & \ell_{n-1} & d_n
\end{bmatrix}.
$$

It has at most

$$
3n-2
$$

nonzero entries.

Tridiagonal systems can be solved in \(O(n)\) time by the Thomas algorithm, a specialized form of Gaussian elimination.
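A sketch of the Thomas algorithm (no pivoting, so it assumes a diagonally dominant or symmetric positive definite matrix, as arises in many discretizations):

```python
import numpy as np

def thomas(lower, diag, upper, b):
    """Solve Tx = b for tridiagonal T in O(n) operations."""
    n = len(diag)
    d = diag.astype(float).copy()
    r = b.astype(float).copy()
    for i in range(1, n):                 # forward elimination
        w = lower[i - 1] / d[i - 1]
        d[i] -= w * upper[i - 1]
        r[i] -= w * r[i - 1]
    x = np.zeros(n)
    x[-1] = r[-1] / d[-1]
    for i in range(n - 2, -1, -1):        # back substitution
        x[i] = (r[i] - upper[i] * x[i + 1]) / d[i]
    return x
```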

Symmetric tridiagonal matrices also appear in eigenvalue algorithms after reducing a symmetric dense matrix by orthogonal transformations.

## 96.8 Symmetric Matrices

A symmetric matrix satisfies

$$
A^T=A.
$$

Thus

$$
a_{ij}=a_{ji}.
$$

Only one triangle, upper or lower, needs to be stored.

Symmetric matrices have important spectral properties. If \(A\) is real symmetric, then its eigenvalues are real and it has an orthonormal eigenbasis.

Symmetry is more than a storage property. It changes which algorithms are valid.

| Problem | Symmetric algorithm |
|---|---|
| positive definite system | Cholesky factorization |
| eigenvalue problem | symmetric QR, Lanczos |
| large SPD system | conjugate gradient |
| indefinite symmetric system | MINRES |

## 96.9 Orthogonal Matrices

An orthogonal matrix satisfies

$$
Q^TQ=I.
$$

Its inverse is its transpose:

$$
Q^{-1}=Q^T.
$$

Orthogonal matrices preserve Euclidean norms:

$$
\|Qx\|_2=\|x\|_2.
$$

This makes them numerically stable. Householder reflectors, Givens rotations, and orthogonal factors in QR decompositions are essential tools in numerical linear algebra.

Orthogonal structure is often preserved deliberately because it controls rounding error.

## 96.10 Toeplitz Matrices

A Toeplitz matrix is constant along descending diagonals:

$$
a_{ij}=c_{i-j}.
$$

A typical Toeplitz matrix is

$$
T=
\begin{bmatrix}
c_0 & c_{-1} & c_{-2} & \cdots & c_{-(n-1)}\\
c_1 & c_0 & c_{-1} & \cdots & c_{-(n-2)}\\
c_2 & c_1 & c_0 & \cdots & c_{-(n-3)}\\
\vdots & \vdots & \vdots & \ddots & \vdots\\
c_{n-1} & c_{n-2} & c_{n-3} & \cdots & c_0
\end{bmatrix}.
$$

Toeplitz matrices are closely connected to convolution. Linear convolution can be written as multiplication by a Toeplitz matrix.
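To see the connection, compare a lower triangular Toeplitz matrix built from a filter with NumPy's convolution (a sketch with arbitrary example values):

```python
import numpy as np
from scipy.linalg import toeplitz

h = np.array([1.0, 2.0, 3.0])             # filter taps c_0, c_1, c_2
x = np.array([4.0, 5.0, 6.0, 7.0])
n = len(x)

col = np.r_[h, np.zeros(n - len(h))]      # first column: c_0, c_1, c_2, 0
T = toeplitz(col, np.r_[h[0], np.zeros(n - 1)])   # lower triangular Toeplitz

print(T @ x)                              # [ 4. 13. 28. 34.]
print(np.convolve(h, x)[:n])              # same numbers: convolution is a Toeplitz matvec
```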

They appear in signal processing, time series, stationary covariance models, and discretized translation-invariant operators.

## 96.11 Hankel Matrices

A Hankel matrix is constant along anti-diagonals:

$$
a_{ij}=c_{i+j}.
$$

For example,

$$
H=
\begin{bmatrix}
c_2 & c_3 & c_4 & c_5\\
c_3 & c_4 & c_5 & c_6\\
c_4 & c_5 & c_6 & c_7\\
c_5 & c_6 & c_7 & c_8
\end{bmatrix}.
$$

Hankel matrices appear in moment problems, system identification, control theory, signal processing, and Padé approximation.

Reversing the row (or column) order of a Hankel matrix yields a Toeplitz matrix, which is why sources often describe Hankel matrices as reversed Toeplitz matrices.

## 96.12 Circulant Matrices

A circulant matrix is determined by one vector. Each row is a cyclic shift of the previous row.

For example,

$$
C=
\begin{bmatrix}
c_0 & c_1 & c_2 & c_3\\
c_3 & c_0 & c_1 & c_2\\
c_2 & c_3 & c_0 & c_1\\
c_1 & c_2 & c_3 & c_0
\end{bmatrix}.
$$

Circulant matrices are special Toeplitz matrices with wraparound structure.

Their main computational property is that they are diagonalized by the discrete Fourier transform. Thus multiplication by a circulant matrix can be computed using FFT methods.
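A sketch of the FFT route, checked against the dense product (using SciPy's `circulant`, whose argument is the first column):

```python
import numpy as np
from scipy.linalg import circulant

c = np.array([1.0, 2.0, 3.0, 4.0])        # first column of C
x = np.array([5.0, 6.0, 7.0, 8.0])

y_dense = circulant(c) @ x                # explicit O(n^2) route

# C = F^{-1} diag(fft(c)) F, so Cx is a pointwise product in Fourier space.
y_fft = np.fft.ifft(np.fft.fft(c) * np.fft.fft(x)).real

assert np.allclose(y_dense, y_fft)        # O(n log n) route gives the same vector
```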

This makes circulant matrices important in periodic convolution, signal processing, image processing, and preconditioning.

## 96.13 Vandermonde Matrices

A Vandermonde matrix has the form

$$
V=
\begin{bmatrix}
1 & x_1 & x_1^2 & \cdots & x_1^{n-1}\\
1 & x_2 & x_2^2 & \cdots & x_2^{n-1}\\
\vdots & \vdots & \vdots & \ddots & \vdots\\
1 & x_n & x_n^2 & \cdots & x_n^{n-1}
\end{bmatrix}.
$$

Vandermonde matrices appear in polynomial interpolation, quadrature, moment fitting, and signal processing.

They are structured because the entire matrix is determined by the nodes

$$
x_1,x_2,\ldots,x_n.
$$

However, Vandermonde matrices can be severely ill-conditioned. Their conditioning depends strongly on the distribution of the nodes; one study notes exponentially growing condition numbers unless the nodes are roughly equally spaced on or near the unit circle.
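The growth is easy to observe numerically; a sketch with equispaced real nodes on \([0,1]\):

```python
import numpy as np

for n in (5, 10, 15, 20):
    x = np.linspace(0.0, 1.0, n)          # equispaced real nodes
    V = np.vander(x, increasing=True)     # V[i, j] = x_i ** j
    print(n, np.linalg.cond(V))           # condition number grows rapidly with n
```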

## 96.14 Cauchy Matrices

A Cauchy matrix has entries

$$
a_{ij}=\frac{1}{x_i-y_j},
$$

where

$$
x_i\ne y_j.
$$

Cauchy matrices appear in rational interpolation, integral equations, approximation theory, and structured direct solvers.

Like Vandermonde matrices, Cauchy matrices are dense but highly structured. Their entries are determined by two short vectors:

$$
x_1,\ldots,x_m
$$

and

$$
y_1,\ldots,y_n.
$$

This structure can be exploited in fast algorithms and displacement-rank methods.

## 96.15 Low-Rank Matrices

A matrix \(A\in\mathbb{R}^{m\times n}\) has rank \(r\) if its columns span an \(r\)-dimensional space.

If

$$
r\ll \min(m,n),
$$

then \(A\) is low-rank.

A low-rank matrix can often be written as

$$
A=UV^T,
$$

where

$$
U\in\mathbb{R}^{m\times r},
\qquad
V\in\mathbb{R}^{n\times r}.
$$

This representation stores

$$
r(m+n)
$$

numbers instead of

$$
mn.
$$

Low-rank structure appears in data compression, principal component analysis, integral equations, recommender systems, and model reduction.

## 96.16 Rank-Structured Matrices

Some matrices are not globally low-rank but contain low-rank off-diagonal blocks.

Such matrices are called rank-structured.

Examples include:

| Type | Basic idea |
|---|---|
| H-matrix | hierarchical block low-rank structure |
| HSS matrix | hierarchically semiseparable |
| HODLR matrix | hierarchically off-diagonal low-rank |
| quasiseparable matrix | low-rank subblocks in off-diagonal regions |

These structures occur in integral equations, kernel methods, covariance matrices, and discretized operators with smooth long-range interactions.

They allow approximate storage and fast multiplication or factorization.

## 96.17 Kronecker Product Structure

The Kronecker product of matrices \(A\) and \(B\) is written

$$
A\otimes B.
$$

If

$$
A=[a_{ij}],
$$

then

$$
A\otimes B =
\begin{bmatrix}
a_{11}B & a_{12}B & \cdots\\
a_{21}B & a_{22}B & \cdots\\
\vdots & \vdots & \ddots
\end{bmatrix}.
$$

Kronecker products appear in tensor grids, separable differential operators, multidimensional transforms, quantum mechanics, statistics, and signal processing.

They allow large matrices to be represented by smaller factors.

For example, a two-dimensional separable operator may be written as

$$
A_x\otimes I + I\otimes A_y.
$$

This form enables fast matrix-vector products without constructing the full matrix explicitly.
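A sketch of this: with row-major (NumPy) flattening, \((A_x\otimes I)\,\mathrm{vec}(X)=\mathrm{vec}(A_xX)\) and \((I\otimes A_y)\,\mathrm{vec}(X)=\mathrm{vec}(XA_y^T)\), so the Kronecker sum can be applied with two small multiplications:

```python
import numpy as np

nx, ny = 6, 5
rng = np.random.default_rng(1)
Ax = rng.standard_normal((nx, nx))
Ay = rng.standard_normal((ny, ny))
x = rng.standard_normal(nx * ny)

# Dense route: form the (nx*ny) x (nx*ny) Kronecker sum explicitly.
K = np.kron(Ax, np.eye(ny)) + np.kron(np.eye(nx), Ay)
y_dense = K @ x

# Structured route: reshape and multiply along each dimension.
X = x.reshape(nx, ny)                     # row-major reshape matches np.kron
y_fast = (Ax @ X + X @ Ay.T).reshape(-1)

assert np.allclose(y_dense, y_fast)
```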

## 96.18 Block Matrices

Block structure divides a matrix into submatrices:

$$
A=
\begin{bmatrix}
A_{11} & A_{12}\\
A_{21} & A_{22}
\end{bmatrix}.
$$

Block structure may reflect multiple variables, domains, time steps, or physical components.

Block methods can improve cache behavior and expose mathematical structure. For example, saddle-point systems often have the form

$$
\begin{bmatrix}
A & B^T\\
B & 0
\end{bmatrix}.
$$

Recognizing the block form helps choose solvers and preconditioners.

## 96.19 Displacement Structure

Some structured matrices can be described by displacement operators.

A matrix \(A\) has low displacement rank if an expression such as

$$
A - ZAZ^T
$$

has low rank for a simple shift matrix \(Z\).

Toeplitz, Hankel, Vandermonde-like, and Cauchy-like matrices can often be described in this framework. Structured linear algebra literature uses displacement operators to unify fast algorithms for these matrix classes.

The point is that a matrix may be dense, but its departure from a shifted version of itself is simple.

This hidden simplicity supports fast direct solvers.
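The Toeplitz case is easy to check numerically: with the down-shift matrix \(Z\), the displacement \(T-ZTZ^T\) is nonzero only in its first row and column, so its rank is at most two. A sketch:

```python
import numpy as np
from scipy.linalg import toeplitz

n = 8
rng = np.random.default_rng(2)
c = rng.standard_normal(n)
r = np.r_[c[0], rng.standard_normal(n - 1)]
T = toeplitz(c, r)                        # dense random Toeplitz matrix

Z = np.eye(n, k=-1)                       # down-shift matrix
D = T - Z @ T @ Z.T                       # nonzero only in first row and column
print(np.linalg.matrix_rank(D))           # prints at most 2, though T is dense
```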

## 96.20 Structured Storage

Structured storage records only the data needed to reconstruct the matrix.

| Matrix | Minimal data |
|---|---|
| diagonal | diagonal vector |
| tridiagonal | three diagonals |
| banded | stored bands |
| symmetric | one triangle |
| Toeplitz | first row and first column |
| circulant | first row or column |
| Vandermonde | nodes \(x_i\) |
| Cauchy | vectors \(x_i\), \(y_j\) |
| low-rank | factors \(U,V\) |
| Kronecker | component matrices |

The matrix may never be materialized as a full dense array.

Instead, algorithms operate directly on the representation.

## 96.21 Structured Matrix-Vector Multiplication

For a general dense matrix,

$$
y=Ax
$$

costs

$$
O(n^2).
$$

For structured matrices, the cost may be lower.

| Structure | Multiplication cost |
|---|---:|
| diagonal | \(O(n)\) |
| tridiagonal | \(O(n)\) |
| banded with bandwidth \(w\) | \(O(wn)\) |
| sparse | \(O(\operatorname{nnz}(A))\) |
| circulant | \(O(n\log n)\) |
| Toeplitz using FFT embedding | \(O(n\log n)\) |
| low-rank \(UV^T\) | \(O(r(m+n))\) |

These savings are often the reason to preserve structure.

## 96.22 Structured Solvers

A structured solver uses the matrix form directly.

Examples include:

| Matrix type | Solver idea |
|---|---|
| diagonal | componentwise division |
| triangular | substitution |
| tridiagonal | Thomas algorithm |
| symmetric positive definite | Cholesky |
| Toeplitz | Levinson-type or FFT-based methods |
| circulant | FFT diagonalization |
| low-rank update | Sherman-Morrison-Woodbury |
| Kronecker sum | separable transforms |

A general dense solver ignores this structure and may use more time and memory than necessary.
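As one example, the Sherman-Morrison-Woodbury identity solves \((A+UV^T)x=b\) using only solves with \(A\) and one small \(r\times r\) system. A sketch with a diagonal \(A\) and arbitrary random data:

```python
import numpy as np

n, r = 2_000, 5
rng = np.random.default_rng(3)
d = rng.uniform(1.0, 2.0, n)              # A = diag(d): trivial to invert
U = rng.standard_normal((n, r))
V = rng.standard_normal((n, r))
b = rng.standard_normal(n)

# (A + UV^T)^{-1} b = A^{-1}b - A^{-1}U (I + V^T A^{-1} U)^{-1} V^T A^{-1} b
Ainv_b = b / d
Ainv_U = U / d[:, None]
S = np.eye(r) + V.T @ Ainv_U              # small r x r capacitance matrix
x = Ainv_b - Ainv_U @ np.linalg.solve(S, V.T @ Ainv_b)

assert np.allclose(x, np.linalg.solve(np.diag(d) + U @ V.T, b))
```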

## 96.23 Structure and Conditioning

Structure does not automatically imply good conditioning.

A diagonal matrix may be ill-conditioned if the ratio

$$
\frac{\max_i |d_i|}{\min_i |d_i|}
$$

is large.

A Vandermonde matrix may be extremely ill-conditioned for poorly chosen nodes.

A Toeplitz matrix may be well-conditioned or ill-conditioned depending on its generating sequence.

Therefore one must distinguish:

| Concept | Meaning |
|---|---|
| structural simplicity | few parameters or special pattern |
| numerical conditioning | sensitivity to perturbations |
| algorithmic stability | behavior under floating point computation |

A matrix can be simple to describe and still difficult to solve accurately.

## 96.24 Preserving Structure

Some operations preserve structure; others destroy it.

| Operation | Possible effect |
|---|---|
| adding two Toeplitz matrices | Toeplitz |
| multiplying two diagonal matrices | diagonal |
| inverse of diagonal matrix | diagonal |
| inverse of triangular matrix | triangular |
| product of Toeplitz matrices | generally not Toeplitz |
| LU factorization of sparse matrix | may create fill-in |
| perturbing a symmetric matrix asymmetrically | destroys symmetry |

Algorithms should preserve structure when doing so improves accuracy or speed.

For example, using symmetric eigensolvers on symmetric matrices avoids unnecessary complex arithmetic and improves numerical behavior.

## 96.25 Structured Perturbations

In ordinary perturbation analysis, a perturbation \(\Delta A\) may be any matrix.

In structured perturbation analysis, \(\Delta A\) must preserve the structure.

For example, if \(A\) is Toeplitz, a structured perturbation must also be Toeplitz.

Structured condition numbers may differ from unstructured condition numbers. Some analyses explicitly compare structured and unstructured sensitivity for matrix classes such as Toeplitz, Hankel, circulant, and symmetric matrices.

This matters when the data comes from a structured model. Perturbations outside the model may be irrelevant.

## 96.26 Applications

Structured matrices appear across applied mathematics.

| Application | Structure |
|---|---|
| signal processing | Toeplitz, circulant, Hankel |
| time series | Toeplitz covariance |
| image processing | block Toeplitz, circulant approximations |
| PDEs | sparse, banded, Kronecker sums |
| graph algorithms | sparse Laplacian |
| polynomial interpolation | Vandermonde |
| rational interpolation | Cauchy |
| statistics | covariance and kernel structure |
| optimization | block, sparse, low-rank Hessians |
| quantum mechanics | Kronecker products |

Recognizing the source of the matrix usually suggests the correct algorithm.

## 96.27 Matrix-Free Structured Operators

Some structured matrices are better represented as operators than arrays.

For example, convolution may be implemented by FFTs rather than by forming a Toeplitz matrix.

A Kronecker sum may be applied by reshaping vectors into arrays and multiplying along dimensions.

A low-rank matrix may be applied as

$$
x\mapsto U(V^Tx).
$$

These forms avoid explicit storage and reduce cost.

Matrix-free structured operators are common in large-scale scientific computing and machine learning.
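SciPy's `LinearOperator` packages such an operator for Krylov solvers, which only need matrix-vector products. A sketch with a symmetric positive definite circulant applied by FFT:

```python
import numpy as np
from scipy.sparse.linalg import LinearOperator, cg

n = 1024
c = np.zeros(n)
c[0], c[1], c[-1] = 2.0, -0.5, -0.5       # SPD circulant (periodic-Laplacian-like stencil)
lam = np.fft.fft(c)                       # its eigenvalues

def matvec(x):
    # Apply Cx in O(n log n) without ever forming C.
    return np.fft.ifft(lam * np.fft.fft(x)).real

C = LinearOperator((n, n), matvec=matvec)
b = np.random.default_rng(4).standard_normal(n)
x, info = cg(C, b)                        # conjugate gradient sees only matvecs
```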

## 96.28 Practical Checklist

When given a matrix, ask:

| Question | Reason |
|---|---|
| Is the matrix sparse? | choose sparse storage and solvers |
| Is it symmetric? | use symmetric algorithms |
| Is it positive definite? | use Cholesky or conjugate gradient |
| Is it banded? | use band storage and band solvers |
| Is it Toeplitz or circulant? | exploit convolution or FFT methods |
| Is it low-rank or approximately low-rank? | use factorized representations |
| Is it a Kronecker product or sum? | avoid forming the full matrix |
| Is the structure exact or approximate? | controls algorithm choice |
| Does the algorithm preserve structure? | affects speed and stability |

The main rule is simple: do not discard useful structure too early.

## 96.29 Summary

Structured matrices are matrices with special patterns or representations.

Important examples include diagonal, triangular, banded, symmetric, orthogonal, Toeplitz, Hankel, circulant, Vandermonde, Cauchy, low-rank, block, sparse, and Kronecker-structured matrices.

Structure affects:

| Aspect | Effect |
|---|---|
| Storage | fewer parameters |
| Multiplication | faster matrix-vector products |
| Solving | specialized algorithms |
| Eigenvalue computation | specialized reductions |
| Conditioning | structured sensitivity may differ |
| Modeling | structure reflects the problem source |

Structured matrices are central in numerical linear algebra because they allow algorithms to use the information already present in the problem. Treating a structured matrix as a general dense matrix is often wasteful and may hide the best method.
