
Chapter 16. Matrix Factorizations Overview

A matrix factorization writes a matrix as a product of simpler matrices. The factors are chosen so that they reveal structure or make computation easier. Common factorizations include LU, PLU, Cholesky, QR, Schur, eigenvalue decomposition, and singular value decomposition. These factorizations are used to solve linear systems, compute determinants, estimate rank, solve least squares problems, analyze geometry, and design stable numerical algorithms.

16.1 What a Factorization Is

A factorization has the form

$$A = BC$$

or, more generally,

$$A = B_1 B_2 \cdots B_k.$$

The matrix $A$ is replaced by factors with useful structure.

For example,

$$A = LU$$

writes $A$ as a product of a lower triangular matrix $L$ and an upper triangular matrix $U$.

This is useful because triangular systems are easy to solve. Instead of solving

$$Ax = b,$$

one solves

$$LUx = b.$$

Let

$$Ux = y.$$

Then solve

$$Ly = b$$

by forward substitution, and solve

$$Ux = y$$

by back substitution.
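Here is a minimal sketch of this two-step solve in NumPy/SciPy (assuming `scipy` is available; the factors $L$ and $U$ are the ones worked out for the small example in Section 16.4):

```python
import numpy as np
from scipy.linalg import solve_triangular

# Solve Ax = b through the factorization A = LU.
L = np.array([[1.0, 0.0],
              [3.0, 1.0]])
U = np.array([[2.0, 1.0],
              [0.0, 4.0]])
b = np.array([5.0, 27.0])

y = solve_triangular(L, b, lower=True)    # forward substitution: Ly = b
x = solve_triangular(U, y, lower=False)   # back substitution:    Ux = y

print(x)  # [1. 3.], the same result as np.linalg.solve(L @ U, b)
```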

16.2 Why Factorizations Matter

A factorization separates a hard problem into simpler problems.

Many matrix problems are expensive when handled directly. Factorization replaces a general matrix by structured factors. These factors may be triangular, diagonal, orthogonal, unitary, symmetric, or low rank.

The common purposes are:

PurposeFactorization examples
Solve linear systemsLU, PLU, Cholesky
Solve least squares problemsQR, SVD
Study eigenvaluesSchur, spectral decomposition
Estimate rankSVD, rank-revealing QR
Compute determinantsLU, Cholesky
Compress dataSVD, low-rank factorization
Improve numerical stabilityQR, SVD, pivoted factorizations

A factorization is therefore both algebraic and computational. It rewrites the matrix while preserving the problem.

16.3 Triangular Matrices as Building Blocks

Triangular matrices are central because triangular systems can be solved directly.

An upper triangular system has the form

$$Ux = b,$$

where

$$U = \begin{bmatrix} u_{11} & u_{12} & \cdots & u_{1n} \\ 0 & u_{22} & \cdots & u_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & u_{nn} \end{bmatrix}.$$

The last equation gives $x_n$. Then the previous equation gives $x_{n-1}$, and so on.

A lower triangular system is solved in the opposite direction, using forward substitution.
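A minimal back-substitution routine can be written directly from this description (plain NumPy; the function name is illustrative, and $U$ is assumed square and nonsingular):

```python
import numpy as np

def back_substitution(U, b):
    """Solve Ux = b for a nonsingular upper triangular U, last row first."""
    n = len(b)
    x = np.zeros(n)
    for i in range(n - 1, -1, -1):
        # Subtract the contributions of the already-computed unknowns,
        # then divide by the diagonal (pivot) entry.
        x[i] = (b[i] - U[i, i + 1:] @ x[i + 1:]) / U[i, i]
    return x

U = np.array([[2.0, 1.0],
              [0.0, 4.0]])
print(back_substitution(U, np.array([5.0, 12.0])))  # [1. 3.]
```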

This explains why LU, PLU, and Cholesky factorizations are useful.

16.4 LU Factorization

An LU factorization writes a square matrix as

$$A = LU,$$

where $L$ is lower triangular and $U$ is upper triangular.

Usually $L$ is chosen with diagonal entries equal to $1$. Such an $L$ is called unit lower triangular.

LU factorization records Gaussian elimination as a matrix product. The entries of $L$ store the multipliers used during elimination, and $U$ is the resulting upper triangular matrix.

For example, if

$$A = \begin{bmatrix} 2 & 1 \\ 6 & 7 \end{bmatrix},$$

then eliminating the $6$ below the pivot $2$ uses multiplier $3$. We get

$$U = \begin{bmatrix} 2 & 1 \\ 0 & 4 \end{bmatrix}.$$

The corresponding lower triangular factor is

$$L = \begin{bmatrix} 1 & 0 \\ 3 & 1 \end{bmatrix}.$$

Indeed,

$$LU = \begin{bmatrix} 1 & 0 \\ 3 & 1 \end{bmatrix} \begin{bmatrix} 2 & 1 \\ 0 & 4 \end{bmatrix} = \begin{bmatrix} 2 & 1 \\ 6 & 7 \end{bmatrix}.$$
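The elimination itself can be sketched in a few lines (plain NumPy, without pivoting; the helper name is illustrative, and nonzero pivots are assumed):

```python
import numpy as np

def lu_no_pivot(A):
    """Doolittle LU without pivoting: returns L, U with A = L @ U."""
    n = A.shape[0]
    U = A.astype(float).copy()
    L = np.eye(n)
    for k in range(n):
        for i in range(k + 1, n):
            L[i, k] = U[i, k] / U[k, k]      # store the multiplier
            U[i, k:] -= L[i, k] * U[k, k:]   # eliminate below the pivot
    return L, U

L, U = lu_no_pivot(np.array([[2.0, 1.0],
                             [6.0, 7.0]]))
print(L)  # [[1. 0.] [3. 1.]]
print(U)  # [[2. 1.] [0. 4.]]
```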

16.5 PLU Factorization

Some matrices require row swaps during elimination. In that case, the factorization is written as

$$PA = LU,$$

or equivalently

$$A = P^T LU,$$

where $P$ is a permutation matrix.

The matrix $P$ records the row swaps. The matrices $L$ and $U$ record the elimination after the swaps.

PLU factorization is more robust than plain LU because it handles zero or unsuitable pivot entries.

For example, the matrix

$$A = \begin{bmatrix} 0 & 1 \\ 2 & 3 \end{bmatrix}$$

cannot start elimination with the entry $0$ in the first pivot position. Swapping rows gives

$$PA = \begin{bmatrix} 2 & 3 \\ 0 & 1 \end{bmatrix}.$$

This is already upper triangular. The permutation matrix records the row swap.
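In practice a pivoted factorization can be computed with `scipy.linalg.lu`. Note the convention: SciPy returns factors with $A = PLU$, so its $P$ is the transpose of the $P$ in $PA = LU$ above.

```python
import numpy as np
from scipy.linalg import lu

A = np.array([[0.0, 1.0],
              [2.0, 3.0]])

P, L, U = lu(A)    # SciPy convention: A = P @ L @ U
print(P)           # permutation matrix recording the row swap
print(L)           # unit lower triangular
print(U)           # upper triangular [[2. 3.] [0. 1.]]
print(np.allclose(P @ L @ U, A))  # True
```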

16.6 Cholesky Factorization

Cholesky factorization applies to real symmetric positive definite matrices and complex Hermitian positive definite matrices. In the real case, it writes

$$A = LL^T,$$

where $L$ is lower triangular with positive diagonal entries. In the complex case, the form is

$$A = LL^*,$$

where $L^*$ is the conjugate transpose. Cholesky is especially useful for solving systems with symmetric positive definite matrices and is more efficient than general LU when its assumptions hold.

For example,

$$\begin{bmatrix} 4 & 2 \\ 2 & 3 \end{bmatrix} = \begin{bmatrix} 2 & 0 \\ 1 & \sqrt{2} \end{bmatrix} \begin{bmatrix} 2 & 1 \\ 0 & \sqrt{2} \end{bmatrix}.$$

The matrix is decomposed into a triangular factor and its transpose.
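This example can be checked with NumPy, whose `np.linalg.cholesky` returns the lower triangular factor:

```python
import numpy as np

A = np.array([[4.0, 2.0],
              [2.0, 3.0]])

L = np.linalg.cholesky(A)   # lower triangular, A = L @ L.T
print(L)                    # [[2. 0.] [1. 1.41421356]]
print(np.allclose(L @ L.T, A))  # True
```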

Cholesky is common in optimization, statistics, least squares, covariance matrices, and numerical solutions of positive definite systems.

16.7 QR Factorization

QR factorization writes a matrix as

$$A = QR,$$

where $Q$ has orthonormal columns and $R$ is upper triangular.

For a square real matrix with full rank, $Q$ is orthogonal:

$$Q^T Q = I.$$

For a complex matrix, $Q$ is unitary:

$$Q^* Q = I.$$

QR factorization is closely related to Gram-Schmidt orthogonalization. It replaces the columns of $A$ by orthonormal directions and records the coefficients in $R$.

QR is especially important for least squares problems. It avoids forming the normal equations

$$A^T A x = A^T b,$$

which can worsen numerical conditioning.
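A minimal sketch of least squares via QR (using NumPy's reduced QR; the random problem here is only for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((6, 3))   # tall matrix with full column rank
b = rng.standard_normal(6)

Q, R = np.linalg.qr(A)            # reduced QR: Q is 6x3, R is 3x3

# The minimizer of ||Ax - b|| satisfies the triangular system Rx = Q^T b.
x = np.linalg.solve(R, Q.T @ b)

print(np.allclose(x, np.linalg.lstsq(A, b, rcond=None)[0]))  # True
```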

16.8 Singular Value Decomposition

The singular value decomposition, or SVD, writes an $m \times n$ matrix as

$$A = U \Sigma V^T$$

in the real case, or

$$A = U \Sigma V^*$$

in the complex case.

Here $U$ and $V$ are orthogonal or unitary matrices, and $\Sigma$ is diagonal except for possible extra zero rows or columns.

The diagonal entries of $\Sigma$ are the singular values:

$$\sigma_1, \sigma_2, \ldots.$$

They are usually ordered as

$$\sigma_1 \ge \sigma_2 \ge \cdots \ge 0.$$

The SVD applies to every matrix. It is one of the most powerful factorizations because it reveals rank, null space, column space, conditioning, and best low-rank approximations. It is also useful for least squares problems, especially when the matrix is close to rank deficient.
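For instance, rank can be read off by counting singular values above a small tolerance (a sketch; the tolerance rule shown is one common choice, not the only one):

```python
import numpy as np

A = np.array([[1.0, 2.0, 3.0],
              [2.0, 4.0, 6.0],    # a multiple of the first row
              [1.0, 0.0, 1.0]])

U, s, Vt = np.linalg.svd(A)
print(s)                          # the last singular value is ~0

tol = max(A.shape) * np.finfo(float).eps * s[0]
rank = int(np.sum(s > tol))
print(rank, np.linalg.matrix_rank(A))   # both 2
```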

16.9 Eigenvalue Decomposition

For some square matrices, one can write

$$A = PDP^{-1},$$

where $D$ is diagonal.

The columns of $P$ are eigenvectors of $A$, and the diagonal entries of $D$ are eigenvalues.

This factorization is possible when $A$ has enough linearly independent eigenvectors. In that case, $A$ is diagonalizable.

Diagonalization is useful because powers and functions of $A$ become easier:

$$A^k = PD^kP^{-1}.$$

Since $D$ is diagonal, $D^k$ is obtained by raising each diagonal entry to the $k$-th power.
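For example, a matrix power computed through diagonalization (a sketch assuming $A$ is diagonalizable; `np.linalg.eig` returns the eigenvalues and the eigenvector matrix $P$):

```python
import numpy as np

A = np.array([[4.0, 1.0],
              [2.0, 3.0]])        # eigenvalues 5 and 2

eigvals, P = np.linalg.eig(A)     # columns of P are eigenvectors

k = 5
Dk = np.diag(eigvals ** k)        # raise each eigenvalue to the k-th power
Ak = P @ Dk @ np.linalg.inv(P)

print(np.allclose(Ak, np.linalg.matrix_power(A, k)))  # True
```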

16.10 Spectral Decomposition

For real symmetric matrices and complex Hermitian matrices, the eigenvalue decomposition has a stronger form.

In the real symmetric case,

$$A = Q\Lambda Q^T,$$

where $Q$ is orthogonal and $\Lambda$ is diagonal.

In the complex Hermitian case,

$$A = Q\Lambda Q^*,$$

where $Q$ is unitary and $\Lambda$ is real diagonal.

This is called spectral decomposition. It is a central result of linear algebra on inner product spaces. It says that symmetric or Hermitian matrices can be understood using orthonormal eigenvectors.

Spectral decomposition is used in quadratic forms, principal component analysis, differential equations, statistics, and optimization.
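For symmetric or Hermitian input, `np.linalg.eigh` computes exactly this decomposition, with real eigenvalues and orthonormal eigenvectors (a small sketch):

```python
import numpy as np

A = np.array([[2.0, 1.0],
              [1.0, 2.0]])          # symmetric

lam, Q = np.linalg.eigh(A)          # A = Q @ diag(lam) @ Q.T
print(lam)                          # [1. 3.], real eigenvalues
print(np.allclose(Q @ np.diag(lam) @ Q.T, A))  # True
print(np.allclose(Q.T @ Q, np.eye(2)))         # Q is orthogonal
```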

16.11 Schur Decomposition

The Schur decomposition writes a square matrix as

$$A = QTQ^*,$$

where $Q$ is unitary and $T$ is upper triangular.

For real matrices, a real Schur form also exists, with $Q$ orthogonal and $T$ quasi-upper triangular.

Schur decomposition is more general than diagonalization. Every complex square matrix has a Schur decomposition, even when it is not diagonalizable.

The diagonal entries of $T$ are the eigenvalues of $A$. Thus the Schur form is a stable way to study eigenvalues in numerical linear algebra.
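A sketch with `scipy.linalg.schur`, applied to a matrix that is not diagonalizable and therefore has no eigenvalue decomposition:

```python
import numpy as np
from scipy.linalg import schur

# Eigenvalue 3 is repeated but has only one independent eigenvector,
# so A is not diagonalizable; its Schur form still exists.
A = np.array([[5.0, 4.0],
              [-1.0, 1.0]])

T, Q = schur(A, output='complex')          # A = Q @ T @ Q^*
print(np.diag(T))                          # both diagonal entries are 3
print(np.allclose(Q @ T @ Q.conj().T, A))  # True
```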

16.12 Polar Decomposition

The polar decomposition writes a matrix as

$$A = UP,$$

where $U$ is orthogonal or unitary in the square nonsingular case, and $P$ is positive semidefinite.

It is the matrix analogue of writing a complex number as

$$z = e^{i\theta} r.$$

The factor $U$ represents rotation or reflection. The factor $P$ represents stretching.

Polar decomposition is closely related to the SVD. If

$$A = U_1 \Sigma V^*$$

is an SVD, then

$$A = (U_1 V^*)(V \Sigma V^*).$$

The first factor is orthogonal or unitary, and the second is positive semidefinite.
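A minimal construction of the polar factors from this identity, for real square matrices (the helper name is illustrative; SciPy also provides a `scipy.linalg.polar` routine):

```python
import numpy as np

def polar(A):
    """Polar factors of a real square A: A = U @ P, U orthogonal, P PSD."""
    U1, s, Vt = np.linalg.svd(A)
    U = U1 @ Vt                          # orthogonal factor U_1 V^T
    P = Vt.T @ np.diag(s) @ Vt           # positive semidefinite factor
    return U, P

A = np.array([[2.0, 1.0],
              [0.0, 2.0]])
U, P = polar(A)
print(np.allclose(U @ P, A))             # True
print(np.allclose(U.T @ U, np.eye(2)))   # U is orthogonal
```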

16.13 Rank Factorization

If $A$ has rank $r$, then it can be written as

$$A = BC,$$

where $B$ is $m \times r$, $C$ is $r \times n$, and both have rank $r$.

This is called a rank factorization.

It separates the matrix into a map from $F^n$ to $F^r$, followed by a map from $F^r$ to $F^m$. It makes the effective dimension $r$ explicit.

For example, if the columns of $A$ lie in a two-dimensional subspace, then $A$ can be factored through $F^2$.
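One concrete way to build such a factorization is to truncate the SVD at the rank $r$ (a sketch; any other basis for the column space would work just as well):

```python
import numpy as np

A = np.array([[1.0, 2.0, 3.0],
              [2.0, 4.0, 6.0],
              [0.0, 1.0, 1.0]])   # rank 2

U, s, Vt = np.linalg.svd(A)
r = np.linalg.matrix_rank(A)

B = U[:, :r] * s[:r]   # m x r: columns of U scaled by singular values
C = Vt[:r, :]          # r x n
print(B.shape, C.shape)         # (3, 2) (2, 3)
print(np.allclose(B @ C, A))    # True
```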

16.14 Low-Rank Approximation

A low-rank approximation replaces AA by a matrix of smaller rank.

The SVD gives the most important construction. If

$$A = U \Sigma V^T$$

and the singular values are ordered, then the best rank-$k$ approximation in both the spectral and Frobenius norms is obtained by keeping only the largest $k$ singular values (the Eckart-Young theorem).

The approximation has the form

$$A_k = U_k \Sigma_k V_k^T.$$

This is used in data compression, noise reduction, recommender systems, principal component analysis, and scientific computing.
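A minimal truncation sketch (the helper name is illustrative):

```python
import numpy as np

def best_rank_k(A, k):
    """Best rank-k approximation by truncating the SVD."""
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    return U[:, :k] @ np.diag(s[:k]) @ Vt[:k, :]

rng = np.random.default_rng(1)
A = rng.standard_normal((8, 6))
A2 = best_rank_k(A, 2)

print(np.linalg.matrix_rank(A2))   # 2
# In the spectral norm, the error equals the first discarded singular value.
s = np.linalg.svd(A, compute_uv=False)
print(np.isclose(np.linalg.norm(A - A2, 2), s[2]))  # True
```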

16.15 Factorizations and Solving Systems

Different factorizations solve different kinds of systems.

If

$$A = LU,$$

then

$$Ax = b$$

becomes

$$LUx = b.$$

Solve

$$Ly = b$$

then

$$Ux = y.$$

If

$$A = QR,$$

then least squares problems are simplified. The problem

$$\min_x \|Ax - b\|$$

becomes

$$\min_x \|QRx - b\|.$$

Since $Q$ preserves length, $\|QRx - b\|$ can be transformed into a triangular least squares problem involving $R$.

If

$$A = U \Sigma V^T,$$

then solving and approximating systems can be expressed in terms of singular values.
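A sketch of least squares through the SVD, using the pseudoinverse formula $x = V\Sigma^{+}U^T b$ (full column rank is assumed here, so every singular value can be inverted):

```python
import numpy as np

rng = np.random.default_rng(2)
A = rng.standard_normal((6, 3))
b = rng.standard_normal(6)

U, s, Vt = np.linalg.svd(A, full_matrices=False)
x = Vt.T @ ((U.T @ b) / s)   # invert the singular values

print(np.allclose(x, np.linalg.lstsq(A, b, rcond=None)[0]))  # True
```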

16.16 Factorizations and Determinants

Factorizations can simplify determinant computation.

If

$$A = LU$$

and $L$ has diagonal entries all equal to $1$, then

$$\det(A) = \det(U).$$

Since $U$ is triangular, $\det(U)$ is the product of its diagonal entries.

If row swaps are used,

$$PA = LU.$$

Then

$$\det(P)\det(A) = \det(L)\det(U).$$

Since $\det(P)$ is $1$ or $-1$, this tracks the sign change from row swaps.

For Cholesky,

$$A = LL^T$$

gives

$$\det(A) = \det(L)^2.$$

Since $L$ is triangular,

$$\det(A) = \left(\prod_i l_{ii}\right)^2.$$
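The same bookkeeping in code, using SciPy's $A = PLU$ convention so that $\det(A) = \det(P)\prod_i u_{ii}$:

```python
import numpy as np
from scipy.linalg import lu

A = np.array([[0.0, 1.0],
              [2.0, 3.0]])

P, L, U = lu(A)                    # A = P @ L @ U, L has unit diagonal
sign = np.linalg.det(P)            # +1 or -1, the row-swap sign
det_A = sign * np.prod(np.diag(U))

print(det_A, np.linalg.det(A))     # both -2.0
```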

16.17 Factorizations and Geometry

Factorizations often separate geometric actions.

For example, the SVD writes

$$A = U \Sigma V^T.$$

This means that $A$ acts in three stages:

| Stage | Action |
| --- | --- |
| $V^T$ | Rotate or reflect the input coordinates |
| $\Sigma$ | Scale along coordinate axes |
| $U$ | Rotate or reflect the output coordinates |

Thus every real matrix acts like a coordinate rotation or reflection, followed by axis scaling, followed by another coordinate rotation or reflection, allowing for collapse along directions with zero singular values.

This gives a geometric explanation of rank, conditioning, and low-rank approximation.

16.18 Choosing a Factorization

The correct factorization depends on the matrix and the problem.

| Problem or matrix type | Common choice |
| --- | --- |
| General square system | LU or PLU |
| Symmetric positive definite system | Cholesky |
| Least squares | QR or SVD |
| Rank-deficient or nearly rank-deficient problem | SVD |
| Eigenvalue problem | Schur or spectral decomposition |
| Symmetric or Hermitian matrix | Spectral decomposition |
| Data compression | Truncated SVD |
| Repeated solves with same matrix | Factor once, solve many times (see the sketch below) |

The choice is guided by structure, stability, cost, and the information needed.
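The last row of the table deserves a sketch of its own: with `scipy.linalg.lu_factor`, the expensive factorization is done once and each new right-hand side reuses it.

```python
import numpy as np
from scipy.linalg import lu_factor, lu_solve

rng = np.random.default_rng(3)
A = rng.standard_normal((100, 100))

lu_piv = lu_factor(A)              # factor once: O(n^3) work
for _ in range(5):                 # solve many times: O(n^2) each
    b = rng.standard_normal(100)
    x = lu_solve(lu_piv, b)
    assert np.allclose(A @ x, b)
```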

16.19 Exact and Numerical Factorizations

In exact algebra, a factorization is an identity.

In numerical computation, a computed factorization is approximate. Instead of

$$A = LU,$$

a computer may produce factors satisfying

$$A \approx LU.$$

The quality of a factorization is judged by error, stability, conditioning, and sensitivity to perturbations.

Pivoting is often used to improve numerical behavior. Orthogonal and unitary factors are valuable because they preserve norms and reduce error growth.

This is why QR and SVD are often preferred for least squares and rank-sensitive problems.

16.20 Common Mistakes

| Mistake | Correction |
| --- | --- |
| Assuming every matrix has an LU factorization without pivoting | Some matrices require row swaps |
| Using Cholesky for any symmetric matrix | Cholesky requires positive definiteness |
| Treating QR as only for square matrices | QR also applies to rectangular matrices |
| Assuming every square matrix is diagonalizable | Some square matrices lack enough eigenvectors |
| Confusing eigenvalues with singular values | Singular values are always nonnegative |
| Computing an inverse when a solve is enough | Use the factorization to solve triangular systems |
| Ignoring numerical stability | Pivoting, QR, and SVD often matter in floating point arithmetic |

16.21 Summary

A matrix factorization rewrites a matrix as a product of structured factors. The structure makes computation and theory easier.

The main factorizations are:

| Factorization | Form | Main use |
| --- | --- | --- |
| LU | $A = LU$ | Gaussian elimination, linear systems |
| PLU | $PA = LU$ | Linear systems with pivoting |
| Cholesky | $A = LL^T$ or $A = LL^*$ | Positive definite systems |
| QR | $A = QR$ | Least squares, orthogonalization |
| SVD | $A = U\Sigma V^*$ | Rank, conditioning, low-rank approximation |
| Eigenvalue | $A = PDP^{-1}$ | Diagonalizable square matrices |
| Spectral | $A = Q\Lambda Q^*$ | Hermitian or symmetric matrices |
| Schur | $A = QTQ^*$ | General square matrices and eigenvalues |
| Polar | $A = UP$ | Geometry of rotation and stretching |

Factorizations turn matrices into simpler pieces. They are the main bridge between linear algebra as a theory and linear algebra as a practical computational tool.