# Chapter 57. Singular Value Decomposition

The singular value decomposition, usually called the SVD, is one of the central factorizations in linear algebra. It applies to every matrix, including rectangular and rank-deficient matrices. Unlike eigenvalue decomposition, it does not require the matrix to be square or diagonalizable.

The SVD expresses a matrix as

$$
A=U\Sigma V^*,
$$

where:

| Factor | Property |
|---|---|
| \(U\) | Unitary or orthogonal |
| \(\Sigma\) | Diagonal with nonnegative entries |
| \(V\) | Unitary or orthogonal |

The diagonal entries of \(\Sigma\) are called the singular values of \(A\).

The SVD explains the geometry of a matrix completely. It shows that every linear transformation can be decomposed into:

1. a rotation or unitary transformation,
2. a coordinate scaling,
3. another rotation or unitary transformation.

This factorization is fundamental in numerical linear algebra, data analysis, signal processing, statistics, optimization, machine learning, and inverse problems.

## 57.1 Definition of the SVD

Let

$$
A\in\mathbb{C}^{m\times n}.
$$

A singular value decomposition of \(A\) is a factorization

$$
A=U\Sigma V^*,
$$

where:

- \(U\in\mathbb{C}^{m\times m}\) is unitary,
- \(V\in\mathbb{C}^{n\times n}\) is unitary,
- \(\Sigma\in\mathbb{R}^{m\times n}\) is diagonal in the rectangular sense.

The diagonal entries satisfy

$$
\sigma_1\ge\sigma_2\ge\cdots\ge\sigma_r>0,
$$

and all remaining diagonal entries are zero.

The numbers

$$
\sigma_1,\ldots,\sigma_r
$$

are the singular values of \(A\).

The number \(r\) equals the rank of \(A\).

In the real case, the matrices \(U\) and \(V\) are orthogonal, and the conjugate transpose becomes an ordinary transpose:

$$
A=U\Sigma V^T.
$$
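
As a concrete illustration, here is a minimal NumPy sketch of computing a full SVD; `np.linalg.svd` returns the singular values as a one-dimensional array and returns \(V^*\) (called `Vh`) rather than \(V\). The specific matrix is only an example.

```python
import numpy as np

# Any rectangular matrix has an SVD; take a small 3x2 example.
A = np.array([[1.0, 2.0],
              [3.0, 4.0],
              [5.0, 6.0]])

# Full SVD: U is 3x3, s holds the singular values, Vh is V^*.
U, s, Vh = np.linalg.svd(A, full_matrices=True)

# U and V are unitary (here orthogonal, since A is real).
assert np.allclose(U.conj().T @ U, np.eye(3))
assert np.allclose(Vh @ Vh.conj().T, np.eye(2))
print(s)  # nonincreasing, nonnegative singular values
```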

## 57.2 Rectangular Diagonal Matrices

The matrix \(\Sigma\) has the form

$$
\Sigma=
\begin{bmatrix}
\sigma_1 & 0 & \cdots & 0\\
0 & \sigma_2 & \cdots & 0\\
\vdots & \vdots & \ddots & \vdots\\
0 & 0 & \cdots & \sigma_r\\
0 & 0 & \cdots & 0
\end{bmatrix}.
$$

If \(m>n\), then \(\Sigma\) has extra zero rows. If \(n>m\), then it has extra zero columns.

For example,

$$
\Sigma=
\begin{bmatrix}
5&0\\
0&2\\
0&0
\end{bmatrix}
$$

is a \(3\times 2\) diagonal matrix in the rectangular sense.

The singular values are always nonnegative.
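
Numerical routines usually return the singular values as a vector rather than as the rectangular matrix \(\Sigma\). The sketch below, with an illustrative helper name `rectangular_sigma`, pads the vector back into the \(m\times n\) diagonal form described above.

```python
import numpy as np

def rectangular_sigma(s, m, n):
    """Place the singular values s on the diagonal of an m x n zero matrix."""
    Sigma = np.zeros((m, n))
    k = min(m, n)
    Sigma[:k, :k] = np.diag(s[:k])
    return Sigma

# The 3x2 example from the text: sigma_1 = 5, sigma_2 = 2.
print(rectangular_sigma(np.array([5.0, 2.0]), 3, 2))
```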

## 57.3 Singular Values from Eigenvalues

The singular values of \(A\) are the square roots of the eigenvalues of

$$
A^*A.
$$

Since

$$
A^*A
$$

is Hermitian positive semidefinite, all its eigenvalues are real and nonnegative.

Suppose

$$
A^*Av=\lambda v.
$$

Then

$$
v^*A^*Av =
\lambda v^*v.
$$

But

$$
v^*A^*Av =
(Av)^*(Av) =
\|Av\|_2^2\ge 0.
$$

Thus

$$
\lambda\ge 0.
$$

The singular values are defined by

$$
\sigma_i=\sqrt{\lambda_i}.
$$

Therefore:

| Matrix | Spectrum |
|---|---|
| \(A^*A\) | Eigenvalues \(\lambda_i\ge 0\) |
| \(A\) | Singular values \(\sigma_i=\sqrt{\lambda_i}\) |
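
This relationship is easy to check numerically. A small sketch, using `np.linalg.eigh` for the Hermitian matrix \(A^*A\) (its eigenvalues come back in ascending order, so they are reversed before comparing):

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((5, 3))

# Eigenvalues of A^* A are real and nonnegative; eigh returns them ascending.
eigvals = np.linalg.eigh(A.conj().T @ A)[0]
sigma_from_eig = np.sqrt(np.maximum(eigvals, 0.0))[::-1]

# They match the singular values computed directly.
assert np.allclose(sigma_from_eig, np.linalg.svd(A, compute_uv=False))
```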

## 57.4 Right Singular Vectors

The eigenvectors of

$$
A^*A
$$

are called the right singular vectors of \(A\).

Suppose

$$
A^*Av_i=\sigma_i^2v_i.
$$

Choose orthonormal eigenvectors

$$
v_1,\ldots,v_n.
$$

Collect them into the matrix

$$
V=
\begin{bmatrix}
|&|&&|\\
v_1&v_2&\cdots&v_n\\
|&|&&|
\end{bmatrix}.
$$

Then

$$
V^*V=I.
$$

Thus \(V\) is unitary.

The matrix \(V\) gives the input coordinate directions of the transformation.

## 57.5 Left Singular Vectors

For every nonzero singular value \(\sigma_i\), define

$$
u_i=\frac{Av_i}{\sigma_i}.
$$

Then

$$
Av_i=\sigma_i u_i.
$$

The vectors \(u_i\) are called left singular vectors.

They satisfy

$$
A^*u_i=\sigma_i v_i.
$$

Indeed,

$$
A^*u_i =
A^*\frac{Av_i}{\sigma_i} =
\frac{A^*Av_i}{\sigma_i} =
\frac{\sigma_i^2v_i}{\sigma_i} =
\sigma_i v_i.
$$

The vectors \(u_i\) are orthonormal:

$$
u_i^*u_j =
\frac{1}{\sigma_i\sigma_j}
(Av_i)^*(Av_j).
$$

Since

$$
(Av_i)^*(Av_j) =
v_i^*A^*Av_j =
\sigma_j^2v_i^*v_j,
$$

we obtain

$$
u_i^*u_j=\delta_{ij}.
$$
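
A short sketch verifying these relations numerically: it builds each \(u_i=Av_i/\sigma_i\) from a reference SVD and checks both orthonormality and \(A^*u_i=\sigma_i v_i\).

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.standard_normal((4, 3))

# Right singular vectors and singular values from a reference SVD.
_, s, Vh = np.linalg.svd(A, full_matrices=False)
V = Vh.conj().T

# u_i = A v_i / sigma_i; broadcasting divides column i by sigma_i.
U_built = A @ V / s
assert np.allclose(U_built.conj().T @ U_built, np.eye(len(s)))  # orthonormal
assert np.allclose(A.conj().T @ U_built, V * s)                 # A^* u_i = sigma_i v_i
```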

## 57.6 Constructing the SVD

Let

$$
A^*A=V\Lambda V^*
$$

be a unitary diagonalization.

The diagonal matrix \(\Lambda\) contains the eigenvalues

$$
\lambda_i=\sigma_i^2.
$$

Define

$$
\Sigma=
\operatorname{diag}(\sigma_1,\ldots,\sigma_r),
$$

padded with zero rows or columns so that \(\Sigma\in\mathbb{R}^{m\times n}\).

For each nonzero singular value, define

$$
u_i=\frac{Av_i}{\sigma_i}.
$$

Extend the orthonormal set

$$
u_1,\ldots,u_r
$$

to an orthonormal basis of \(\mathbb{C}^m\), and let

$$
U=
\begin{bmatrix}
|&|&&|\\
u_1&u_2&\cdots&u_m\\
|&|&&|
\end{bmatrix}.
$$

Then

$$
A=U\Sigma V^*.
$$

Thus every matrix has an SVD.
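
The construction translates directly into code. A minimal sketch, assuming \(A\) has full column rank so every \(\sigma_i\) is positive and no basis extension of the \(u_i\) is needed:

```python
import numpy as np

rng = np.random.default_rng(2)
A = rng.standard_normal((5, 3))        # full column rank with probability 1

# Unitary diagonalization of A^* A; eigh returns ascending eigenvalues.
lam, V = np.linalg.eigh(A.conj().T @ A)
order = np.argsort(lam)[::-1]          # reorder so sigma_1 >= sigma_2 >= ...
lam, V = lam[order], V[:, order]

sigma = np.sqrt(lam)
U = A @ V / sigma                      # columns u_i = A v_i / sigma_i

# Reconstruction with the r = 3 columns built so far (compact form);
# extending U to a full 5x5 unitary would not change the product.
assert np.allclose(A, U @ np.diag(sigma) @ V.conj().T)
```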

## 57.7 Geometric Interpretation

The SVD describes the action of a matrix geometrically.

The transformation

$$
x\mapsto Ax
$$

can be decomposed into three steps:

1. apply \(V^*\),
2. scale coordinates by \(\Sigma\),
3. apply \(U\).

The matrices \(U\) and \(V^*\) are unitary or orthogonal, so they preserve length and angle.

The matrix \(\Sigma\) stretches space independently along orthogonal coordinate directions.

Thus the singular values measure how strongly the matrix stretches different directions.
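
The three-step picture can be checked directly: applying \(V^*\), then \(\Sigma\), then \(U\) to a vector reproduces \(Ax\). A small sketch:

```python
import numpy as np

rng = np.random.default_rng(3)
A = rng.standard_normal((3, 3))
U, s, Vh = np.linalg.svd(A)

x = rng.standard_normal(3)

# Step 1: rotate/reflect, step 2: scale each coordinate, step 3: rotate/reflect.
y = Vh @ x          # apply V^*
y = s * y           # apply Sigma (diagonal scaling)
y = U @ y           # apply U

assert np.allclose(y, A @ x)
```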

## 57.8 Unit Sphere Interpretation

Consider the unit sphere

$$
\{x:\|x\|_2=1\}.
$$

Under the transformation \(x\mapsto Ax\), the sphere becomes an ellipsoid.

The principal axes of this ellipsoid are:

| Quantity | Meaning |
|---|---|
| \(v_i\) | Input direction |
| \(u_i\) | Output direction |
| \(\sigma_i\) | Stretch factor |

Specifically,

$$
Av_i=\sigma_i u_i.
$$

Thus the matrix stretches the direction \(v_i\) by the factor \(\sigma_i\), then rotates it into the direction \(u_i\).

This geometric interpretation is one of the most important meanings of the SVD.

## 57.9 Rank and Singular Values

The rank of \(A\) equals the number of nonzero singular values.

Indeed,

$$
Av_i=\sigma_i u_i.
$$

If

$$
\sigma_i=0,
$$

then

$$
Av_i=0,
$$

so \(v_i\) lies in the null space of \(A\).

Thus:

| Quantity | Equals |
|---|---|
| Rank of \(A\) | Number of positive singular values |
| Nullity of \(A\) | Number of zero singular values |

The SVD therefore gives complete rank information.
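
In floating-point arithmetic, "zero" singular values appear as tiny roundoff-level numbers, so the numerical rank is taken as the number of singular values above a tolerance. A sketch; the tolerance formula below is one common convention, not a universal rule.

```python
import numpy as np

rng = np.random.default_rng(4)
# A 4x3 matrix of rank 2: the third column is a combination of the first two.
B = rng.standard_normal((4, 2))
A = np.column_stack([B, B @ np.array([1.0, -2.0])])

s = np.linalg.svd(A, compute_uv=False)
tol = max(A.shape) * np.finfo(float).eps * s[0]   # one common tolerance choice
print(s)                                          # third value is roundoff-small, not exactly 0
print(int(np.sum(s > tol)))                       # numerical rank: 2
print(np.linalg.matrix_rank(A))                   # library routine, also SVD-based: 2
```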

## 57.10 Compact SVD

If \(A\) has rank \(r\), one may keep only the nonzero singular values.

Define:

$$
U_r=
\begin{bmatrix}
u_1&\cdots&u_r
\end{bmatrix},
$$

$$
V_r=
\begin{bmatrix}
v_1&\cdots&v_r
\end{bmatrix},
$$

and

$$
\Sigma_r=
\operatorname{diag}(\sigma_1,\ldots,\sigma_r).
$$

Then

$$
A=U_r\Sigma_rV_r^*.
$$

This is the compact SVD; keeping all \(\min(m,n)\) columns instead of only the first \(r\) gives the closely related thin SVD.

It removes directions associated with zero singular values.
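
In NumPy, `full_matrices=False` returns this reduced form with \(\min(m,n)\) columns; dropping any columns belonging to zero singular values afterward gives the compact form. A sketch:

```python
import numpy as np

rng = np.random.default_rng(5)
A = rng.standard_normal((6, 3))

# Thin SVD: U_r is 6x3, Vh_r is 3x3, only min(m, n) = 3 columns kept.
U_r, s, Vh_r = np.linalg.svd(A, full_matrices=False)

print(U_r.shape, s.shape, Vh_r.shape)           # (6, 3) (3,) (3, 3)
assert np.allclose(A, U_r @ np.diag(s) @ Vh_r)  # A = U_r Sigma_r V_r^*
```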

## 57.11 Outer Product Expansion

The SVD can be written as

$$
A =
\sum_{i=1}^r
\sigma_i u_i v_i^*.
$$

Each term

$$
u_i v_i^*
$$

is a rank-one matrix.

Thus every matrix is a sum of rank-one matrices weighted by singular values.

This decomposition is fundamental in low-rank approximation and data compression.
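
A sketch that rebuilds a matrix term by term from its rank-one pieces (the rows of `Vh` are already the vectors \(v_i^*\)):

```python
import numpy as np

rng = np.random.default_rng(6)
A = rng.standard_normal((4, 3))
U, s, Vh = np.linalg.svd(A, full_matrices=False)

# Sum of rank-one matrices sigma_i * u_i v_i^*.
A_rebuilt = sum(s[i] * np.outer(U[:, i], Vh[i, :]) for i in range(len(s)))
assert np.allclose(A, A_rebuilt)
```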

## 57.12 Best Rank-\(k\) Approximation

One of the most important theorems about the SVD is the Eckart-Young theorem.

Suppose

$$
A =
\sum_{i=1}^r
\sigma_i u_i v_i^*.
$$

Define

$$
A_k =
\sum_{i=1}^k
\sigma_i u_i v_i^*.
$$

Then \(A_k\) is the best rank-\(k\) approximation to \(A\) in both the spectral norm and Frobenius norm.

The approximation error is

$$
\|A-A_k\|_2=\sigma_{k+1}.
$$

Thus truncating the SVD gives the optimal low-rank approximation.

This result is central in compression, PCA, latent semantic analysis, and recommender systems.
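
Both the truncation and the error identity are easy to verify numerically. A sketch:

```python
import numpy as np

rng = np.random.default_rng(7)
A = rng.standard_normal((8, 6))
U, s, Vh = np.linalg.svd(A, full_matrices=False)

k = 2
A_k = U[:, :k] @ np.diag(s[:k]) @ Vh[:k, :]   # best rank-k approximation

# Spectral-norm error equals the first discarded singular value.
assert np.isclose(np.linalg.norm(A - A_k, 2), s[k])
# Frobenius-norm error equals the root of the sum of the discarded sigma_i^2.
assert np.isclose(np.linalg.norm(A - A_k, 'fro'), np.sqrt(np.sum(s[k:] ** 2)))
```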

## 57.13 Example

Consider

$$
A=
\begin{bmatrix}
3&0\\
0&1
\end{bmatrix}.
$$

Then

$$
A^TA=
\begin{bmatrix}
9&0\\
0&1
\end{bmatrix}.
$$

The eigenvalues are

$$
9
\quad
\text{and}
\quad
1.
$$

Thus the singular values are

$$
\sigma_1=3,
\qquad
\sigma_2=1.
$$

The standard basis vectors are eigenvectors of \(A^TA\), so

$$
V=I,
\qquad
U=I.
$$

Hence

$$
A=I
\begin{bmatrix}
3&0\\
0&1
\end{bmatrix}
I.
$$

This matrix stretches the \(x\)-direction by \(3\) and the \(y\)-direction by \(1\).

## 57.14 Pseudoinverse

The SVD gives the Moore-Penrose pseudoinverse.

Suppose

$$
A=U\Sigma V^*.
$$

Define

$$
\Sigma^+
$$

by replacing each nonzero singular value \(\sigma_i\) by

$$
\frac{1}{\sigma_i}
$$

and transposing the rectangular structure.

Then

$$
A^+=V\Sigma^+U^*.
$$

The pseudoinverse gives:

| Problem | Solution |
|---|---|
| Least squares | \(x=A^+b\) |
| Minimum norm solution | \(x=A^+b\) |
| Projection | \(AA^+\) |

The pseudoinverse exists for every matrix.
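
A sketch that constructs \(\Sigma^+\) explicitly and compares the result with `np.linalg.pinv`; it assumes all singular values of the example matrix are nonzero.

```python
import numpy as np

rng = np.random.default_rng(8)
A = rng.standard_normal((5, 3))

U, s, Vh = np.linalg.svd(A, full_matrices=True)

# Sigma^+ is n x m: invert the nonzero singular values and transpose the shape.
Sigma_plus = np.zeros((A.shape[1], A.shape[0]))
Sigma_plus[:len(s), :len(s)] = np.diag(1.0 / s)   # assumes every sigma_i > 0

A_plus = Vh.conj().T @ Sigma_plus @ U.conj().T    # A^+ = V Sigma^+ U^*
assert np.allclose(A_plus, np.linalg.pinv(A))
```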

## 57.15 Least Squares and the SVD

Suppose

$$
A=U\Sigma V^*.
$$

The least squares problem

$$
\min_x\|Ax-b\|_2
$$

becomes

$$
\min_x\|\Sigma V^*x-U^*b\|_2.
$$

Since \(U\) and \(V\) are unitary, they preserve norms.

The problem reduces to a diagonal system involving singular values.

If all singular values are positive, the least squares solution is

$$
x =
V\Sigma^{-1}U^*b.
$$

This equals

$$
A^+b.
$$

The SVD therefore solves least squares problems robustly, even for ill-conditioned or rank-deficient matrices.
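
A sketch of the SVD-based least squares solution for a full-column-rank example, compared against `np.linalg.lstsq`:

```python
import numpy as np

rng = np.random.default_rng(9)
A = rng.standard_normal((10, 4))     # overdetermined, full column rank
b = rng.standard_normal(10)

U, s, Vh = np.linalg.svd(A, full_matrices=False)

# x = V Sigma^{-1} U^* b, valid here because all singular values are positive.
x = Vh.conj().T @ ((U.conj().T @ b) / s)

x_ref = np.linalg.lstsq(A, b, rcond=None)[0]
assert np.allclose(x, x_ref)
```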

## 57.16 Condition Number

The condition number of a full-rank matrix in the Euclidean norm is

$$
\kappa_2(A) =
\frac{\sigma_1}{\sigma_r},
$$

where:

- \(\sigma_1\) is the largest singular value,
- \(\sigma_r\) is the smallest positive singular value.

A large condition number indicates near singularity and numerical instability.

The singular values therefore measure how close a matrix is to losing rank.
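
A sketch comparing \(\sigma_1/\sigma_r\) with `np.linalg.cond`, which uses the \(2\)-norm by default:

```python
import numpy as np

rng = np.random.default_rng(10)
A = rng.standard_normal((4, 4))

s = np.linalg.svd(A, compute_uv=False)
kappa = s[0] / s[-1]                          # sigma_1 / sigma_r

assert np.isclose(kappa, np.linalg.cond(A))   # default is the 2-norm condition number
print(kappa)
```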

## 57.17 Spectral Norm

The operator \(2\)-norm of a matrix is

$$
\|A\|_2 =
\max_{\|x\|_2=1}\|Ax\|_2.
$$

This equals the largest singular value:

$$
\|A\|_2=\sigma_1.
$$

Thus the largest singular value measures the maximum stretching factor of the transformation.

## 57.18 Frobenius Norm

The Frobenius norm is

$$
\|A\|_F =
\sqrt{
\sum_{i,j}|a_{ij}|^2
}.
$$

Using the SVD,

$$
\|A\|_F^2 =
\sum_{i=1}^r \sigma_i^2.
$$

Thus the singular values completely determine the Frobenius norm.
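
Both norm identities can be confirmed in a few lines. A sketch:

```python
import numpy as np

rng = np.random.default_rng(11)
A = rng.standard_normal((5, 4))
s = np.linalg.svd(A, compute_uv=False)

assert np.isclose(np.linalg.norm(A, 2), s[0])                        # spectral norm = sigma_1
assert np.isclose(np.linalg.norm(A, 'fro'), np.sqrt(np.sum(s**2)))   # Frobenius norm
```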

## 57.19 Principal Component Analysis

Principal component analysis, usually called PCA, is closely related to the SVD.

Suppose a data matrix \(X\) contains centered observations.

The sample covariance matrix is proportional to

$$
X^TX.
$$

The eigenvectors of \(X^TX\) are principal directions.

If

$$
X=U\Sigma V^T,
$$

then:

| Quantity | PCA interpretation |
|---|---|
| \(v_i\) | Principal direction |
| \(\sigma_i^2\) | Variance magnitude |
| \(u_i\sigma_i\) | Principal component coordinates |

Thus PCA is essentially the SVD of the centered data matrix.
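
A sketch of PCA via the SVD of a centered data matrix, with rows as observations and columns as features; the anisotropic scaling and the \(1/(n-1)\) variance normalization are illustrative choices, not requirements of the method.

```python
import numpy as np

rng = np.random.default_rng(12)
X = rng.standard_normal((100, 3)) @ np.diag([3.0, 1.0, 0.2])   # anisotropic data
X = X - X.mean(axis=0)                                         # center the columns

U, s, Vh = np.linalg.svd(X, full_matrices=False)

principal_directions = Vh                   # rows are the v_i
explained_variance = s**2 / (X.shape[0] - 1)
scores = U * s                              # principal component coordinates u_i sigma_i

assert np.allclose(scores @ Vh, X)          # U Sigma V^T reproduces the centered data
print(explained_variance)
```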

## 57.20 Numerical Importance

The SVD is one of the most stable and informative matrix factorizations.

It is used for:

| Application | Purpose |
|---|---|
| Least squares | Stable solution |
| Rank estimation | Detect numerical rank |
| Compression | Low-rank approximation |
| PCA | Dimension reduction |
| Signal processing | Noise filtering |
| Machine learning | Feature extraction |
| Inverse problems | Regularization |
| Numerical analysis | Conditioning analysis |

Unlike many decompositions, the SVD exists for every matrix and reveals complete geometric information about the transformation.

## 57.21 Summary

Every matrix admits a singular value decomposition:

$$
A=U\Sigma V^*.
$$

The matrices \(U\) and \(V\) are unitary or orthogonal. The matrix \(\Sigma\) contains the nonnegative singular values.

The singular values are the square roots of the eigenvalues of

$$
A^*A.
$$

The SVD describes the action of a matrix geometrically as:

1. a unitary transformation,
2. a coordinate scaling,
3. another unitary transformation.

It reveals rank, conditioning, principal directions, and optimal low-rank approximations. It is one of the most powerful tools in linear algebra and numerical computation.
