Chapter 63. Diagonalization

Diagonalization is the process of replacing a matrix by a diagonal matrix through a change of basis.

A diagonal matrix is simple because it acts independently on each coordinate. If a matrix can be diagonalized, then its action becomes easy to describe, its powers become easy to compute, and its long-term behavior becomes easier to analyze.

The central idea is this: a matrix is diagonalizable when the space has a basis made of eigenvectors. In that basis, the matrix only rescales each coordinate. An $n \times n$ matrix is diagonalizable exactly when it has $n$ linearly independent eigenvectors.

63.1 Diagonal Matrices

A diagonal matrix has zero entries outside the main diagonal:

$$D = \begin{bmatrix} d_1 & 0 & \cdots & 0 \\ 0 & d_2 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & d_n \end{bmatrix}.$$

For a vector

$$x = \begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{bmatrix},$$

we have

$$Dx = \begin{bmatrix} d_1 x_1 \\ d_2 x_2 \\ \vdots \\ d_n x_n \end{bmatrix}.$$

Thus each coordinate is scaled independently.

The first coordinate is multiplied by $d_1$. The second coordinate is multiplied by $d_2$. In general, the $i$-th coordinate is multiplied by $d_i$.

Diagonal matrices are the simplest square matrices after scalar multiples of the identity.
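
The coordinate-wise action is easy to verify numerically. A minimal NumPy sketch (the particular entries are arbitrary choices for illustration):

```python
import numpy as np

# Diagonal entries d_1, d_2, d_3 (arbitrary illustrative values).
d = np.array([2.0, -1.0, 0.5])
D = np.diag(d)

x = np.array([1.0, 3.0, 4.0])

# Dx multiplies the i-th coordinate of x by d_i.
Dx = D @ x

print(Dx)  # identical to the elementwise product d * x
```

The matrix-vector product `D @ x` agrees entry by entry with the elementwise product `d * x`, which is exactly the statement that each coordinate is scaled independently.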

63.2 Similarity

Two square matrices $A$ and $B$ are similar if there exists an invertible matrix $P$ such that

$$B = P^{-1} A P.$$

Similarity means that $A$ and $B$ represent the same linear transformation written in different bases.

The matrix $P$ changes coordinates from one basis to another. The matrix $P^{-1}$ changes them back.

A diagonalization of $A$ is a similarity relation in which $B$ is diagonal.

Thus $A$ is diagonalizable if there exists an invertible matrix $P$ and a diagonal matrix $D$ such that

$$P^{-1} A P = D.$$

Equivalently,

$$A = P D P^{-1}.$$

A square matrix is called diagonalizable when it is similar to a diagonal matrix.

63.3 Definition

Let $A$ be an $n \times n$ matrix over a field $F$.

The matrix $A$ is diagonalizable over $F$ if there exist an invertible $n \times n$ matrix $P$ and a diagonal $n \times n$ matrix $D$, both with entries in $F$, such that

$$A = P D P^{-1}.$$

Equivalently,

$$P^{-1} A P = D.$$

The diagonal entries of $D$ are eigenvalues of $A$. The columns of $P$ are corresponding eigenvectors of $A$.

The field matters. A real matrix may fail to diagonalize over $\mathbb{R}$ yet diagonalize over $\mathbb{C}$.

63.4 Why Eigenvectors Produce Diagonalization

Suppose $A$ has $n$ linearly independent eigenvectors

$$v_1, v_2, \ldots, v_n.$$

Suppose their eigenvalues are

$$\lambda_1, \lambda_2, \ldots, \lambda_n,$$

so that

$$A v_i = \lambda_i v_i$$

for each $i$.

Form the matrix

$$P = \begin{bmatrix} | & | & & | \\ v_1 & v_2 & \cdots & v_n \\ | & | & & | \end{bmatrix}.$$

Since the vectors $v_1, \ldots, v_n$ are linearly independent, $P$ is invertible.

Now compute $AP$. Multiplying $A$ by $P$ applies $A$ to each column of $P$:

$$AP = \begin{bmatrix} | & | & & | \\ Av_1 & Av_2 & \cdots & Av_n \\ | & | & & | \end{bmatrix}.$$

Using $Av_i = \lambda_i v_i$,

$$AP = \begin{bmatrix} | & | & & | \\ \lambda_1 v_1 & \lambda_2 v_2 & \cdots & \lambda_n v_n \\ | & | & & | \end{bmatrix}.$$

This can be written as

$$AP = PD,$$

where

$$D = \begin{bmatrix} \lambda_1 & 0 & \cdots & 0 \\ 0 & \lambda_2 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & \lambda_n \end{bmatrix}.$$

Since $P$ is invertible,

$$A = P D P^{-1}.$$

This is the diagonalization of $A$.

63.5 The Diagonalization Theorem

An $n \times n$ matrix $A$ is diagonalizable if and only if $A$ has $n$ linearly independent eigenvectors.

If

$$v_1, v_2, \ldots, v_n$$

are linearly independent eigenvectors with corresponding eigenvalues

$$\lambda_1, \lambda_2, \ldots, \lambda_n,$$

then

$$P = \begin{bmatrix} | & | & & | \\ v_1 & v_2 & \cdots & v_n \\ | & | & & | \end{bmatrix}$$

and

$$D = \begin{bmatrix} \lambda_1 & 0 & \cdots & 0 \\ 0 & \lambda_2 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & \lambda_n \end{bmatrix}$$

satisfy

$$A = P D P^{-1}.$$

The eigenvectors must appear in $P$ in the same order as their eigenvalues appear in $D$. This is the standard diagonalization theorem.

63.6 How to Diagonalize a Matrix

To diagonalize an $n \times n$ matrix $A$, use the following procedure.

| Step | Operation |
| --- | --- |
| 1 | Find the eigenvalues of $A$. |
| 2 | Find a basis for each eigenspace. |
| 3 | Count the total number of independent eigenvectors. |
| 4 | If the total is $n$, place these eigenvectors as columns of $P$. |
| 5 | Place the matching eigenvalues on the diagonal of $D$. |
| 6 | Write $A = P D P^{-1}$. |

If fewer than $n$ independent eigenvectors are available, then $A$ cannot be diagonalized.
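
The procedure above can be sketched numerically with NumPy. This is an illustrative sketch, not a robust implementation: `np.linalg.eig` works over $\mathbb{C}$, and the independence count in step 3 uses a rank test with a tolerance, so borderline cases need care.

```python
import numpy as np

def diagonalize(A, tol=1e-10):
    """Return (P, D) with A = P D P^{-1}, or None if A appears defective."""
    eigenvalues, P = np.linalg.eig(A)          # steps 1-2: eigenvalues and eigenvectors
    n = A.shape[0]
    # Step 3: count independent eigenvectors via the rank of P.
    if np.linalg.matrix_rank(P, tol=tol) < n:
        return None                            # fewer than n: not diagonalizable
    D = np.diag(eigenvalues)                   # step 5: eigenvalues in matching order
    return P, D                                # step 6: A = P D P^{-1}

A = np.array([[2.0, 1.0], [1.0, 2.0]])
P, D = diagonalize(A)
print(np.allclose(A, P @ D @ np.linalg.inv(P)))  # True
```

For a defective matrix such as $\begin{bmatrix} 2 & 1 \\ 0 & 2 \end{bmatrix}$, the computed eigenvector matrix is numerically rank-deficient and the sketch returns `None`.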

63.7 Example: A Diagonalizable Matrix

Let

$$A = \begin{bmatrix} 2 & 1 \\ 1 & 2 \end{bmatrix}.$$

The characteristic polynomial is

$$\det(A - \lambda I) = \det \begin{bmatrix} 2-\lambda & 1 \\ 1 & 2-\lambda \end{bmatrix}.$$

Thus

$$\det(A - \lambda I) = (2-\lambda)^2 - 1.$$

Expand:

$$(2-\lambda)^2 - 1 = \lambda^2 - 4\lambda + 3.$$

Factor:

$$\lambda^2 - 4\lambda + 3 = (\lambda - 1)(\lambda - 3).$$

The eigenvalues are

$$\lambda = 3 \qquad \text{and} \qquad \lambda = 1.$$

For $\lambda = 3$,

$$A - 3I = \begin{bmatrix} -1 & 1 \\ 1 & -1 \end{bmatrix}.$$

Solving

$$(A - 3I)v = 0$$

gives

$$v_1 = \begin{bmatrix} 1 \\ 1 \end{bmatrix}.$$

For $\lambda = 1$,

$$A - I = \begin{bmatrix} 1 & 1 \\ 1 & 1 \end{bmatrix}.$$

Solving

$$(A - I)v = 0$$

gives

$$v_2 = \begin{bmatrix} 1 \\ -1 \end{bmatrix}.$$

The two eigenvectors are linearly independent, so $A$ is diagonalizable.

Set

$$P = \begin{bmatrix} 1 & 1 \\ 1 & -1 \end{bmatrix}$$

and

$$D = \begin{bmatrix} 3 & 0 \\ 0 & 1 \end{bmatrix}.$$

Then

$$A = P D P^{-1}.$$

63.8 Checking the Diagonalization

We can check the relation by verifying

$$AP = PD.$$

Compute

$$AP = \begin{bmatrix} 2 & 1 \\ 1 & 2 \end{bmatrix} \begin{bmatrix} 1 & 1 \\ 1 & -1 \end{bmatrix}.$$

This gives

$$AP = \begin{bmatrix} 3 & 1 \\ 3 & -1 \end{bmatrix}.$$

Now compute

$$PD = \begin{bmatrix} 1 & 1 \\ 1 & -1 \end{bmatrix} \begin{bmatrix} 3 & 0 \\ 0 & 1 \end{bmatrix}.$$

This gives

$$PD = \begin{bmatrix} 3 & 1 \\ 3 & -1 \end{bmatrix}.$$

Thus

$$AP = PD.$$

Since $P$ is invertible,

$$A = P D P^{-1}.$$
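
The same check can be run numerically. A short NumPy verification of the worked example:

```python
import numpy as np

A = np.array([[2.0, 1.0], [1.0, 2.0]])
P = np.array([[1.0, 1.0], [1.0, -1.0]])
D = np.diag([3.0, 1.0])

# Both sides of the defining relation agree entry by entry.
print(A @ P)   # [[3, 1], [3, -1]]
print(P @ D)   # [[3, 1], [3, -1]]

# Since P is invertible, A = P D P^{-1} follows.
print(np.allclose(A, P @ D @ np.linalg.inv(P)))  # True
```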

63.9 Distinct Eigenvalues

If an $n \times n$ matrix has $n$ distinct eigenvalues, then it is diagonalizable.

This follows because eigenvectors corresponding to distinct eigenvalues are linearly independent.

Thus distinct eigenvalues give a simple sufficient condition for diagonalizability.

The converse is false. A matrix may be diagonalizable even when some eigenvalues are repeated.

For example,

$$A = \begin{bmatrix} 2 & 0 \\ 0 & 2 \end{bmatrix}$$

has only one distinct eigenvalue, namely $2$. Yet every nonzero vector is an eigenvector, so the matrix is diagonalizable.

63.10 Repeated Eigenvalues

Repeated eigenvalues require more care.

Suppose $\lambda$ is an eigenvalue with algebraic multiplicity $m$. The eigenspace $E_\lambda$ may have dimension less than or equal to $m$, but never greater than $m$.

For diagonalization, the sum of all eigenspace dimensions must equal $n$:

$$\sum_{\lambda} \dim E_\lambda = n.$$

If this equality holds, then the matrix is diagonalizable.

If it fails, then the matrix is defective.

This criterion is one of the most useful tests for diagonalizability: an $n \times n$ matrix is diagonalizable exactly when the dimensions of its eigenspaces add to $n$.

63.11 Example: A Repeated Eigenvalue That Diagonalizes

Let

$$A = \begin{bmatrix} 4 & 0 & 0 \\ 0 & 4 & 0 \\ 0 & 0 & 7 \end{bmatrix}.$$

The eigenvalues are

$$4, \quad 4, \quad 7.$$

The eigenspace for $\lambda = 4$ is

$$E_4 = \operatorname{span} \left\{ \begin{bmatrix} 1 \\ 0 \\ 0 \end{bmatrix}, \begin{bmatrix} 0 \\ 1 \\ 0 \end{bmatrix} \right\}.$$

The eigenspace for $\lambda = 7$ is

$$E_7 = \operatorname{span} \left\{ \begin{bmatrix} 0 \\ 0 \\ 1 \end{bmatrix} \right\}.$$

The dimensions add to

$$\dim E_4 + \dim E_7 = 2 + 1 = 3.$$

Therefore the matrix is diagonalizable.

In fact, it is already diagonal.

63.12 Example: A Repeated Eigenvalue That Does Not Diagonalize

Let

$$B = \begin{bmatrix} 2 & 1 \\ 0 & 2 \end{bmatrix}.$$

The characteristic polynomial is

$$(2-\lambda)^2.$$

Thus $\lambda = 2$ has algebraic multiplicity $2$.

Now compute the eigenspace:

$$B - 2I = \begin{bmatrix} 0 & 1 \\ 0 & 0 \end{bmatrix}.$$

Solving

$$(B - 2I)v = 0$$

for $v = \begin{bmatrix} x \\ y \end{bmatrix}$ gives

$$y = 0.$$

Therefore

$$E_2 = \operatorname{span} \left\{ \begin{bmatrix} 1 \\ 0 \end{bmatrix} \right\}.$$

The eigenspace has dimension $1$, but the matrix is $2 \times 2$. There are not enough independent eigenvectors to form a basis.

Therefore $B$ is not diagonalizable.
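
The rank-nullity theorem gives a quick numerical check of the geometric multiplicity: $\dim E_\lambda = n - \operatorname{rank}(B - \lambda I)$. A NumPy sketch for this example:

```python
import numpy as np

B = np.array([[2.0, 1.0], [0.0, 2.0]])
lam = 2.0
n = B.shape[0]

# dim E_lambda = n - rank(B - lambda I), by rank-nullity.
geometric_multiplicity = n - np.linalg.matrix_rank(B - lam * np.eye(n))

print(geometric_multiplicity)  # 1, which is less than n = 2
```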

63.13 Diagonalization as a Change of Coordinates

Diagonalization is best understood as a change of coordinates.

In the standard basis, the matrix $A$ may mix coordinates. In an eigenbasis, it does not mix them.

Suppose

$$A = P D P^{-1}.$$

To compute $Ax$, one may view the operation in three stages:

| Stage | Operation | Meaning |
| --- | --- | --- |
| 1 | $P^{-1}x$ | Express $x$ in the eigenvector basis |
| 2 | $D(P^{-1}x)$ | Scale each eigen-coordinate |
| 3 | $P[D(P^{-1}x)]$ | Return to the original basis |

Thus

$$Ax = P D P^{-1} x.$$

The matrix $P^{-1}$ changes into eigen-coordinates. The diagonal matrix $D$ performs independent scaling. The matrix $P$ changes back.

63.14 Powers of a Diagonalizable Matrix

One major use of diagonalization is computing powers.

If

$$A = P D P^{-1},$$

then

$$A^2 = (P D P^{-1})(P D P^{-1}).$$

Since

$$P^{-1} P = I,$$

we get

$$A^2 = P D^2 P^{-1}.$$

By induction,

$$A^k = P D^k P^{-1}.$$

For a diagonal matrix,

$$D^k = \begin{bmatrix} \lambda_1^k & 0 & \cdots & 0 \\ 0 & \lambda_2^k & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & \lambda_n^k \end{bmatrix}.$$

Thus powers of $A$ reduce to powers of its eigenvalues. This is a standard application of diagonalization.

63.15 Example: Computing Powers

Let

$$A = \begin{bmatrix} 2 & 1 \\ 1 & 2 \end{bmatrix}.$$

We found

$$A = P D P^{-1},$$

where

$$P = \begin{bmatrix} 1 & 1 \\ 1 & -1 \end{bmatrix}, \qquad D = \begin{bmatrix} 3 & 0 \\ 0 & 1 \end{bmatrix}.$$

The inverse of $P$ is

$$P^{-1} = \frac{1}{2} \begin{bmatrix} 1 & 1 \\ 1 & -1 \end{bmatrix}.$$

Therefore

$$A^k = P \begin{bmatrix} 3^k & 0 \\ 0 & 1 \end{bmatrix} P^{-1}.$$

Compute:

$$A^k = \begin{bmatrix} 1 & 1 \\ 1 & -1 \end{bmatrix} \begin{bmatrix} 3^k & 0 \\ 0 & 1 \end{bmatrix} \cdot \frac{1}{2} \begin{bmatrix} 1 & 1 \\ 1 & -1 \end{bmatrix}.$$

First multiply the first two matrices:

$$\begin{bmatrix} 3^k & 1 \\ 3^k & -1 \end{bmatrix}.$$

Then multiply by $P^{-1}$:

$$A^k = \frac{1}{2} \begin{bmatrix} 3^k + 1 & 3^k - 1 \\ 3^k - 1 & 3^k + 1 \end{bmatrix}.$$

This gives a closed formula for every positive integer $k$.
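
The closed formula can be checked against direct repeated multiplication. A NumPy sketch:

```python
import numpy as np

A = np.array([[2.0, 1.0], [1.0, 2.0]])

def A_power_closed(k):
    """Closed formula for A^k obtained from the diagonalization."""
    return 0.5 * np.array([[3.0**k + 1, 3.0**k - 1],
                           [3.0**k - 1, 3.0**k + 1]])

# Compare with direct matrix powers for several exponents.
ok = all(np.allclose(A_power_closed(k), np.linalg.matrix_power(A, k))
         for k in range(1, 10))
print(ok)  # True
```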

63.16 Matrix Functions

Diagonalization also simplifies functions of matrices.

If a function $f$ can be applied to the eigenvalues, then for a diagonalizable matrix

$$A = P D P^{-1},$$

we define

$$f(A) = P f(D) P^{-1},$$

where

$$f(D) = \begin{bmatrix} f(\lambda_1) & 0 & \cdots & 0 \\ 0 & f(\lambda_2) & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & f(\lambda_n) \end{bmatrix}.$$

Important examples include:

| Matrix function | Use |
| --- | --- |
| $A^k$ | Discrete dynamical systems |
| $A^{-1}$ | Solving linear systems |
| $e^A$ | Differential equations |
| $\sqrt{A}$ | Matrix analysis |
| $\log A$ | Lie theory and numerical analysis |

For example, if

$$A = P D P^{-1},$$

then

$$e^A = P e^D P^{-1}.$$

Since $e^D$ is diagonal with entries $e^{\lambda_i}$, the computation becomes much simpler.

63.17 Diagonalization and Difference Equations

Consider a discrete dynamical system

$$x_{k+1} = A x_k.$$

Then

$$x_k = A^k x_0.$$

If $A$ is diagonalizable, then

$$x_k = P D^k P^{-1} x_0.$$

The behavior of $x_k$ is controlled by the powers of the eigenvalues.

If $|\lambda_i| < 1$, the corresponding component decays.

If $|\lambda_i| > 1$, the corresponding component grows.

If $|\lambda_i| = 1$, the corresponding component persists or oscillates.

Diagonalization therefore separates the system into independent modes.
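
A small simulation illustrates the modes. The matrix below (chosen for illustration) has eigenvalues $1$ and $0.5$ with eigenvectors $(1,1)$ and $(1,-1)$, so the $(1,-1)$ component of any initial state decays while the $(1,1)$ component persists:

```python
import numpy as np

A = np.array([[0.75, 0.25],
              [0.25, 0.75]])   # eigenvalues 1 and 0.5

x = np.array([3.0, 1.0])       # equals 2*(1,1) + 1*(1,-1)
for _ in range(60):
    x = A @ x                  # iterate x_{k+1} = A x_k

# After many steps only the persistent mode remains: x is close to 2*(1,1).
print(x)
```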

63.18 Orthogonal Diagonalization

Some matrices diagonalize in a stronger way.

A real symmetric matrix can be diagonalized using an orthogonal matrix:

$$A = Q D Q^T,$$

where

$$Q^T Q = I.$$

The columns of $Q$ are orthonormal eigenvectors.

This is called orthogonal diagonalization. It is stronger than ordinary diagonalization because

$$Q^{-1} = Q^T.$$

Real symmetric matrices are always orthogonally diagonalizable. More generally, normal complex matrices are unitarily diagonalizable.
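
NumPy's `eigh` routine handles exactly this case: given a real symmetric matrix, it returns real eigenvalues and an orthonormal eigenvector matrix. A sketch using the example matrix from earlier in the chapter:

```python
import numpy as np

A = np.array([[2.0, 1.0], [1.0, 2.0]])  # real symmetric

# eigh returns eigenvalues in ascending order and orthonormal eigenvectors.
eigenvalues, Q = np.linalg.eigh(A)

print(np.allclose(Q.T @ Q, np.eye(2)))                 # True: Q^T Q = I
print(np.allclose(A, Q @ np.diag(eigenvalues) @ Q.T))  # True: A = Q D Q^T
```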

63.19 Diagonalization over Different Fields

Diagonalizability depends on the field.

Consider the real matrix

$$R = \begin{bmatrix} 0 & -1 \\ 1 & 0 \end{bmatrix}.$$

This matrix rotates the plane by $90^\circ$. Its characteristic polynomial is

$$\lambda^2 + 1.$$

Over $\mathbb{R}$, this polynomial has no roots. Therefore $R$ has no real eigenvectors and cannot be diagonalized over $\mathbb{R}$.

Over $\mathbb{C}$, the roots are

$$i \qquad \text{and} \qquad -i.$$

The matrix has two complex eigenvectors and can be diagonalized over $\mathbb{C}$.

When discussing diagonalization, one must specify the scalar field.
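
Numerical libraries typically work over $\mathbb{C}$, so they find the complex eigenvalues of the rotation matrix even though it has no real ones. A NumPy sketch:

```python
import numpy as np

R = np.array([[0.0, -1.0], [1.0, 0.0]])  # rotation by 90 degrees

# np.linalg.eig works over the complex numbers.
eigenvalues, P = np.linalg.eig(R)

print(np.sort_complex(eigenvalues))  # the eigenvalues are -i and +i
```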

63.20 Diagonalization of Linear Transformations

Let

$$T : V \to V$$

be a linear transformation on a finite-dimensional vector space.

The transformation $T$ is diagonalizable if there exists a basis of $V$ consisting of eigenvectors of $T$.

If such a basis exists, then the matrix of $T$ in that basis is diagonal.

Thus diagonalization is fundamentally a statement about bases, not about a single array of numbers.

A matrix diagonalizes when we can choose coordinates in which the transformation acts by independent scalar multiplication.

63.21 Common Errors

The first common error is to assume that every matrix is diagonalizable. Some matrices do not have enough independent eigenvectors.

The second common error is to place eigenvalues in $D$ in an order that does not match the eigenvectors in $P$. The order must agree column by column.

The third common error is to confuse algebraic multiplicity with geometric multiplicity. Repeated roots of the characteristic polynomial do not automatically provide enough eigenvectors.

The fourth common error is to ignore the field. A matrix may diagonalize over $\mathbb{C}$ but not over $\mathbb{R}$.

The fifth common error is to write $A = P^{-1} D P$ instead of $A = P D P^{-1}$. If $P$ has eigenvectors as columns, the correct formula is

$$A = P D P^{-1}.$$

63.22 Summary

Diagonalization expresses a matrix in the form

$$A = P D P^{-1},$$

where $D$ is diagonal and $P$ is invertible.

The columns of $P$ are eigenvectors of $A$. The diagonal entries of $D$ are the corresponding eigenvalues.

An $n \times n$ matrix is diagonalizable if and only if it has $n$ linearly independent eigenvectors. Equivalently, the dimensions of its eigenspaces add to $n$.

Diagonalization changes coordinates into an eigenvector basis. In that basis, the linear transformation acts by independent scaling. This makes powers, matrix functions, dynamical systems, and spectral analysis much simpler.