Chapter 42. Similarity Transformations

A similarity transformation changes the matrix representation of a linear operator without changing the operator itself. It is the algebraic form of a change of basis.

Let (A) be an (n\times n) matrix, and let (P) be an invertible (n\times n) matrix. The matrix

$$ B=P^{-1}AP $$

is called similar to (A). The transformation

$$ A\mapsto P^{-1}AP $$

is called a similarity transformation, or conjugation by (P). Similar matrices represent the same linear operator written in different bases.

42.1 Linear Operators and Bases

Similarity applies to linear operators, not arbitrary maps between different spaces.

A linear operator is a linear map

$$ T:V\to V $$

from a vector space to itself.

If (V) is finite-dimensional and a basis (B) is chosen, then (T) has a matrix representation

$$ [T]_B. $$

If a different basis (C) is chosen, then (T) has another matrix representation

$$ [T]_C. $$

These two matrices may look different, but they describe the same operator. Similarity is the relation between them.

42.2 Change of Basis

Let

$$ B=(v_1,\ldots,v_n) $$

and

$$ C=(w_1,\ldots,w_n) $$

be two ordered bases of (V).

Let (P) be the change-of-coordinates matrix from (C)-coordinates to (B)-coordinates:

$$ [v]_B=P[v]_C. $$

Then

$$ [v]_C=P^{-1}[v]_B. $$

Suppose

$$ A=[T]_B. $$

This means

$$ [T(v)]_B=A[v]_B. $$

Using the change of coordinates,

$$ [T(v)]_C=P^{-1}[T(v)]_B. $$

Substitute the action of (A):

$$ [T(v)]_C=P^{-1}A[v]_B. $$

Since

$$ [v]_B=P[v]_C, $$

we get

$$ [T(v)]_C=P^{-1}AP[v]_C. $$

Therefore

$$ [T]_C=P^{-1}AP. $$

This is the similarity formula.

42.3 Meaning of the Formula

The expression

$$ P^{-1}AP $$

has three steps.

Starting with coordinates in the new basis (C), first multiply by (P). This converts the vector into old basis coordinates.

Then multiply by (A). This applies the operator in the old coordinate system.

Finally multiply by (P^{-1}). This converts the result back into the new basis.

Thus

$$ P^{-1}AP $$

means:

$$ \text{new coordinates} \to \text{old coordinates} \to \text{apply operator} \to \text{new coordinates}. $$

The operator has not changed. Only its coordinate description has changed.
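
The three steps can be traced numerically. The following is a minimal sketch in plain Python, assuming (A=[T]_B) is the upper triangular matrix with rows (2, 1) and (0, 3) and (P) has rows (1, 1) and (0, 1); the helper `matvec` and the coordinate vector `v_new` are hypothetical choices made only for illustration.

```python
def matvec(M, x):
    """Multiply a matrix (list of rows) by a vector (list of entries)."""
    return [sum(m * xj for m, xj in zip(row, x)) for row in M]

A = [[2, 1], [0, 3]]        # the operator in old (B) coordinates
P = [[1, 1], [0, 1]]        # converts new (C) coordinates to old (B) coordinates
P_inv = [[1, -1], [0, 1]]   # converts old coordinates back to new coordinates

v_new = [5, 7]                    # illustrative coordinates [v]_C
v_old = matvec(P, v_new)          # step 1: new coordinates -> old coordinates
Tv_old = matvec(A, v_old)         # step 2: apply the operator in old coordinates
Tv_new = matvec(P_inv, Tv_old)    # step 3: old coordinates -> new coordinates

print(v_old, Tv_old, Tv_new)
```

For this choice of (P), the final vector agrees with multiplying `v_new` directly by the diagonal matrix with entries 2 and 3, which is (P^{-1}AP).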

42.4 Similar Matrices

Two square matrices (A) and (B) are similar if there exists an invertible matrix (P) such that

$$ B=P^{-1}AP. $$

The matrix (P) is the change-of-basis matrix.

Similarity is only defined for square matrices of the same size. This is because a linear operator has the same domain and codomain, so the same vector space is being described with different bases.

The notation

$$ A\sim B $$

is often used to mean that (A) and (B) are similar.

42.5 Similarity Is an Equivalence Relation

Similarity is reflexive, symmetric, and transitive.

It is reflexive because

$$ A=I^{-1}AI. $$

So every square matrix is similar to itself.

It is symmetric because if

$$ B=P^{-1}AP, $$

then

$$ A=PBP^{-1}. $$

Equivalently,

$$ A=(P^{-1})^{-1}B(P^{-1}). $$

Thus (A) is similar to (B).

It is transitive because if

$$ B=P^{-1}AP $$

and

$$ C=Q^{-1}BQ, $$

then

$$ C=Q^{-1}(P^{-1}AP)Q=(PQ)^{-1}A(PQ). $$

Thus (C) is similar to (A).

Therefore similarity partitions square matrices into equivalence classes. Each class consists of all matrices representing the same operator in different bases.
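
Transitivity can be checked on a small case. The sketch below assumes illustrative integer matrices (A), (P), and (Q), not taken from any particular application, and confirms that conjugating by (P) and then by (Q) agrees with conjugating once by (PQ).

```python
def matmul(X, Y):
    """Product of two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)] for row in X]

A = [[2, 1], [0, 3]]
P, P_inv = [[1, 1], [0, 1]], [[1, -1], [0, 1]]
Q, Q_inv = [[1, 0], [1, 1]], [[1, 0], [-1, 1]]

B = matmul(matmul(P_inv, A), P)    # B = P^{-1} A P
C = matmul(matmul(Q_inv, B), Q)    # C = Q^{-1} B Q

PQ = matmul(P, Q)
PQ_inv = matmul(Q_inv, P_inv)      # (PQ)^{-1} = Q^{-1} P^{-1}
C_direct = matmul(matmul(PQ_inv, A), PQ)

print(C, C_direct)
```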

42.6 Example

Let

$$ A= \begin{bmatrix} 2 & 1\\ 0 & 3 \end{bmatrix} $$

and let

$$ P= \begin{bmatrix} 1 & 1\\ 0 & 1 \end{bmatrix}. $$

Then

$$ P^{-1}= \begin{bmatrix} 1 & -1\\ 0 & 1 \end{bmatrix}. $$

Compute

$$ B=P^{-1}AP. $$

First,

$$ AP= \begin{bmatrix} 2 & 1\\ 0 & 3 \end{bmatrix} \begin{bmatrix} 1 & 1\\ 0 & 1 \end{bmatrix} = \begin{bmatrix} 2 & 3\\ 0 & 3 \end{bmatrix}. $$

Then

$$ P^{-1}AP= \begin{bmatrix} 1 & -1\\ 0 & 1 \end{bmatrix} \begin{bmatrix} 2 & 3\\ 0 & 3 \end{bmatrix} = \begin{bmatrix} 2 & 0\\ 0 & 3 \end{bmatrix}. $$

Thus (A) is similar to

$$ B= \begin{bmatrix} 2 & 0\\ 0 & 3 \end{bmatrix}. $$

The operator has not changed. In the new basis, its matrix is diagonal.
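
The arithmetic in this example can be reproduced mechanically. The sketch below uses exact rational arithmetic from the Python standard library; the helper names `matmul` and `inv2` are hypothetical, and `inv2` handles only the 2 by 2 case.

```python
from fractions import Fraction

def matmul(X, Y):
    """Product of two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)] for row in X]

def inv2(M):
    """Inverse of a 2x2 matrix via the adjugate formula, in exact arithmetic."""
    (a, b), (c, d) = M
    det = Fraction(a * d - b * c)
    if det == 0:
        raise ValueError("matrix is not invertible")
    return [[d / det, -b / det], [-c / det, a / det]]

A = [[2, 1], [0, 3]]
P = [[1, 1], [0, 1]]

P_inv = inv2(P)
AP = matmul(A, P)
B = matmul(P_inv, AP)   # B = P^{-1} A P

print(P_inv, AP, B)
```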

42.7 Why Similarity Matters

Similarity is useful because one matrix representation may be easier to understand than another.

A complicated matrix may become diagonal in a suitable basis. A matrix that cannot be diagonalized may still become Jordan form or rational canonical form. These simpler forms reveal structure that is hard to see in the original coordinates.

The central question is:

Given a matrix (A), can we choose a basis in which the same operator has a simpler matrix?

Similarity is the mathematical language for that question.

42.8 Diagonalization as Similarity

A matrix (A) is diagonalizable if it is similar to a diagonal matrix.

That is, there exists an invertible matrix (P) and a diagonal matrix (D) such that

$$ D=P^{-1}AP. $$

Equivalently,

$$ A=PDP^{-1}. $$

The columns of (P) are eigenvectors of (A), and the diagonal entries of (D) are the corresponding eigenvalues. To see this, let (D=\operatorname{diag}(\lambda_1,\ldots,\lambda_n)). If

$$ P= \begin{bmatrix} | & | & & |\\ v_1 & v_2 & \cdots & v_n\\ | & | & & | \end{bmatrix}, $$

and

$$ Av_i=\lambda_i v_i, $$

then

$$ AP=PD. $$

Multiplying on the left by (P^{-1}), we obtain

$$ P^{-1}AP=D. $$

Thus diagonalization is a similarity transformation into an eigenvector basis.
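
The relation (AP=PD) can be checked column by column. The sketch below assumes a small illustrative matrix whose eigenpairs are known in advance (they are supplied by hand, not computed), places the eigenvectors in the columns of (P) and the eigenvalues on the diagonal of (D), and compares the two products.

```python
def matmul(X, Y):
    """Product of two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)] for row in X]

A = [[2, 1], [0, 3]]
# Assumed eigenpairs (eigenvalue, eigenvector), supplied by hand.
eigenpairs = [(2, [1, 0]), (3, [1, 1])]

# Confirm A v = lambda v for each pair before using it.
for lam, v in eigenpairs:
    Av = [sum(a * x for a, x in zip(row, v)) for row in A]
    assert Av == [lam * x for x in v]

n = len(A)
# Eigenvectors become the columns of P; eigenvalues fill the diagonal of D.
P = [[eigenpairs[j][1][i] for j in range(n)] for i in range(n)]
D = [[eigenpairs[i][0] if i == j else 0 for j in range(n)] for i in range(n)]

AP = matmul(A, P)
PD = matmul(P, D)
print(AP, PD)
```

Column (i) of (AP) is (Av_i) and column (i) of (PD) is (\lambda_i v_i), so the two products agree exactly when every column of (P) is an eigenvector.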

42.9 Invariants Under Similarity

Similar matrices share properties that belong to the underlying operator rather than to a particular basis. If

$$ B=P^{-1}AP, $$

then (A) and (B) have the same rank, determinant, trace, characteristic polynomial, eigenvalues (with the same algebraic multiplicities), and minimal polynomial.

These are called similarity invariants.

| Invariant | Reason |
| --- | --- |
| Rank | Multiplication by invertible matrices preserves rank |
| Determinant | (\det(P^{-1}AP)=\det(A)) |
| Trace | (\operatorname{tr}(P^{-1}AP)=\operatorname{tr}(A)) |
| Characteristic polynomial | (\det(\lambda I-P^{-1}AP)=\det(\lambda I-A)) |
| Eigenvalues | Roots of the characteristic polynomial |
| Minimal polynomial | Polynomial relations are preserved |

Similarity invariants help determine whether two matrices can represent the same operator in different bases.
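
As a quick illustration, two of these invariants can be computed for a similar pair. The sketch below assumes the pair (A), (B) with (B=P^{-1}AP) for (P) with rows (1, 1) and (0, 1); the helpers `trace` and `det2` are hypothetical names, and `det2` covers only 2 by 2 matrices.

```python
def trace(M):
    """Sum of the diagonal entries."""
    return sum(M[i][i] for i in range(len(M)))

def det2(M):
    """Determinant of a 2x2 matrix."""
    (a, b), (c, d) = M
    return a * d - b * c

A = [[2, 1], [0, 3]]
B = [[2, 0], [0, 3]]   # equals P^{-1} A P for P = [[1, 1], [0, 1]]

print(trace(A), trace(B))   # the traces agree
print(det2(A), det2(B))     # the determinants agree
```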

42.10 Determinant Is Preserved

Let

$$ B=P^{-1}AP. $$

Then

$$ \det(B)=\det(P^{-1}AP). $$

Using multiplicativity of determinant,

$$ \det(B)=\det(P^{-1})\det(A)\det(P). $$

Since

$$ \det(P^{-1})=\frac{1}{\det(P)}, $$

we get

$$ \det(B)=\det(A). $$

Thus similar matrices have the same determinant.

The determinant is therefore a property of the linear operator, not merely of one coordinate representation.

42.11 Trace Is Preserved

The trace is also preserved by similarity.

Using the identity

$$ \operatorname{tr}(XY)=\operatorname{tr}(YX) $$

for square matrices of compatible size, we have

$$ \operatorname{tr}(P^{-1}AP) = \operatorname{tr}(APP^{-1}) = \operatorname{tr}(A). $$

Thus

$$ \operatorname{tr}(B)=\operatorname{tr}(A). $$

The trace is the sum of diagonal entries, but its value is independent of basis for a linear operator.
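
The identity (\operatorname{tr}(XY)=\operatorname{tr}(YX)) holds even when (XY\neq YX). The sketch below checks this on two illustrative integer matrices chosen only for the demonstration.

```python
def matmul(X, Y):
    """Product of two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)] for row in X]

def trace(M):
    """Sum of the diagonal entries."""
    return sum(M[i][i] for i in range(len(M)))

X = [[1, 2], [3, 4]]
Y = [[0, 1], [1, 0]]

XY = matmul(X, Y)
YX = matmul(Y, X)

print(XY, YX)                  # the products differ
print(trace(XY), trace(YX))    # the traces agree
```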

42.12 Characteristic Polynomial Is Preserved

Let

$$ B=P^{-1}AP. $$

The characteristic polynomial of (B) is

$$ \det(\lambda I-B). $$

Substitute (B=P^{-1}AP):

$$ \det(\lambda I-P^{-1}AP). $$

Since

$$ \lambda I=P^{-1}(\lambda I)P, $$

we have

$$ \lambda I-P^{-1}AP = P^{-1}(\lambda I-A)P. $$

Therefore

$$ \det(\lambda I-B) = \det(P^{-1}(\lambda I-A)P). $$

Using determinant multiplicativity,

$$ \det(\lambda I-B) = \det(P^{-1})\det(\lambda I-A)\det(P). $$

Hence

$$ \det(\lambda I-B)=\det(\lambda I-A). $$

So similar matrices have the same characteristic polynomial and the same eigenvalues.
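
As a worked check, take the upper triangular matrix (A) with rows (2, 1) and (0, 3), and the diagonal matrix (B) with entries 2 and 3, which is similar to (A) via (P) with rows (1, 1) and (0, 1). Then

$$ \det(\lambda I-A)=\det\begin{bmatrix} \lambda-2 & -1\\ 0 & \lambda-3 \end{bmatrix}=(\lambda-2)(\lambda-3), $$

$$ \det(\lambda I-B)=\det\begin{bmatrix} \lambda-2 & 0\\ 0 & \lambda-3 \end{bmatrix}=(\lambda-2)(\lambda-3). $$

Both equal (\lambda^2-5\lambda+6), whose roots 2 and 3 are the shared eigenvalues.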

42.13 Eigenvectors Under Similarity

Eigenvalues are preserved by similarity, but eigenvectors change coordinates.

Suppose

$$ Av=\lambda v. $$

Let

$$ B=P^{-1}AP. $$

Set

$$ w=P^{-1}v. $$

Then

$$ Bw=P^{-1}AP(P^{-1}v)=P^{-1}Av=P^{-1}(\lambda v)=\lambda P^{-1}v=\lambda w. $$

Thus (w) is an eigenvector of (B) with the same eigenvalue.

The eigenvector has changed because the coordinate system has changed. The eigendirection as part of the abstract operator remains the same.
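
A small numerical check of this coordinate change follows. It is a sketch under the assumption that (A), (P), and the eigenvector (v) are the illustrative values shown in the comments; `matvec` is a hypothetical helper.

```python
def matvec(M, x):
    """Multiply a matrix (list of rows) by a vector (list of entries)."""
    return [sum(m * xj for m, xj in zip(row, x)) for row in M]

A = [[2, 1], [0, 3]]
P_inv = [[1, -1], [0, 1]]   # inverse of P = [[1, 1], [0, 1]]
B = [[2, 0], [0, 3]]        # B = P^{-1} A P

lam, v = 3, [1, 1]          # eigenpair of A: A v = 3 v
w = matvec(P_inv, v)        # the same eigenvector in the new coordinates

print(matvec(A, v))   # equals lam * v
print(w)
print(matvec(B, w))   # equals lam * w
```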

42.14 Similarity and Powers

Similarity behaves well with powers. If

$$ B=P^{-1}AP, $$

then

$$ B^2=(P^{-1}AP)(P^{-1}AP)=P^{-1}A^2P. $$

By induction,

$$ B^k=P^{-1}A^kP $$

for every integer (k\geq 0).

If (A) is invertible, the formula also holds for negative integers:

$$ B^{-1}=P^{-1}A^{-1}P. $$

Thus powers of similar matrices remain similar.

This matters in difference equations, Markov chains, iterative methods, and matrix functions.
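
The power formula is what makes a diagonal representation useful: powers of a diagonal matrix are computed entry by entry. The sketch below assumes an illustrative pair with (B=P^{-1}AP) diagonal, computes (A^k) by repeated multiplication, and compares (P^{-1}A^kP) with the entrywise powers of (B). The helper names are hypothetical.

```python
def matmul(X, Y):
    """Product of two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)] for row in X]

def matpow(M, k):
    """M raised to a non-negative integer power by repeated multiplication."""
    n = len(M)
    result = [[int(i == j) for j in range(n)] for i in range(n)]  # identity
    for _ in range(k):
        result = matmul(result, M)
    return result

A = [[2, 1], [0, 3]]
P, P_inv = [[1, 1], [0, 1]], [[1, -1], [0, 1]]

def conjugated_power(k):
    """P^{-1} A^k P, which should equal B^k."""
    return matmul(matmul(P_inv, matpow(A, k)), P)

for k in range(6):
    B_k = [[2 ** k, 0], [0, 3 ** k]]   # B = diag(2, 3), so B^k = diag(2^k, 3^k)
    print(k, conjugated_power(k) == B_k)
```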

42.15 Similarity and Polynomials in a Matrix

Let

$$ p(t)=a_0+a_1t+\cdots+a_kt^k $$

be a polynomial. If

$$ B=P^{-1}AP, $$

then

$$ p(B)=P^{-1}p(A)P. $$

This follows from the power formula and linearity:

$$ p(B)=a_0I+a_1B+\cdots+a_kB^k. $$

Substitute

$$ B^j=P^{-1}A^jP. $$

Then

$$ p(B)=a_0P^{-1}IP+a_1P^{-1}AP+\cdots+a_kP^{-1}A^kP=P^{-1}(a_0I+a_1A+\cdots+a_kA^k)P. $$

That is,

$$ p(B)=P^{-1}p(A)P. $$

Consequently, polynomial identities are preserved under similarity. If

$$ p(A)=0, $$

then

$$ p(B)=0. $$

This explains why the minimal polynomial is a similarity invariant.
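
Both statements can be checked on a small case. The sketch below assumes an illustrative similar pair and two polynomials chosen for the demonstration: (q(t)=1+t^2), for which (q(B)=P^{-1}q(A)P) is verified directly, and (p(t)=t^2-5t+6), which happens to annihilate both matrices. The helper `poly_of_matrix` is a hypothetical name and uses Horner's rule.

```python
def matmul(X, Y):
    """Product of two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)] for row in X]

def poly_of_matrix(coeffs, M):
    """Evaluate a_0 I + a_1 M + ... + a_k M^k, with coeffs = [a_0, ..., a_k]."""
    n = len(M)
    result = [[0] * n for _ in range(n)]
    for a in reversed(coeffs):          # Horner's rule: result = result * M + a * I
        result = matmul(result, M)
        for i in range(n):
            result[i][i] += a
    return result

A = [[2, 1], [0, 3]]
P, P_inv = [[1, 1], [0, 1]], [[1, -1], [0, 1]]
B = matmul(matmul(P_inv, A), P)

q = [1, 0, 1]      # q(t) = 1 + t^2
p = [6, -5, 1]     # p(t) = t^2 - 5t + 6

qA, qB = poly_of_matrix(q, A), poly_of_matrix(q, B)
print(qB == matmul(matmul(P_inv, qA), P))
print(poly_of_matrix(p, A), poly_of_matrix(p, B))
```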

42.16 Similarity and Matrix Functions

Many functions of matrices are defined by power series or polynomial approximation. For such functions, similarity behaves naturally.

For example, the matrix exponential satisfies

$$ e^{B}=P^{-1}e^{A}P $$

whenever

$$ B=P^{-1}AP. $$

Indeed,

$$ e^A=I+A+\frac{A^2}{2!}+\frac{A^3}{3!}+\cdots. $$

Using the power formula,

$$ e^B = I+B+\frac{B^2}{2!}+\frac{B^3}{3!}+\cdots = P^{-1}e^AP. $$

Thus changing basis before computing a matrix function gives the same result as computing the function and then changing basis.
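
The same relation already holds for every partial sum of the exponential series, since each partial sum is a polynomial in the matrix. The sketch below checks this with exact rational arithmetic on an illustrative similar pair. It computes a truncated series only, so it approximates (e^A) and (e^B) rather than evaluating them; the helper name `exp_partial_sum` and the number of terms are assumptions.

```python
from fractions import Fraction

def matmul(X, Y):
    """Product of two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)] for row in X]

def exp_partial_sum(M, terms):
    """I + M + M^2/2! + ... with the given number of terms, in exact arithmetic."""
    n = len(M)
    term = [[Fraction(int(i == j)) for j in range(n)] for i in range(n)]  # M^0 / 0!
    total = [row[:] for row in term]
    for k in range(1, terms):
        # Next term: M^k / k! = (M^(k-1) / (k-1)!) * M / k
        term = [[x / k for x in row] for row in matmul(term, M)]
        total = [[t + s for t, s in zip(rt, rs)] for rt, rs in zip(total, term)]
    return total

A = [[2, 1], [0, 3]]
P, P_inv = [[1, 1], [0, 1]], [[1, -1], [0, 1]]
B = [[2, 0], [0, 3]]   # B = P^{-1} A P

S_A = exp_partial_sum(A, 10)
S_B = exp_partial_sum(B, 10)
print(S_B == matmul(matmul(P_inv, S_A), P))
```

Because (B) is diagonal here, its partial sums are diagonal as well, with the scalar partial sums for (e^2) and (e^3) on the diagonal.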

42.17 Similarity Versus Equivalence

Similarity should be distinguished from matrix equivalence.

Two (m\times n) matrices (A) and (B) are equivalent if there are invertible matrices (P), of size (m\times m), and (Q), of size (n\times n), such that

$$ B=PAQ. $$

Equivalence allows different changes of basis in the domain and codomain. It applies to linear maps

$$ T:V\to W $$

between possibly different spaces.

Similarity has the special form

$$ B=P^{-1}AP. $$

It uses the same change of basis on both sides, because the domain and codomain are the same vector space.

Equivalence classifies linear maps by rank. Similarity classifies linear operators by deeper structure, including eigenvalues and canonical forms.

42.18 Similarity Versus Congruence

Similarity should also be distinguished from congruence.

A congruence transformation has the form

$$ B=P^TAP $$

over the real numbers, or

$$ B=P^*AP $$

over the complex numbers.

Congruence arises naturally for bilinear forms and quadratic forms. Similarity arises naturally for linear operators.

The difference matters. Similarity preserves eigenvalues. Congruence generally does not. Congruence preserves properties such as rank and inertia for symmetric forms over the real numbers.

Thus the correct transformation law depends on the object being represented.
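
A small worked example, with matrices chosen only for illustration, shows the difference. Take (A=I) and (P=2I) in the (2\times 2) case. Under congruence,

$$ P^TAP=(2I)^TI(2I)=4I, $$

whose eigenvalues are 4 and 4, while under similarity,

$$ P^{-1}AP=\tfrac{1}{2}I\cdot I\cdot 2I=I, $$

whose eigenvalues are 1 and 1, the same as those of (A). The congruent matrix (4I) still has rank 2 and two positive eigenvalues, so rank and inertia are unchanged even though the eigenvalues themselves are not.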

42.19 Orthogonal Similarity

If the change-of-basis matrix (Q) is orthogonal, then the similarity (B=Q^{-1}AQ) takes the form

$$ B=Q^TAQ $$

because

$$ Q^{-1}=Q^T. $$

This is called orthogonal similarity.

Orthogonal similarity corresponds to changing from one orthonormal basis to another. It is especially important in numerical linear algebra because orthogonal transformations preserve lengths and are numerically stable.

For complex vector spaces, the analogous notion is unitary similarity:

$$ B=U^*AU, $$

where (U) is unitary.

Orthogonal and unitary similarities are more restrictive than general similarity, but they preserve additional metric structure.
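
A minimal sketch of an orthogonal similarity follows, assuming (Q) is the rotation by 90 degrees and (A) is an illustrative integer matrix. It checks that (Q^TQ=I), forms (Q^TAQ), and confirms that (Q) preserves the squared length of a sample vector. The helper names are hypothetical.

```python
def matmul(X, Y):
    """Product of two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)] for row in X]

def transpose(M):
    """Transpose of a matrix stored as a list of rows."""
    return [list(col) for col in zip(*M)]

A = [[2, 1], [0, 3]]
Q = [[0, -1], [1, 0]]      # rotation by 90 degrees, an orthogonal matrix
Qt = transpose(Q)

QtQ = matmul(Qt, Q)                # should be the identity, so Q^{-1} = Q^T
B = matmul(matmul(Qt, A), Q)       # orthogonal similarity B = Q^T A Q

x = [3, 4]                         # sample vector
Qx = [sum(q * xj for q, xj in zip(row, x)) for row in Q]
len2_x = sum(c * c for c in x)     # squared length of x
len2_Qx = sum(c * c for c in Qx)   # squared length of Qx

print(QtQ, B, len2_x, len2_Qx)
```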

42.20 Canonical Forms

Similarity leads to canonical forms.

A canonical form is a distinguished representative of a similarity class. It gives a standard matrix that represents the operator as simply as possible.

Important canonical forms include:

| Form | Purpose |
| --- | --- |
| Diagonal form | Best case, basis of eigenvectors |
| Jordan form | Describes generalized eigenvectors |
| Rational canonical form | Works over arbitrary fields |
| Real canonical form | Handles complex eigenvalues over (\mathbb{R}) |
| Schur form | Uses unitary similarity, useful numerically |

Not every square matrix is diagonalizable. But every square matrix over an algebraically closed field has a Jordan form, and every square matrix over any field has a rational canonical form.

Canonical forms turn the classification of operators into the classification of similarity classes.

42.21 Example: Same Operator, Different Basis

Let (T:\mathbb{R}^2\to\mathbb{R}^2) be the operator with standard matrix

$$ A= \begin{bmatrix} 2 & 1\\ 0 & 3 \end{bmatrix}. $$

Let the new basis be

$$ C= \left( \begin{bmatrix} 1\\ 0 \end{bmatrix}, \begin{bmatrix} 1\\ 1 \end{bmatrix} \right). $$

The change-of-basis matrix from (C)-coordinates to standard coordinates is

$$ P= \begin{bmatrix} 1 & 1\\ 0 & 1 \end{bmatrix}. $$

As computed earlier,

$$ [T]_C=P^{-1}AP= \begin{bmatrix} 2 & 0\\ 0 & 3 \end{bmatrix}. $$

In the standard basis, the operator has an upper triangular matrix. In the basis (C), it is diagonal.

The diagonal form shows that the new basis vectors are eigenvectors.

42.22 Nonexample

The matrices

$$ A= \begin{bmatrix} 1 & 0\\ 0 & 2 \end{bmatrix} $$

and

$$ B= \begin{bmatrix} 1 & 0\\ 0 & 3 \end{bmatrix} $$

are not similar.

Their traces are different:

$$ \operatorname{tr}(A)=3, \qquad \operatorname{tr}(B)=4. $$

Since trace is preserved under similarity, no invertible matrix (P) can satisfy

$$ B=P^{-1}AP. $$

Their determinants are also different:

$$ \det(A)=2, \qquad \det(B)=3. $$

Either invariant is enough to rule out similarity.
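
The converse test does not work: matching trace and determinant do not prove similarity. As a worked illustration with matrices chosen for this purpose, compare

$$ I= \begin{bmatrix} 1 & 0\\ 0 & 1 \end{bmatrix} \qquad\text{and}\qquad J= \begin{bmatrix} 1 & 1\\ 0 & 1 \end{bmatrix}. $$

Both have trace 2, determinant 1, and characteristic polynomial ((\lambda-1)^2). Yet for every invertible (P),

$$ P^{-1}IP=P^{-1}P=I, $$

so the only matrix similar to (I) is (I) itself, and (J) is not similar to (I). The minimal polynomials do distinguish them: (t-1) for (I) and ((t-1)^2) for (J).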

42.23 Similarity as Coordinate Independence

Similarity expresses coordinate independence.

A matrix often appears to be the primary object, but in many settings the primary object is the linear operator. The matrix is only the operator written in a basis.

When the basis changes, the matrix changes by

$$ A\mapsto P^{-1}AP. $$

The quantities that survive this change are intrinsic. They belong to the operator itself.

This viewpoint explains why trace, determinant, eigenvalues, characteristic polynomial, minimal polynomial, rank, and canonical form are central. They do not depend on arbitrary coordinate choices.

42.24 Summary

Two square matrices (A) and (B) are similar if there exists an invertible matrix (P) such that

$$ B=P^{-1}AP. $$

Similarity is the matrix form of change of basis for a linear operator. If

$$ A=[T]_B, $$

and (P) converts new coordinates to old coordinates, then

$$ [T]_C=P^{-1}AP. $$

Similar matrices represent the same linear operator in different bases.

Similarity preserves rank, determinant, trace, characteristic polynomial, eigenvalues, minimal polynomial, and many other structural properties. It also controls diagonalization and canonical forms.

The main idea is simple: changing coordinates may change the entries of a matrix, but it does not change the operator.