# Chapter 69. Minimal Polynomial

The minimal polynomial of a square matrix is the monic polynomial of least degree that annihilates it.

For a square matrix \(A\), a polynomial \(p(t)\) can be evaluated at \(A\) by replacing \(t\) with \(A\). For example, if

$$
p(t)=t^2-3t+2,
$$

then

$$
p(A)=A^2-3A+2I.
$$

The minimal polynomial of \(A\) is the monic polynomial \(m_A(t)\) of least degree such that

$$
m_A(A)=0.
$$

It records the essential algebraic structure of \(A\). The characteristic polynomial records all eigenvalues with algebraic multiplicity. The minimal polynomial records only the largest Jordan block size for each eigenvalue. It also determines whether a matrix is diagonalizable. A matrix is diagonalizable exactly when its minimal polynomial splits into distinct linear factors.

## 69.1 Polynomial Evaluation at a Matrix

Let

$$
p(t)=a_0+a_1t+a_2t^2+\cdots+a_kt^k.
$$

For a square matrix \(A\), define

$$
p(A)=a_0I+a_1A+a_2A^2+\cdots+a_kA^k.
$$

The identity matrix \(I\) is used for the constant term, since matrix addition requires all terms to be matrices of the same size.

For example, if

$$
p(t)=t^3-2t+5,
$$

then

$$
p(A)=A^3-2A+5I.
$$

A polynomial \(p\) is said to annihilate \(A\) if

$$
p(A)=0.
$$

The minimal polynomial is the monic polynomial of least degree that annihilates \(A\).
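
The definition of \(p(A)\) can be sketched in code. The following is a minimal illustration, not a reference implementation: it assumes matrices are stored as lists of rows, and the helper names `mat_mul` and `poly_at_matrix` are hypothetical. It uses Horner's rule, so the identity matrix enters through the step that adds each coefficient to the diagonal.

```python
def mat_mul(X, Y):
    """Multiply two square matrices given as lists of rows."""
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def poly_at_matrix(coeffs, A):
    """Evaluate p(A) by Horner's rule.

    coeffs lists a_0, a_1, ..., a_k in order of increasing degree.
    """
    n = len(A)
    result = [[0] * n for _ in range(n)]
    for a in reversed(coeffs):
        # result <- result * A + a * I
        result = mat_mul(result, A)
        for i in range(n):
            result[i][i] += a
    return result


A = [[1, 1],
     [0, 2]]

# p(t) = t^2 - 3t + 2 = (t - 1)(t - 2) annihilates this A.
print(poly_at_matrix([2, -3, 1], A))
```

Here the printed matrix is the zero matrix, so \(p(t)=t^2-3t+2\) annihilates this particular \(A\).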

## 69.2 Definition

Let \(A\) be an \(n\times n\) matrix over a field \(F\).

The minimal polynomial of \(A\) is the unique monic polynomial \(m_A(t)\in F[t]\) of least degree such that

$$
m_A(A)=0.
$$

Monic means that the leading coefficient is \(1\).

The same definition applies to a linear transformation

$$
T:V\to V.
$$

The minimal polynomial \(m_T(t)\) is the unique monic polynomial of least degree satisfying

$$
m_T(T)=0.
$$

The minimal polynomial belongs to the operator itself. If two matrices represent the same linear map in different bases, they have the same minimal polynomial.

## 69.3 Existence

A nonzero annihilating polynomial always exists for a matrix.

Let \(A\) be \(n\times n\). The vector space of all \(n\times n\) matrices has dimension \(n^2\). Therefore the \(n^2+1\) matrices

$$
I,A,A^2,\ldots,A^{n^2}
$$

are linearly dependent.

Hence there exist scalars

$$
c_0,c_1,\ldots,c_{n^2},
$$

not all zero, such that

$$
c_0I+c_1A+\cdots+c_{n^2}A^{n^2}=0.
$$

Thus the polynomial

$$
p(t)=c_0+c_1t+\cdots+c_{n^2}t^{n^2}
$$

annihilates \(A\).

Among all nonzero annihilating polynomials, choose one of smallest degree and scale it to be monic. This is the minimal polynomial.

## 69.4 Uniqueness

The minimal polynomial is unique.

Suppose \(p(t)\) and \(q(t)\) are both monic annihilating polynomials of least degree. Divide \(p\) by \(q\):

$$
p(t)=s(t)q(t)+r(t),
$$

where either \(r=0\) or

$$
\deg r<\deg q.
$$

Evaluate at \(A\):

$$
p(A)=s(A)q(A)+r(A).
$$

Since

$$
p(A)=0
$$

and

$$
q(A)=0,
$$

we get

$$
r(A)=0.
$$

If \(r\neq 0\), then \(r\) is an annihilating polynomial of smaller degree than \(q\), which is impossible. Thus

$$
r=0.
$$

So \(q\) divides \(p\). By symmetry, \(p\) divides \(q\). Since both are monic and have the same least degree,

$$
p=q.
$$

## 69.5 Divisibility Property

The minimal polynomial divides every polynomial that annihilates \(A\).

If

$$
p(A)=0,
$$

then divide \(p\) by \(m_A\):

$$
p(t)=q(t)m_A(t)+r(t),
$$

where either \(r=0\) or

$$
\deg r<\deg m_A.
$$

Evaluate at \(A\):

$$
p(A)=q(A)m_A(A)+r(A).
$$

Since

$$
p(A)=0
$$

and

$$
m_A(A)=0,
$$

we obtain

$$
r(A)=0.
$$

By minimality, the only polynomial of degree smaller than \(m_A\) that annihilates \(A\) is the zero polynomial. Hence

$$
r=0.
$$

Therefore

$$
m_A(t)\mid p(t).
$$

This divisibility property characterizes the minimal polynomial. It is not merely one annihilating polynomial: every annihilating polynomial is a multiple of it, so it generates the ideal of all polynomials that annihilate \(A\).
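
The division step can be tried on concrete polynomials. The sketch below assumes polynomials are stored as coefficient lists with the lowest degree first; the function name `poly_divmod` is hypothetical. It divides the characteristic polynomial \((t-2)^2(t-5)\) of the diagonal matrix in Section 69.8 by its minimal polynomial \((t-2)(t-5)\), and then divides \(t^2-4\), which does not annihilate that matrix.

```python
from fractions import Fraction


def poly_divmod(p, d):
    """Long division of p by d over the rationals.

    Both arguments are coefficient lists, lowest degree first, and the
    last entry of d must be nonzero. Returns (quotient, remainder), with
    the zero remainder represented by an empty list.
    """
    rem = [Fraction(c) for c in p]
    quot = [Fraction(0)] * max(len(p) - len(d) + 1, 0)
    for shift in range(len(p) - len(d), -1, -1):
        # Cancel the current leading term of the remainder.
        factor = rem[shift + len(d) - 1] / d[-1]
        quot[shift] = factor
        for i, c in enumerate(d):
            rem[shift + i] -= factor * c
    while rem and rem[-1] == 0:
        rem.pop()
    return quot, rem


m = [10, -7, 1]                 # (t - 2)(t - 5) = t^2 - 7t + 10
char_poly = [-20, 24, -9, 1]    # (t - 2)^2 (t - 5)

print(poly_divmod(char_poly, m))    # remainder is empty: m divides it
print(poly_divmod([-4, 0, 1], m))   # t^2 - 4 leaves a nonzero remainder
```

An empty remainder means the dividend is a multiple of \(m_A\). A nonzero remainder shows that the dividend cannot annihilate \(A\).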

## 69.6 Relation to the Characteristic Polynomial

The characteristic polynomial is

$$
p_A(t)=\det(tI-A)
$$

or, by another sign convention,

$$
\det(A-tI).
$$

The Cayley-Hamilton theorem states that

$$
p_A(A)=0.
$$

Thus the characteristic polynomial annihilates \(A\). Since the minimal polynomial divides every annihilating polynomial, it follows that

$$
m_A(t)\mid p_A(t).
$$

Therefore the minimal polynomial is always a divisor of the characteristic polynomial, and in particular \(\deg m_A\le n\). This is one common way to express the link between the minimal polynomial and the Cayley-Hamilton theorem.

## 69.7 Same Roots as the Characteristic Polynomial

Over an algebraically closed field, the roots of the minimal polynomial are exactly the eigenvalues of \(A\).

If \(\lambda\) is an eigenvalue, then there is a nonzero vector \(v\) such that

$$
Av=\lambda v.
$$

For any polynomial \(p\),

$$
p(A)v=p(\lambda)v.
$$

If \(p(A)=0\), then

$$
p(\lambda)v=0.
$$

Since \(v\neq 0\),

$$
p(\lambda)=0.
$$

Therefore every annihilating polynomial must vanish at every eigenvalue. In particular, the minimal polynomial has every eigenvalue as a root.

Conversely, since \(m_A\) divides the characteristic polynomial, every root of \(m_A\) is a root of the characteristic polynomial, hence an eigenvalue.

Thus the minimal and characteristic polynomials have the same distinct roots, though their multiplicities may differ.

## 69.8 Example: A Diagonal Matrix

Let

$$
A=
\begin{bmatrix}
2 & 0 & 0 \\
0 & 2 & 0 \\
0 & 0 & 5
\end{bmatrix}.
$$

The characteristic polynomial is

$$
p_A(t)=(t-2)^2(t-5).
$$

But the minimal polynomial is

$$
m_A(t)=(t-2)(t-5).
$$

Indeed,

$$
(A-2I)(A-5I)=0.
$$

The factor \(t-2\) appears only once in the minimal polynomial, even though \(2\) has algebraic multiplicity \(2\).

This happens because the matrix is diagonal. A diagonal matrix needs only one factor for each distinct eigenvalue.
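
As a check, the product can be written out explicitly:

$$
(A-2I)(A-5I)=
\begin{bmatrix}
0 & 0 & 0 \\
0 & 0 & 0 \\
0 & 0 & 3
\end{bmatrix}
\begin{bmatrix}
-3 & 0 & 0 \\
0 & -3 & 0 \\
0 & 0 & 0
\end{bmatrix}
=0.
$$

Neither factor is zero on its own, and by Section 69.7 both \(2\) and \(5\) must be roots of \(m_A\), so no polynomial of degree \(1\) can annihilate \(A\).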

## 69.9 Example: A Jordan Block

Let

$$
A=
\begin{bmatrix}
2 & 1 \\
0 & 2
\end{bmatrix}.
$$

Then

$$
A-2I=
\begin{bmatrix}
0 & 1 \\
0 & 0
\end{bmatrix}.
$$

This matrix is not zero, so \(t-2\) does not annihilate \(A\):

$$
(A-2I)\neq 0.
$$

But

$$
(A-2I)^2=0.
$$

Since \(m_A\) divides \((t-2)^2\) and is not \(t-2\), the minimal polynomial is

$$
m_A(t)=(t-2)^2.
$$

The characteristic polynomial is also

$$
p_A(t)=(t-2)^2.
$$

This matrix is not diagonalizable, and the repeated factor in the minimal polynomial detects that failure.

## 69.10 Example: Two Jordan Blocks with the Same Eigenvalue

Suppose

$$
J=
J_3(4)\oplus J_2(4).
$$

The characteristic polynomial is

$$
p_J(t)=(t-4)^5.
$$

The minimal polynomial is determined by the largest Jordan block for the eigenvalue \(4\). Since the largest block has size \(3\),

$$
m_J(t)=(t-4)^3.
$$

The characteristic polynomial counts the total size of all Jordan blocks. The minimal polynomial records the largest block size.
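
The claim about the largest block can be checked directly. The sketch below is illustrative only; the helper names `jordan_block`, `direct_sum`, and `smallest_annihilating_power` are hypothetical. It builds \(J_3(4)\oplus J_2(4)\) and searches for the smallest \(s\) with \((J-4I)^s=0\).

```python
def mat_mul(X, Y):
    """Multiply two square matrices given as lists of rows."""
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def jordan_block(size, lam):
    """Upper-triangular Jordan block with lam on the diagonal."""
    return [[lam if i == j else (1 if j == i + 1 else 0)
             for j in range(size)] for i in range(size)]


def direct_sum(X, Y):
    """Block-diagonal matrix with blocks X and Y."""
    n, m = len(X), len(Y)
    return [row + [0] * m for row in X] + [[0] * n + row for row in Y]


def smallest_annihilating_power(A, lam):
    """Smallest s in 1..n with (A - lam*I)^s == 0, or None if none exists."""
    n = len(A)
    N = [[A[i][j] - (lam if i == j else 0) for j in range(n)]
         for i in range(n)]
    zero = [[0] * n for _ in range(n)]
    power = N
    for s in range(1, n + 1):
        if power == zero:
            return s
        power = mat_mul(power, N)
    return None


J = direct_sum(jordan_block(3, 4), jordan_block(2, 4))
print(smallest_annihilating_power(J, 4))
```

For this \(J\) the search stops at \(s=3\), the size of the larger block, even though the matrix is \(5\times 5\).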

## 69.11 General Jordan Form Rule

Suppose \(A\) has Jordan form consisting of Jordan blocks

$$
J_{k_1}(\lambda_1),\ldots,J_{k_r}(\lambda_r).
$$

For each eigenvalue \(\lambda\), let \(s_\lambda\) be the size of the largest Jordan block associated with \(\lambda\).

Then

$$
m_A(t)=\prod_{\lambda}(t-\lambda)^{s_\lambda}.
$$

The product is taken over the distinct eigenvalues.

By contrast, the characteristic polynomial is

$$
p_A(t)=\prod_{\lambda}(t-\lambda)^{a_\lambda},
$$

where \(a_\lambda\) is the algebraic multiplicity, the total size of all Jordan blocks for \(\lambda\).

Thus the two polynomials contain related but different information.

## 69.12 Diagonalization Criterion

A matrix \(A\) is diagonalizable over a field \(F\) if and only if its minimal polynomial splits over \(F\) into distinct linear factors.

That is,

$$
m_A(t)=(t-\lambda_1)(t-\lambda_2)\cdots(t-\lambda_k),
$$

with all \(\lambda_i\) distinct.

If a repeated factor appears, such as

$$
(t-\lambda)^2,
$$

then some Jordan block for \(\lambda\) has size at least \(2\), so the matrix is not diagonalizable.

This criterion is often more compact than checking eigenspace dimensions directly.

## 69.13 Diagonalizable Example

Let

$$
A=
\begin{bmatrix}
3 & 0 & 0 \\
0 & 3 & 0 \\
0 & 0 & -1
\end{bmatrix}.
$$

The characteristic polynomial is

$$
p_A(t)=(t-3)^2(t+1).
$$

The minimal polynomial is

$$
m_A(t)=(t-3)(t+1).
$$

This polynomial has no repeated factors. Therefore \(A\) is diagonalizable.

In fact, it is already diagonal.

## 69.14 Non-Diagonalizable Example

Let

$$
B=
\begin{bmatrix}
3 & 1 & 0 \\
0 & 3 & 0 \\
0 & 0 & -1
\end{bmatrix}.
$$

This matrix has one Jordan block of size \(2\) for eigenvalue \(3\), and one block of size \(1\) for eigenvalue \(-1\).

Its characteristic polynomial is

$$
p_B(t)=(t-3)^2(t+1).
$$

Its minimal polynomial is

$$
m_B(t)=(t-3)^2(t+1).
$$

The repeated factor

$$
(t-3)^2
$$

shows that \(B\) is not diagonalizable.
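
The exponent \(2\) is needed, as a direct computation shows. The square-free candidate \((t-3)(t+1)\) does not annihilate \(B\):

$$
(B-3I)(B+I)=
\begin{bmatrix}
0 & 1 & 0 \\
0 & 0 & 0 \\
0 & 0 & -4
\end{bmatrix}
\begin{bmatrix}
4 & 1 & 0 \\
0 & 4 & 0 \\
0 & 0 & 0
\end{bmatrix} =
\begin{bmatrix}
0 & 4 & 0 \\
0 & 0 & 0 \\
0 & 0 & 0
\end{bmatrix}
\neq 0,
$$

whereas

$$
(B-3I)^2(B+I)=
\begin{bmatrix}
0 & 0 & 0 \\
0 & 0 & 0 \\
0 & 0 & 16
\end{bmatrix}
\begin{bmatrix}
4 & 1 & 0 \\
0 & 4 & 0 \\
0 & 0 & 0
\end{bmatrix}
=0.
$$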

## 69.15 Computing the Minimal Polynomial

There are several ways to compute the minimal polynomial.

One method uses powers of the matrix. Search for the lowest-degree monic relation

$$
A^k+c_{k-1}A^{k-1}+\cdots+c_1A+c_0I=0.
$$

Another method uses eigenvalues and kernels. For each eigenvalue \(\lambda\), find the smallest exponent \(s\) such that

$$
\ker(A-\lambda I)^s
$$

equals the full generalized eigenspace for \(\lambda\). This \(s\) is the largest Jordan block size for \(\lambda\).

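A third method uses known structure. If the matrix is diagonal, real symmetric, Hermitian, or normal over \(\mathbb{C}\), then it is diagonalizable, so the minimal polynomial is the product of the factors \(t-\lambda\) over the distinct eigenvalues \(\lambda\). A complex symmetric matrix, by contrast, need not be diagonalizable.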
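
The first method, searching for the lowest-degree relation among powers of \(A\), can be sketched as follows. This is one possible approach under stated assumptions, not a tuned algorithm: it assumes exact rational entries, flattens each power \(A^j\) into a vector, and uses Gauss-Jordan elimination to test whether \(A^k\) is a combination of the lower powers. The names `solve` and `minimal_polynomial` are hypothetical.

```python
from fractions import Fraction


def mat_mul(X, Y):
    """Multiply two square matrices given as lists of rows."""
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def solve(columns, target):
    """Return x with sum(x[j] * columns[j]) == target, or None if impossible.

    Assumes the given columns are linearly independent.
    """
    rows, cols = len(target), len(columns)
    # Augmented system [columns | target], one row per matrix entry.
    M = [[Fraction(columns[j][i]) for j in range(cols)] + [Fraction(target[i])]
         for i in range(rows)]
    r = 0
    for c in range(cols):
        pivot = next((i for i in range(r, rows) if M[i][c] != 0), None)
        if pivot is None:
            raise ValueError("columns are not linearly independent")
        M[r], M[pivot] = M[pivot], M[r]
        M[r] = [x / M[r][c] for x in M[r]]
        for i in range(rows):
            if i != r and M[i][c] != 0:
                f = M[i][c]
                M[i] = [a - f * b for a, b in zip(M[i], M[r])]
        r += 1
    # A leftover row 0 = nonzero means the target is outside the span.
    if any(M[i][cols] != 0 for i in range(r, rows)):
        return None
    return [M[j][cols] for j in range(cols)]


def minimal_polynomial(A):
    """Coefficients c_0, ..., c_{k-1}, 1 of m_A, lowest degree first."""
    n = len(A)
    power = [[int(i == j) for j in range(n)] for i in range(n)]  # A^0 = I
    lower = []  # flattened I, A, ..., A^{k-1}, known to be independent
    while True:
        flat = [x for row in power for x in row]
        x = solve(lower, flat)
        if x is not None:
            # A^k = sum_j x[j] A^j, so m_A(t) = t^k - sum_j x[j] t^j.
            return [-c for c in x] + [Fraction(1)]
        lower.append(flat)
        power = mat_mul(power, A)


B = [[3, 1, 0],
     [0, 3, 0],
     [0, 0, -1]]
print(minimal_polynomial(B))
```

For the matrix \(B\) of Section 69.14 the coefficients returned correspond to \(t^3-5t^2+3t+9=(t-3)^2(t+1)\). The loop ends at some \(k\le n\) because the characteristic polynomial annihilates \(A\).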

## 69.16 Kernel Stabilization

Let

$$
N=A-\lambda I.
$$

The sequence of subspaces

$$
\ker N
\subseteq
\ker N^2
\subseteq
\ker N^3
\subseteq
\cdots
$$

is increasing.

On the generalized eigenspace for \(\lambda\), this sequence eventually stabilizes. The smallest exponent \(s\) at which it reaches the full generalized eigenspace is the largest Jordan block size for \(\lambda\).

Equivalently, \(s\) is the exponent of

$$
(t-\lambda)
$$

in the minimal polynomial.

The multiplicity of a root in the minimal polynomial is therefore controlled by the growth of these kernels.
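
The growth of the kernels can be tabulated numerically. The sketch below assumes exact rational arithmetic and computes \(\dim\ker N^j\) as \(n-\operatorname{rank}N^j\) for \(j=1,\ldots,n\); the names `rank`, `kernel_dimensions`, and `min_poly_exponent` are hypothetical. It is applied to the matrix \(B\) of Section 69.14.

```python
from fractions import Fraction


def mat_mul(X, Y):
    """Multiply two square matrices given as lists of rows."""
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def rank(M):
    """Rank by Gaussian elimination over the rationals."""
    M = [[Fraction(x) for x in row] for row in M]
    rows, cols = len(M), len(M[0])
    r = 0
    for c in range(cols):
        pivot = next((i for i in range(r, rows) if M[i][c] != 0), None)
        if pivot is None:
            continue
        M[r], M[pivot] = M[pivot], M[r]
        for i in range(r + 1, rows):
            f = M[i][c] / M[r][c]
            M[i] = [a - f * b for a, b in zip(M[i], M[r])]
        r += 1
    return r


def kernel_dimensions(A, lam):
    """Return [dim ker N, dim ker N^2, ..., dim ker N^n] for N = A - lam*I."""
    n = len(A)
    N = [[A[i][j] - (lam if i == j else 0) for j in range(n)]
         for i in range(n)]
    dims, power = [], N
    for _ in range(n):
        dims.append(n - rank(power))
        power = mat_mul(power, N)
    return dims


def min_poly_exponent(A, lam):
    """Exponent of (t - lam) in m_A: first power where the kernel is full."""
    dims = kernel_dimensions(A, lam)
    if dims[-1] == 0:
        return 0  # lam is not an eigenvalue
    return dims.index(dims[-1]) + 1


B = [[3, 1, 0],
     [0, 3, 0],
     [0, 0, -1]]
print(kernel_dimensions(B, 3), min_poly_exponent(B, 3))
print(kernel_dimensions(B, -1), min_poly_exponent(B, -1))
```

For \(\lambda=3\) the dimensions are \(1,2,2\), so the kernel stabilizes at the second power. For \(\lambda=-1\) they are \(1,1,1\), so the exponent is \(1\). This agrees with \(m_B(t)=(t-3)^2(t+1)\).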

## 69.17 Minimal Polynomial of a Projection

A projection satisfies

$$
P^2=P.
$$

Therefore

$$
P^2-P=0,
$$

so

$$
P(P-I)=0.
$$

Thus the minimal polynomial divides

$$
t(t-1).
$$

If \(P\neq 0\) and \(P\neq I\), then both eigenvalues \(0\) and \(1\) occur, and

$$
m_P(t)=t(t-1).
$$

In every case \(m_P\) is a product of distinct linear factors (it is \(t\) when \(P=0\) and \(t-1\) when \(P=I\)), so every projection is diagonalizable over any field.
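
A small worked example, chosen here only for illustration, makes this concrete. Take

$$
P=
\begin{bmatrix}
1 & 1 \\
0 & 0
\end{bmatrix},
\qquad
P^2=P,
\qquad
P(P-I)=
\begin{bmatrix}
1 & 1 \\
0 & 0
\end{bmatrix}
\begin{bmatrix}
0 & 1 \\
0 & -1
\end{bmatrix}
=0.
$$

The vectors \((1,0)^T\) and \((1,-1)^T\) are eigenvectors for the eigenvalues \(1\) and \(0\). With these as the columns of \(Q\),

$$
Q=
\begin{bmatrix}
1 & 1 \\
0 & -1
\end{bmatrix},
\qquad
Q^{-1}PQ=
\begin{bmatrix}
1 & 0 \\
0 & 0
\end{bmatrix}.
$$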

## 69.18 Minimal Polynomial of an Involution

An involution satisfies

$$
A^2=I.
$$

Therefore

$$
A^2-I=0.
$$

So the minimal polynomial divides

$$
t^2-1=(t-1)(t+1).
$$

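If the field has characteristic not equal to \(2\), these factors are distinct. Both are linear, so the minimal polynomial splits automatically, and every involution is diagonalizable over such a field. In characteristic \(2\), \(t^2-1=(t-1)^2\), and an involution need not be diagonalizable.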

The eigenvalues of an involution are among

$$
1
\qquad
\text{and}
\qquad
-1.
$$

## 69.19 Minimal Polynomial of a Nilpotent Matrix

A matrix \(N\) is nilpotent if

$$
N^k=0
$$

for some positive integer \(k\).

The minimal polynomial of \(N\) has the form

$$
m_N(t)=t^s,
$$

where \(s\) is the smallest positive integer such that

$$
N^s=0.
$$

This integer \(s\) is called the index of nilpotency.

If \(N\) is in Jordan form, \(s\) is the size of the largest nilpotent Jordan block.

For example, if

$$
N=
\begin{bmatrix}
0 & 1 & 0 \\
0 & 0 & 1 \\
0 & 0 & 0
\end{bmatrix},
$$

then

$$
N^3=0
$$

but

$$
N^2\neq 0.
$$

Thus

$$
m_N(t)=t^3.
$$

## 69.20 Minimal Polynomial and Matrix Inverses

The minimal polynomial can express the inverse of an invertible matrix as a polynomial in the matrix.

Suppose

$$
m_A(t)=t^k+c_{k-1}t^{k-1}+\cdots+c_1t+c_0.
$$

If \(A\) is invertible, then \(0\) is not an eigenvalue, hence not a root of \(m_A\), so the constant term \(c_0=m_A(0)\) satisfies

$$
c_0\neq 0.
$$

Since

$$
m_A(A)=0,
$$

we have

$$
A^k+c_{k-1}A^{k-1}+\cdots+c_1A+c_0I=0.
$$

Rearrange:

$$
c_0I=-A(A^{k-1}+c_{k-1}A^{k-2}+\cdots+c_1I).
$$

Multiply both sides by \(c_0^{-1}A^{-1}\):

$$
A^{-1} =
-\frac{1}{c_0}
\left(
A^{k-1}+c_{k-1}A^{k-2}+\cdots+c_1I
\right).
$$

Thus the inverse is a polynomial in \(A\).
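
The formula can be tested on a small matrix. The following sketch assumes the coefficients of \(m_A\) are already known and given lowest degree first; the function name `inverse_from_min_poly` is hypothetical. It evaluates \(A^{k-1}+c_{k-1}A^{k-2}+\cdots+c_1I\) by Horner's rule and scales by \(-1/c_0\).

```python
from fractions import Fraction


def mat_mul(X, Y):
    """Multiply two square matrices given as lists of rows."""
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def inverse_from_min_poly(A, coeffs):
    """Inverse of A from a monic annihilating polynomial.

    coeffs = [c_0, c_1, ..., c_{k-1}, 1], lowest degree first.
    """
    n = len(A)
    c0 = Fraction(coeffs[0])
    if c0 == 0:
        raise ValueError("c_0 = 0, so this polynomial gives no inverse")
    # Horner evaluation of q(A), where q(t) = c_1 + c_2 t + ... + t^(k-1).
    q = [[Fraction(0)] * n for _ in range(n)]
    for c in reversed(coeffs[1:]):
        q = mat_mul(q, A)
        for i in range(n):
            q[i][i] += c
    return [[-x / c0 for x in row] for row in q]


A = [[2, 1],
     [0, 2]]
# m_A(t) = (t - 2)^2 = t^2 - 4t + 4, from Section 69.9.
A_inv = inverse_from_min_poly(A, [4, -4, 1])
print(A_inv)
print(mat_mul(A, A_inv))
```

For the Jordan block of Section 69.9 this gives \(A^{-1}=-\tfrac14(A-4I)\), and the second printed matrix is the identity. The same computation works with any monic annihilating polynomial whose constant term is nonzero, for instance the characteristic polynomial.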

## 69.21 Minimal Polynomial and Cyclic Vectors

A vector \(v\) is called cyclic for \(A\) if

$$
v,Av,A^2v,\ldots,A^{n-1}v
$$

span the whole space.

If \(A\) has a cyclic vector, then the minimal polynomial and characteristic polynomial are equal.

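This happens because a nonzero polynomial \(p\) of degree less than \(n\) with \(p(A)=0\) would give \(p(A)v=0\), a nontrivial linear relation among \(v,Av,\ldots,A^{n-1}v\). These \(n\) vectors span an \(n\)-dimensional space, so they form a basis and no such relation exists. Hence \(\deg m_A\ge n\), and since \(m_A\) divides the characteristic polynomial, which is monic of degree \(n\), the two are equal.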

Companion matrices provide standard examples where the minimal polynomial equals the characteristic polynomial.
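
A companion matrix and its cyclic vector can be written down explicitly. The sketch below uses one common convention, ones on the subdiagonal and the negated coefficients in the last column; other texts use the transpose. The names `companion` and `krylov` are hypothetical.

```python
def companion(coeffs):
    """Companion matrix of t^n + c_{n-1} t^{n-1} + ... + c_1 t + c_0.

    coeffs = [c_0, ..., c_{n-1}]. Convention assumed here: ones on the
    subdiagonal, negated coefficients in the last column.
    """
    n = len(coeffs)
    C = [[0] * n for _ in range(n)]
    for i in range(1, n):
        C[i][i - 1] = 1
    for i in range(n):
        C[i][n - 1] = -coeffs[i]
    return C


def mat_vec(A, v):
    """Matrix-vector product."""
    return [sum(a * x for a, x in zip(row, v)) for row in A]


def krylov(A, v):
    """Return the list [v, Av, A^2 v, ..., A^(n-1) v]."""
    vectors = [v]
    for _ in range(len(A) - 1):
        vectors.append(mat_vec(A, vectors[-1]))
    return vectors


# p(t) = t^3 - 5t^2 + 3t + 9 = (t - 3)^2 (t + 1)
C = companion([9, 3, -5])
e1 = [1, 0, 0]
print(krylov(C, e1))
```

With this convention \(Ce_1=e_2\) and \(Ce_2=e_3\), so the vectors printed are the standard basis and \(e_1\) is cyclic. Applying \(C\) once more gives \(Ce_3=-9e_1-3e_2+5e_3\), which encodes the relation \(p(C)e_1=0\).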

## 69.22 Minimal Polynomial of a Linear Transformation

Let

$$
T:V\to V
$$

be a linear transformation on a finite-dimensional vector space.

The minimal polynomial \(m_T(t)\) is the unique monic polynomial of least degree satisfying

$$
m_T(T)=0.
$$

If \(A\) is the matrix of \(T\) in some basis, then

$$
m_T(t)=m_A(t).
$$

Changing basis replaces \(A\) by a similar matrix

$$
B=P^{-1}AP.
$$

For any polynomial \(p\),

$$
p(B)=P^{-1}p(A)P.
$$

Thus

$$
p(B)=0
$$

if and only if

$$
p(A)=0.
$$

Therefore similar matrices have the same minimal polynomial.
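
The identity \(p(B)=P^{-1}p(A)P\) used above follows from a telescoping product. For each power,

$$
B^j=(P^{-1}AP)(P^{-1}AP)\cdots(P^{-1}AP)=P^{-1}A^jP,
$$

because each interior product \(PP^{-1}\) cancels. Writing \(p(t)=\sum_{j=0}^{k}a_jt^j\) and using linearity,

$$
p(B)=\sum_{j=0}^{k}a_jB^j
=\sum_{j=0}^{k}a_jP^{-1}A^jP
=P^{-1}\left(\sum_{j=0}^{k}a_jA^j\right)P
=P^{-1}p(A)P.
$$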

## 69.23 What the Minimal Polynomial Does Not Determine

The minimal polynomial does not determine the matrix completely.

For example,

$$
A=
\begin{bmatrix}
2 & 0 \\
0 & 2
\end{bmatrix}
$$

has minimal polynomial

$$
t-2.
$$

A \(3\times 3\) scalar matrix

$$
B=
\begin{bmatrix}
2 & 0 & 0 \\
0 & 2 & 0 \\
0 & 0 & 2
\end{bmatrix}
$$

also has minimal polynomial

$$
t-2.
$$

The matrices have different sizes, so the minimal polynomial does not even determine \(n\).

Even among matrices of the same size, the minimal polynomial may fail to determine all Jordan block multiplicities. It gives the largest block size for each eigenvalue, but not the number of smaller blocks.

To recover full Jordan structure, one needs more information, such as the dimensions of the kernels of powers of \(A-\lambda I\).
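
A concrete pair of \(4\times 4\) nilpotent matrices, chosen here for illustration, shows the ambiguity:

$$
N_1=J_2(0)\oplus J_2(0),
\qquad
N_2=J_2(0)\oplus J_1(0)\oplus J_1(0).
$$

Both have largest block size \(2\), so

$$
m_{N_1}(t)=m_{N_2}(t)=t^2,
\qquad
p_{N_1}(t)=p_{N_2}(t)=t^4.
$$

The two matrices are not similar, since the kernel dimensions differ:

$$
\dim\ker N_1=2,
\qquad
\dim\ker N_2=3.
$$

Here even the minimal and characteristic polynomials together fail to determine the Jordan form.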

## 69.24 Summary

The minimal polynomial of a square matrix \(A\) is the unique monic polynomial of least degree satisfying

$$
m_A(A)=0.
$$

It divides every polynomial that annihilates \(A\), including the characteristic polynomial.

Over an algebraically closed field, it has the same distinct roots as the characteristic polynomial. Its exponent at each eigenvalue equals the size of the largest Jordan block for that eigenvalue.

The minimal polynomial gives a compact test for diagonalization: \(A\) is diagonalizable exactly when \(m_A(t)\) splits into distinct linear factors.

Its degree never exceeds that of the characteristic polynomial and is often smaller, yet it carries the information needed to decide diagonalizability, which the characteristic polynomial alone does not.