The Cayley-Hamilton theorem states that every square matrix satisfies its own characteristic equation.
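If $A$ is an $n \times n$ matrix and its characteristic polynomial is
$$p_A(\lambda) = \det(\lambda I - A),$$
then the theorem says
$$p_A(A) = 0.$$
This means that if
$$p_A(\lambda) = \lambda^n + c_{n-1}\lambda^{n-1} + \cdots + c_1\lambda + c_0,$$
then
$$A^n + c_{n-1}A^{n-1} + \cdots + c_1 A + c_0 I = 0.$$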
The theorem is a basic bridge between determinants, eigenvalues, matrix powers, and minimal polynomials. Standard statements of the theorem say that the characteristic polynomial of an $n \times n$ matrix annihilates that matrix.
70.1 Matrix Polynomials
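Let
$$f(\lambda) = a_m\lambda^m + a_{m-1}\lambda^{m-1} + \cdots + a_1\lambda + a_0$$
be a polynomial.
For a square matrix $A$, define
$$f(A) = a_m A^m + a_{m-1}A^{m-1} + \cdots + a_1 A + a_0 I.$$
The identity matrix appears in the constant term so that every term is an $n \times n$ matrix.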
For example, if
$$f(\lambda) = \lambda^2 - 3\lambda + 2,$$
then
$$f(A) = A^2 - 3A + 2I.$$
A polynomial $f$ annihilates $A$ if
$$f(A) = 0.$$
The Cayley-Hamilton theorem says that the characteristic polynomial always annihilates the matrix.
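Matrix polynomial evaluation can be sketched in a few lines of plain Python. The helper names `mat_mul` and `poly_at_matrix`, the sample matrix, and the sample polynomial are illustrative assumptions, not part of the text above. The sketch uses Horner's rule and adds each coefficient only on the diagonal, which is exactly the role of the identity matrix in the constant term.

```python
def mat_mul(X, Y):
    """Product of two square matrices stored as lists of rows."""
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def poly_at_matrix(coeffs, A):
    """Evaluate a_0 I + a_1 A + ... + a_m A^m by Horner's rule.

    coeffs lists a_0, a_1, ..., a_m (lowest degree first).
    """
    n = len(A)
    result = [[0] * n for _ in range(n)]
    for a in reversed(coeffs):
        result = mat_mul(result, A)   # multiply the running value by A
        for i in range(n):
            result[i][i] += a         # add a * I, not the bare scalar a
    return result


A = [[1, 2], [3, 4]]                      # arbitrary sample matrix
# f(lambda) = lambda^2 - 3*lambda + 2, so f(A) = A^2 - 3A + 2I
f_of_A = poly_at_matrix([2, -3, 1], A)    # [[6, 4], [6, 12]]
```

Here `f_of_A` is not zero, so this particular polynomial does not annihilate the sample matrix.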
70.2 Statement of the Theorem
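Let $A \in M_n(F)$, where $F$ is a field. Let
$$p_A(\lambda) = \det(\lambda I - A)$$
be the characteristic polynomial of $A$.
If
$$p_A(\lambda) = \lambda^n + c_{n-1}\lambda^{n-1} + \cdots + c_1\lambda + c_0,$$
then
$$A^n + c_{n-1}A^{n-1} + \cdots + c_1 A + c_0 I = 0.$$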
This is the Cayley-Hamilton theorem.
It says that the matrix is a root of its own characteristic polynomial, in the sense of matrix polynomial evaluation.
70.3 A Two by Two Form
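For a $2 \times 2$ matrix
$$A = \begin{pmatrix} a & b \\ c & d \end{pmatrix},$$
the characteristic polynomial is
$$p_A(\lambda) = \lambda^2 - (a + d)\lambda + (ad - bc) = \lambda^2 - \operatorname{tr}(A)\lambda + \det(A).$$
Therefore the Cayley-Hamilton theorem gives
$$A^2 - \operatorname{tr}(A)A + \det(A)I = 0.$$
This compact identity holds for every $2 \times 2$ matrix.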
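For example, if
$$A = \begin{pmatrix} 1 & 2 \\ 3 & 4 \end{pmatrix},$$
then
$$\operatorname{tr}(A) = 1 + 4 = 5$$
and
$$\det(A) = 1 \cdot 4 - 2 \cdot 3 = -2.$$
So the theorem predicts
$$A^2 - 5A - 2I = 0.$$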
This example is a common concrete illustration of the theorem.
70.4 Direct Verification in a Two by Two Example
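Let
$$A = \begin{pmatrix} 1 & 2 \\ 3 & 4 \end{pmatrix}.$$
Compute
$$A^2 = \begin{pmatrix} 1 & 2 \\ 3 & 4 \end{pmatrix}\begin{pmatrix} 1 & 2 \\ 3 & 4 \end{pmatrix} = \begin{pmatrix} 7 & 10 \\ 15 & 22 \end{pmatrix}.$$
Now compute
$$5A = \begin{pmatrix} 5 & 10 \\ 15 & 20 \end{pmatrix}, \qquad 2I = \begin{pmatrix} 2 & 0 \\ 0 & 2 \end{pmatrix}.$$
Then
$$A^2 - 5A - 2I = \begin{pmatrix} 7 - 5 - 2 & 10 - 10 \\ 15 - 15 & 22 - 20 - 2 \end{pmatrix} = \begin{pmatrix} 0 & 0 \\ 0 & 0 \end{pmatrix}.$$
Thus
$$p_A(A) = A^2 - \operatorname{tr}(A)A + \det(A)I = 0.$$
So $A$ satisfies its characteristic equation.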
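The same hand check can be repeated mechanically for any two by two matrix. The sketch below is a minimal illustration; the function name `cayley_hamilton_residual_2x2` and the sample matrices are assumptions chosen for this example. It forms $A^2 - \operatorname{tr}(A)A + \det(A)I$ entry by entry and should return the zero matrix every time.

```python
def cayley_hamilton_residual_2x2(A):
    """Return A^2 - tr(A) A + det(A) I for a 2x2 matrix A (list of rows)."""
    (a, b), (c, d) = A
    tr = a + d
    det = a * d - b * c
    # Entries of A^2, written out explicitly.
    A2 = [[a * a + b * c, a * b + b * d],
          [c * a + d * c, c * b + d * d]]
    return [[A2[0][0] - tr * a + det, A2[0][1] - tr * b],
            [A2[1][0] - tr * c,       A2[1][1] - tr * d + det]]


# Arbitrary sample matrices, including a nilpotent one.
samples = [[[1, 2], [3, 4]], [[0, 1], [0, 0]], [[2, -7], [5, 3]]]
residuals = [cayley_hamilton_residual_2x2(A) for A in samples]
```

Integer entries keep the arithmetic exact, so a nonzero residual would indicate a genuine error rather than rounding.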
70.5 Meaning of the Theorem
The theorem says that high powers of a matrix are not independent forever.
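For an $n \times n$ matrix, the characteristic polynomial gives a relation among
$$I, A, A^2, \ldots, A^n.$$
If
$$p_A(\lambda) = \lambda^n + c_{n-1}\lambda^{n-1} + \cdots + c_1\lambda + c_0,$$
then
$$A^n = -c_{n-1}A^{n-1} - \cdots - c_1 A - c_0 I.$$
Thus every power $A^k$ with $k \ge n$ can be reduced to a linear combination of
$$I, A, A^2, \ldots, A^{n-1}.$$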
This makes the theorem useful for matrix powers, recurrence relations, matrix functions, and inverse formulas.
70.6 Relation to the Minimal Polynomial
The minimal polynomial $m_A(\lambda)$ is the monic polynomial of least degree satisfying
$$m_A(A) = 0.$$
The Cayley-Hamilton theorem says
$$p_A(A) = 0.$$
Therefore the characteristic polynomial is an annihilating polynomial.
Since the minimal polynomial divides every annihilating polynomial, we get
$$m_A(\lambda) \mid p_A(\lambda).$$
This is one of the main consequences of Cayley-Hamilton: the minimal polynomial always divides the characteristic polynomial.
70.7 Eigenvalue Interpretation
Suppose
$$Av = \lambda v$$
with
$$v \ne 0.$$
For any polynomial $f$,
$$f(A)v = f(\lambda)v.$$
If $f = p_A$, then
$$p_A(\lambda) = 0$$
because $\lambda$ is an eigenvalue. Hence
$$p_A(A)v = 0$$
for every eigenvector $v$.
This observation suggests the theorem, but it does not prove it in general. Eigenvectors may not form a basis. Some matrices are defective. The Cayley-Hamilton theorem is stronger: it says $p_A(A) = 0$ as a matrix, even when there are not enough eigenvectors.
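A small defective matrix makes the gap concrete. The sketch below uses the matrix with rows $(2, 1)$ and $(0, 2)$, an illustrative choice that does not come from the text above, along with the hypothetical helpers `mat_mul` and `mat_vec`. Its eigenvectors span only one direction, yet its characteristic polynomial $(\lambda - 2)^2$ still annihilates it.

```python
def mat_mul(X, Y):
    """Product of two square matrices stored as lists of rows."""
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def mat_vec(X, v):
    """Product of a matrix with a column vector given as a list."""
    return [sum(X[i][k] * v[k] for k in range(len(v))) for i in range(len(X))]


A = [[2, 1], [0, 2]]   # eigenvalue 2 with a single eigenvector direction
B = [[A[i][j] - (2 if i == j else 0) for j in range(2)] for i in range(2)]  # A - 2I

# Eigenvectors for eigenvalue 2 are the nonzero solutions of B v = 0.
on_eigenvector = mat_vec(B, [1, 0])    # [0, 0]: (1, 0) is an eigenvector
off_eigenvector = mat_vec(B, [0, 1])   # [1, 0]: (0, 1) is not an eigenvector

# p_A(lambda) = (lambda - 2)^2, so p_A(A) = B^2.
p_of_A = mat_mul(B, B)                 # [[0, 0], [0, 0]]
```

The eigenvector argument only shows that $p_A(A)$ kills the vector $(1, 0)$; the computation shows it kills $(0, 1)$ as well.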
70.8 Proof Idea Using Diagonalization
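If $A$ is diagonalizable, the theorem is easy.
Suppose
$$A = PDP^{-1},$$
where
$$D = \operatorname{diag}(\lambda_1, \lambda_2, \ldots, \lambda_n).$$
For any polynomial $f$,
$$f(A) = P f(D) P^{-1} = P \operatorname{diag}(f(\lambda_1), \ldots, f(\lambda_n)) P^{-1}.$$
The characteristic polynomial is
$$p_A(\lambda) = (\lambda - \lambda_1)(\lambda - \lambda_2)\cdots(\lambda - \lambda_n),$$
counting multiplicities.
Then
$$p_A(D) = \operatorname{diag}(p_A(\lambda_1), \ldots, p_A(\lambda_n)).$$
Each diagonal entry is zero, so
$$p_A(D) = 0.$$
Therefore
$$p_A(A) = P\, p_A(D)\, P^{-1} = 0.$$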
This proves Cayley-Hamilton for diagonalizable matrices. The full theorem requires an argument that also covers defective matrices.
70.9 Proof Idea Using Jordan Form
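Over $\mathbb{C}$, every square matrix has a Jordan form:
$$A = PJP^{-1}.$$
The characteristic polynomial of $A$ is the same as that of $J$. Each Jordan block of size $k$ for eigenvalue $\mu$ has the form
$$J_k(\mu) = \mu I + N,$$
where
$$N^k = 0.$$
The characteristic polynomial contains the factor
$$(\lambda - \mu)^k$$
for that block. Evaluating this factor at the block gives $(J_k(\mu) - \mu I)^k = N^k = 0$, so when the characteristic polynomial is evaluated on the block, the nilpotent part is killed.
Since every block is killed, the whole Jordan matrix is killed:
$$p_A(J) = 0.$$
Then
$$p_A(A) = P\, p_A(J)\, P^{-1} = 0.$$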
This proof shows the structural reason behind the theorem. The characteristic polynomial contains enough powers of each factor to kill every Jordan block.
70.10 Proof Idea Using the Adjugate
There is also a determinant-based proof.
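For a square matrix $B$,
$$B \operatorname{adj}(B) = \det(B) I.$$
Apply this identity to
$$B = \lambda I - A.$$
Then
$$(\lambda I - A)\operatorname{adj}(\lambda I - A) = \det(\lambda I - A) I.$$
The right side is
$$p_A(\lambda) I.$$
The adjugate matrix has polynomial entries in $\lambda$. Expanding both sides and comparing coefficients gives a matrix polynomial identity. Substituting correctly then yields
$$p_A(A) = 0.$$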
This adjugate proof is a standard route to the theorem, with care needed because one cannot naively substitute a matrix into every occurrence of the scalar variable before establishing the polynomial identity.
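A computational counterpart of the coefficient comparison is the Faddeev-LeVerrier recursion, a standard algorithm that is not part of the text above and is brought in here only as an illustration. Starting from $M_1 = I$, it sets $c_{n-k} = -\operatorname{tr}(AM_k)/k$ and $M_{k+1} = AM_k + c_{n-k}I$. The last matrix it produces equals $p_A(A)$, so the recursion ending at zero is Cayley-Hamilton in action. The function name and the sample matrix are assumptions for this sketch.

```python
from fractions import Fraction


def mat_mul(X, Y):
    """Product of two square matrices stored as lists of rows."""
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def faddeev_leverrier(A):
    """Return (coeffs, M) for an n x n matrix A.

    coeffs = [c_0, ..., c_{n-1}] are the non-leading coefficients of
    det(lambda I - A) = lambda^n + c_{n-1} lambda^{n-1} + ... + c_0.
    M is the final matrix of the recursion, which equals p_A(A).
    """
    n = len(A)
    A = [[Fraction(x) for x in row] for row in A]
    M = [[Fraction(int(i == j)) for j in range(n)] for i in range(n)]  # M_1 = I
    coeffs = [Fraction(0)] * n
    for k in range(1, n + 1):
        AM = mat_mul(A, M)
        c = -sum(AM[i][i] for i in range(n)) / k     # c_{n-k} = -tr(A M_k) / k
        coeffs[n - k] = c
        # M_{k+1} = A M_k + c_{n-k} I
        M = [[AM[i][j] + (c if i == j else 0) for j in range(n)]
             for i in range(n)]
    return coeffs, M


A = [[2, 1, 0], [0, 2, 0], [1, 0, 3]]   # arbitrary sample, p_A = (lambda-3)(lambda-2)^2
coeffs, residual = faddeev_leverrier(A)
# coeffs == [-12, 16, -7], i.e. lambda^3 - 7 lambda^2 + 16 lambda - 12
# residual is the 3x3 zero matrix
```

Exact rational arithmetic via `Fraction` is used because of the division by $k$; with floating point the residual would only be approximately zero.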
70.11 Why Naive Substitution Can Mislead
The characteristic polynomial is
$$p_A(\lambda) = \det(\lambda I - A).$$
One might try to prove the theorem by writing
$$p_A(A) = \det(AI - A) = \det(0) = 0.$$
This is not a valid argument.
The expression
$$p_A(A)$$
means evaluating a scalar polynomial at the matrix $A$:
$$p_A(A) = A^n + c_{n-1}A^{n-1} + \cdots + c_1 A + c_0 I.$$
It does not mean taking the determinant of a block-like expression obtained by replacing $\lambda$ inside $\det(\lambda I - A)$ with a matrix.
The determinant expression first produces a scalar polynomial. Only after that polynomial is formed may it be evaluated at $A$. This distinction matters in rigorous proofs.
70.12 Reducing Powers
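Suppose
$$p_A(\lambda) = \lambda^2 + c_1\lambda + c_0.$$
By Cayley-Hamilton,
$$A^2 + c_1 A + c_0 I = 0.$$
So
$$A^2 = -c_1 A - c_0 I.$$
Multiplying both sides by $A$ gives
$$A^3 = -c_1 A^2 - c_0 A.$$
Then reduce $A^2$ again using the same relation:
$$A^3 = -c_1(-c_1 A - c_0 I) - c_0 A = (c_1^2 - c_0)A + c_1 c_0 I.$$
Repeating this process expresses every high power of $A$ as a linear combination of lower powers.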
70.13 Example: Reducing Powers
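Let
$$A = \begin{pmatrix} 1 & 2 \\ 3 & 4 \end{pmatrix}.$$
We found
$$A^2 - 5A - 2I = 0.$$
Thus
$$A^2 = 5A + 2I.$$
To compute $A^3$,
$$A^3 = A \cdot A^2.$$
Substitute:
$$A^3 = A(5A + 2I) = 5A^2 + 2A.$$
Reduce $A^2$:
$$5A^2 + 2A = 5(5A + 2I) + 2A = 27A + 10I.$$
Thus
$$A^3 = 27A + 10I.$$
So without multiplying three matrices directly, we obtain
$$A^3 = 27\begin{pmatrix} 1 & 2 \\ 3 & 4 \end{pmatrix} + 10\begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}.$$
Therefore
$$A^3 = \begin{pmatrix} 37 & 54 \\ 81 & 118 \end{pmatrix}.$$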
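The reduction can be automated for two by two matrices by tracking only two scalars. The sketch below assumes the relation $A^2 = \operatorname{tr}(A)A - \det(A)I$ and uses a hypothetical helper `power_as_combination`; the trace $5$ and determinant $-2$ passed in belong to the matrix with rows $(1, 2)$ and $(3, 4)$.

```python
def power_as_combination(tr, det, k):
    """Return (alpha, beta) with A^k = alpha*A + beta*I for a 2x2 matrix A.

    Uses only A^2 = tr*A - det*I, where tr and det are the trace and
    determinant of A.
    """
    alpha, beta = 0, 1            # A^0 = 0*A + 1*I
    for _ in range(k):
        # A * (alpha*A + beta*I) = alpha*A^2 + beta*A
        #                        = (alpha*tr + beta)*A - alpha*det*I
        alpha, beta = alpha * tr + beta, -alpha * det
    return alpha, beta


alpha, beta = power_as_combination(5, -2, 3)      # (27, 10)
A = [[1, 2], [3, 4]]
A_cubed = [[alpha * A[i][j] + (beta if i == j else 0) for j in range(2)]
           for i in range(2)]                     # [[37, 54], [81, 118]]
```

Each step costs a constant number of scalar operations, so the coefficients of $A^k$ are obtained without forming any intermediate matrix product.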
70.14 Formula for the Inverse
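If $A$ is invertible, then Cayley-Hamilton gives a formula for $A^{-1}$ as a polynomial in $A$.
Let
$$p_A(\lambda) = \lambda^n + c_{n-1}\lambda^{n-1} + \cdots + c_1\lambda + c_0.$$
Since
$$c_0 = p_A(0) = \det(-A) = (-1)^n \det(A),$$
invertibility implies
$$c_0 \ne 0.$$
By Cayley-Hamilton,
$$A^n + c_{n-1}A^{n-1} + \cdots + c_1 A + c_0 I = 0.$$
Factor out $A$ from all terms except the constant term:
$$A\left(A^{n-1} + c_{n-1}A^{n-2} + \cdots + c_1 I\right) = -c_0 I.$$
Hence
$$A \cdot \left(-\frac{1}{c_0}\right)\left(A^{n-1} + c_{n-1}A^{n-2} + \cdots + c_1 I\right) = I.$$
Therefore
$$A^{-1} = -\frac{1}{c_0}\left(A^{n-1} + c_{n-1}A^{n-2} + \cdots + c_1 I\right).$$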
This gives an exact inverse formula, although it is usually not the best numerical method for computing inverses.
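As a minimal sketch of the formula, the code below builds $A^{n-1} + c_{n-1}A^{n-2} + \cdots + c_1 I$ by Horner's rule and divides by $-c_0$. The function name `inverse_from_char_poly` is hypothetical, and the coefficients are assumed to be supplied correctly by the caller. For the triangular sample matrix they can be read off from $(\lambda - 1)(\lambda - 2)(\lambda - 3) = \lambda^3 - 6\lambda^2 + 11\lambda - 6$.

```python
from fractions import Fraction


def mat_mul(X, Y):
    """Product of two square matrices stored as lists of rows."""
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def inverse_from_char_poly(A, coeffs):
    """Invert A using its characteristic polynomial.

    coeffs = [c_0, c_1, ..., c_{n-1}] for the monic polynomial
    lambda^n + c_{n-1} lambda^{n-1} + ... + c_0 (trusted, not recomputed).
    """
    n = len(A)
    c0 = Fraction(coeffs[0])
    if c0 == 0:
        raise ValueError("c_0 = 0, so the matrix is singular")
    M = [[Fraction(int(i == j)) for j in range(n)] for i in range(n)]
    for c in reversed(coeffs[1:]):            # c_{n-1}, ..., c_1
        AM = mat_mul(A, M)
        M = [[AM[i][j] + (c if i == j else 0) for j in range(n)]
             for i in range(n)]
    # Now M = A^{n-1} + c_{n-1} A^{n-2} + ... + c_1 I and A M = -c_0 I.
    return [[-M[i][j] / c0 for j in range(n)] for i in range(n)]


A = [[1, 1, 0], [0, 2, 1], [0, 0, 3]]
A_inv = inverse_from_char_poly(A, [-6, 11, -6])
product = mat_mul(A, A_inv)                   # the 3x3 identity
```

Rational arithmetic keeps the result exact, which suits the algebraic nature of the formula; this is a demonstration, not a recommended numerical inversion routine.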
70.15 Example: Inverse from Cayley-Hamilton
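For a $2 \times 2$ matrix,
$$A^2 - \operatorname{tr}(A)A + \det(A)I = 0.$$
If $A$ is invertible, then
$$\det(A) \ne 0.$$
Rewrite:
$$A^2 - \operatorname{tr}(A)A = -\det(A)I.$$
Factor:
$$A\left(A - \operatorname{tr}(A)I\right) = -\det(A)I.$$
Thus
$$A^{-1} = \frac{1}{\det(A)}\left(\operatorname{tr}(A)I - A\right).$$
For
$$A = \begin{pmatrix} a & b \\ c & d \end{pmatrix},$$
this becomes
$$\operatorname{tr}(A)I - A = \begin{pmatrix} a + d & 0 \\ 0 & a + d \end{pmatrix} - \begin{pmatrix} a & b \\ c & d \end{pmatrix} = \begin{pmatrix} d & -b \\ -c & a \end{pmatrix}.$$
Hence
$$A^{-1} = \frac{1}{ad - bc}\begin{pmatrix} d & -b \\ -c & a \end{pmatrix}.$$
This recovers the standard inverse formula for $2 \times 2$ matrices.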
70.16 Recurrence Relations
Cayley-Hamilton gives recurrence relations for matrix sequences.
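If
$$p_A(\lambda) = \lambda^n + c_{n-1}\lambda^{n-1} + \cdots + c_1\lambda + c_0,$$
then
$$A^n = -c_{n-1}A^{n-1} - \cdots - c_1 A - c_0 I.$$
Multiplying by $A^k$ gives
$$A^{n+k} = -c_{n-1}A^{n+k-1} - \cdots - c_1 A^{k+1} - c_0 A^k.$$
Thus the sequence
$$I, A, A^2, A^3, \ldots$$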
satisfies a linear recurrence whose coefficients come from the characteristic polynomial.
This is useful in discrete dynamical systems and linear recurrences.
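A familiar instance, chosen here purely for illustration and not taken from the text above, is the matrix with rows $(1, 1)$ and $(1, 0)$. Its characteristic polynomial is $\lambda^2 - \lambda - 1$, so its powers satisfy $A^{k+2} = A^{k+1} + A^k$, and each entry of the powers satisfies the Fibonacci recurrence. The sketch builds the powers by the recurrence alone, using addition only.

```python
def mat_mul(X, Y):
    """Product of two square matrices stored as lists of rows."""
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


A = [[1, 1], [1, 0]]      # tr(A) = 1, det(A) = -1, p_A = lambda^2 - lambda - 1

powers = [[[1, 0], [0, 1]], A]                 # A^0 and A^1
for _ in range(8):
    prev, last = powers[-2], powers[-1]
    # A^{k+2} = tr(A) A^{k+1} - det(A) A^k = A^{k+1} + A^k
    powers.append([[last[i][j] + prev[i][j] for j in range(2)]
                   for i in range(2)])

# The (0, 1) entry of A^k for k = 0, ..., 9.
top_right = [P[0][1] for P in powers]          # [0, 1, 1, 2, 3, 5, 8, 13, 21, 34]
```

The list `powers` agrees with the powers obtained by repeated multiplication, which can be confirmed with `mat_mul`.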
70.17 Matrix Functions
Cayley-Hamilton also simplifies matrix functions.
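If $f$ is a polynomial, one can divide $f$ by $p_A$:
$$f(\lambda) = q(\lambda)p_A(\lambda) + r(\lambda),$$
where
$$r = 0 \quad \text{or} \quad \deg r < n.$$
Evaluate at $A$:
$$f(A) = q(A)p_A(A) + r(A).$$
Since
$$p_A(A) = 0,$$
we get
$$f(A) = r(A).$$
Thus every polynomial in $A$ is equivalent to a polynomial of degree less than $n$.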
This idea extends, with additional hypotheses, to analytic functions of matrices.
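The division step for polynomials can be sketched directly on coefficient lists. The helper names, the sample matrix with rows $(1, 2)$ and $(3, 4)$, and the choice $f(\lambda) = \lambda^5$ are assumptions for illustration. The divisor is the characteristic polynomial $\lambda^2 - 5\lambda - 2$ of that matrix, and it is assumed to be monic.

```python
def mat_mul(X, Y):
    """Product of two square matrices stored as lists of rows."""
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def poly_at_matrix(coeffs, A):
    """Evaluate a_0 I + a_1 A + ... by Horner's rule (lowest degree first)."""
    n = len(A)
    result = [[0] * n for _ in range(n)]
    for a in reversed(coeffs):
        result = mat_mul(result, A)
        for i in range(n):
            result[i][i] += a
    return result


def poly_remainder(f, p):
    """Remainder of f modulo a monic polynomial p (lowest degree first)."""
    r = list(f)
    n = len(p) - 1                    # degree of p
    while len(r) > n:
        lead = r.pop()                # leading coefficient of the current r
        shift = len(r) - n            # subtract lead * lambda^shift * p(lambda)
        for i in range(n):
            r[shift + i] -= lead * p[i]
    return r


A = [[1, 2], [3, 4]]
p = [-2, -5, 1]                       # lambda^2 - 5 lambda - 2
f = [0, 0, 0, 0, 0, 1]                # lambda^5
r = poly_remainder(f, p)              # [290, 779], i.e. 779 lambda + 290
same = poly_at_matrix(f, A) == poly_at_matrix(r, A)   # True
```

Evaluating the remainder needs only one scalar multiple of $A$ plus a multiple of $I$, while evaluating $f$ directly needs four matrix products.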
70.18 Determinant and Trace in the Two by Two Case
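For a $2 \times 2$ matrix, Cayley-Hamilton says
$$A^2 - \operatorname{tr}(A)A + \det(A)I = 0.$$
This identity shows how trace and determinant control the second power of $A$.
The trace controls the coefficient of $A$. The determinant controls the constant term.
For higher-dimensional matrices, the coefficients of the characteristic polynomial play analogous roles. Up to sign, the coefficient of $\lambda^{n-k}$ is the sum of the $k \times k$ principal minors of $A$, so the trace and the determinant are the cases $k = 1$ and $k = n$. These coefficients determine how $A^n$ reduces to lower powers.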
70.19 Projection Example
Let $P$ be a projection:
$$P^2 = P.$$
Then
$$P^2 - P = 0.$$
So $P$ is annihilated by
$$\lambda^2 - \lambda = \lambda(\lambda - 1).$$
The Cayley-Hamilton theorem guarantees that the characteristic polynomial also annihilates $P$.
If $P$ projects onto an $r$-dimensional subspace of an $n$-dimensional space, then its eigenvalues are $1$ with multiplicity $r$, and $0$ with multiplicity $n - r$. Hence
$$p_P(\lambda) = (\lambda - 1)^r \lambda^{n-r}.$$
Cayley-Hamilton gives
$$(P - I)^r P^{n-r} = 0.$$
This is true, but the smaller polynomial
$$\lambda(\lambda - 1)$$
already annihilates $P$. This illustrates the difference between the characteristic polynomial and the minimal polynomial.
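A concrete check, under the assumption that $P$ is the orthogonal projection of three-dimensional space onto the line spanned by $(1, 1, 1)$ (an example chosen for this sketch), has $r = 1$ and $n = 3$. The code evaluates both the degree-two polynomial $\lambda(\lambda - 1)$ and the characteristic polynomial $(\lambda - 1)\lambda^2$ at $P$.

```python
from fractions import Fraction


def mat_mul(X, Y):
    """Product of two square matrices stored as lists of rows."""
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


third = Fraction(1, 3)
P = [[third] * 3 for _ in range(3)]      # projection onto span{(1, 1, 1)}
P_minus_I = [[P[i][j] - (1 if i == j else 0) for j in range(3)]
             for i in range(3)]

is_projection = mat_mul(P, P) == P                          # P^2 = P
small_value = mat_mul(P, P_minus_I)                         # lambda(lambda - 1) at P
characteristic_value = mat_mul(P_minus_I, mat_mul(P, P))    # (lambda - 1) lambda^2 at P
```

Both `small_value` and `characteristic_value` are the zero matrix, but only the second has the degree that Cayley-Hamilton predicts.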
70.20 Nilpotent Example
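Let $N$ be nilpotent with
$$N^k = 0 \quad \text{for some } k \ge 1.$$
If $N$ is $n \times n$, all eigenvalues are zero, so the characteristic polynomial is
$$p_N(\lambda) = \lambda^n.$$
Cayley-Hamilton gives
$$N^n = 0.$$
Thus every nilpotent $n \times n$ matrix has nilpotency index at most $n$.
The minimal polynomial may be
$$m_N(\lambda) = \lambda^m$$
for some
$$m \le n.$$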
Again, the minimal polynomial gives the sharp exponent, while the characteristic polynomial gives a universal exponent.
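The bound means a nilpotency test never needs more than $n$ powers. The sketch below uses a hypothetical helper `nilpotency_index` and two sample matrices chosen for illustration: a full three by three Jordan block, where the sharp exponent equals $n = 3$, and a matrix with $N^2 = 0$, where the sharp exponent is smaller.

```python
def mat_mul(X, Y):
    """Product of two square matrices stored as lists of rows."""
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def nilpotency_index(N):
    """Smallest k >= 1 with N^k = 0, or None if N is not nilpotent.

    By Cayley-Hamilton it suffices to check powers up to n.
    """
    n = len(N)
    zero = [[0] * n for _ in range(n)]
    power = N
    for k in range(1, n + 1):
        if power == zero:
            return k
        power = mat_mul(power, N)
    return None


full_block = [[0, 1, 0], [0, 0, 1], [0, 0, 0]]
short_chain = [[0, 1, 0], [0, 0, 0], [0, 0, 0]]

index_full = nilpotency_index(full_block)      # 3, minimal polynomial lambda^3
index_short = nilpotency_index(short_chain)    # 2, minimal polynomial lambda^2
not_nilpotent = nilpotency_index([[1, 0], [0, 0]])   # None
```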
70.21 Consequences
The Cayley-Hamilton theorem has several important consequences.
| Consequence | Meaning |
|---|---|
| The characteristic polynomial annihilates $A$ | $p_A(A) = 0$ |
| The minimal polynomial divides the characteristic polynomial | $m_A(\lambda) \mid p_A(\lambda)$ |
| High powers reduce | $A^k$ for $k \ge n$ reduces to lower powers |
| Inverses become polynomials | If $A$ is invertible, $A^{-1}$ is a polynomial in $A$ |
| Matrix recurrences | Powers of $A$ satisfy a linear recurrence |
| Nilpotent bound | If $N$ is nilpotent $n \times n$, then $N^n = 0$ |
These consequences make the theorem both structural and computational.
70.22 Common Errors
The first common error is to confuse
$$p_A(A)$$
with
$$\det(AI - A).$$
The former is matrix polynomial evaluation. The latter is not the correct interpretation.
The second common error is to forget the identity matrix in the constant term. If
$$f(\lambda) = \lambda^2 - 3\lambda + 2,$$
then
$$f(A) = A^2 - 3A + 2I,$$
not
$$A^2 - 3A + 2.$$
The third common error is to assume the characteristic polynomial is the smallest annihilating polynomial. It may not be. The smallest one is the minimal polynomial.
The fourth common error is to use Cayley-Hamilton as a numerical method for large matrix inverses. The theorem is exact and algebraic, but direct inverse computation through powers is usually unstable and inefficient.
70.23 Summary
The Cayley-Hamilton theorem states that every square matrix satisfies its own characteristic equation.
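If
$$p_A(\lambda) = \det(\lambda I - A),$$
then
$$p_A(A) = 0.$$
In expanded form, if
$$p_A(\lambda) = \lambda^n + c_{n-1}\lambda^{n-1} + \cdots + c_1\lambda + c_0,$$
then
$$A^n + c_{n-1}A^{n-1} + \cdots + c_1 A + c_0 I = 0.$$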
The theorem proves that the characteristic polynomial is an annihilating polynomial. It implies that the minimal polynomial divides the characteristic polynomial, that high powers of a matrix reduce to lower powers, and that invertible matrices have inverses expressible as polynomials in themselves.