
Chapter 72. Matrix Functions

A matrix function is a rule that applies a scalar function to a square matrix.

For a scalar $x$, expressions such as

$$x^2,\qquad e^x,\qquad \sqrt{x},\qquad \log x$$

are ordinary functions of one variable. For a square matrix $A$, one may define analogous expressions:

$$A^2,\qquad e^A,\qquad A^{1/2},\qquad \log A.$$

Matrix functions are important because they let us transfer scalar functions into linear algebra. They appear in differential equations, Markov processes, control theory, numerical analysis, quantum mechanics, optimization, statistics, and graph theory. The matrix exponential, for example, is defined by a power series and is used to solve linear systems of differential equations.

72.1 Polynomial Functions of Matrices

The simplest matrix functions are polynomial functions.

Let

$$p(t)=a_0+a_1t+a_2t^2+\cdots+a_kt^k.$$

For a square matrix $A$, define

$$p(A)=a_0I+a_1A+a_2A^2+\cdots+a_kA^k.$$

The identity matrix $I$ appears in the constant term.

For example, if

$$p(t)=t^3-2t+5,$$

then

$$p(A)=A^3-2A+5I.$$

This definition is purely algebraic. It uses only matrix addition, scalar multiplication, and matrix multiplication.
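
As a concrete check, this polynomial can be evaluated on a matrix directly with NumPy; the matrix below is a minimal sketch with an arbitrary example matrix.

```python
import numpy as np

# Arbitrary example matrix for illustration.
A = np.array([[2.0, 1.0],
              [1.0, 2.0]])

# p(A) = A^3 - 2A + 5I: matrix powers, scalar multiples,
# and the identity matrix in place of the constant term.
I = np.eye(2)
pA = np.linalg.matrix_power(A, 3) - 2 * A + 5 * I
print(pA)
```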

72.2 Powers as Matrix Functions

The function

$$f(t)=t^k$$

gives the matrix function

$$f(A)=A^k.$$

Powers of matrices are basic in discrete dynamical systems. If

$$x_{m+1}=Ax_m,$$

then

$$x_m=A^m x_0.$$

Thus the behavior of the system is controlled by the powers of $A$.

When $A$ is diagonalizable, powers are especially simple. If

$$A=PDP^{-1},$$

then

$$A^k=PD^kP^{-1}.$$

Since $D$ is diagonal, $D^k$ is obtained by raising each diagonal entry to the $k$-th power.
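
A minimal sketch of this computation in NumPy, comparing the eigendecomposition route against direct repeated multiplication (the matrix is an arbitrary diagonalizable example):

```python
import numpy as np

A = np.array([[2.0, 1.0],
              [1.0, 2.0]])   # arbitrary diagonalizable example
k = 5

# Eigendecomposition A = P D P^{-1}.
eigvals, P = np.linalg.eig(A)
Dk = np.diag(eigvals ** k)              # D^k: entrywise powers on the diagonal
Ak_eig = P @ Dk @ np.linalg.inv(P)

# Direct computation for comparison.
Ak_direct = np.linalg.matrix_power(A, k)
print(np.allclose(Ak_eig, Ak_direct))   # True
```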

72.3 Functions of Diagonal Matrices

Let

$$D=\begin{bmatrix} \lambda_1 & 0 & \cdots & 0 \\ 0 & \lambda_2 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & \lambda_n \end{bmatrix}.$$

If $f$ is a scalar function defined at each $\lambda_i$, define

$$f(D)=\begin{bmatrix} f(\lambda_1) & 0 & \cdots & 0 \\ 0 & f(\lambda_2) & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & f(\lambda_n) \end{bmatrix}.$$

Thus a function of a diagonal matrix is obtained by applying the function entry by entry to the diagonal.

For example,

$$e^D=\begin{bmatrix} e^{\lambda_1} & 0 & \cdots & 0 \\ 0 & e^{\lambda_2} & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & e^{\lambda_n} \end{bmatrix}.$$

This diagonal rule is the model for the general theory.

72.4 Functions of Diagonalizable Matrices

Suppose $A$ is diagonalizable:

$$A=PDP^{-1}.$$

Then define

$$f(A)=Pf(D)P^{-1}.$$

This definition says: change to an eigenvector basis, apply $f$ to each eigenvalue, then change back.

If

$$D=\operatorname{diag}(\lambda_1,\ldots,\lambda_n),$$

then

$$f(D)=\operatorname{diag}(f(\lambda_1),\ldots,f(\lambda_n)).$$

Therefore

$$f(A)=P\,\operatorname{diag}(f(\lambda_1),\ldots,f(\lambda_n))\,P^{-1}.$$

For polynomial functions, this agrees with direct polynomial evaluation. If $A=PDP^{-1}$, then $p(A)=Pp(D)P^{-1}$, and $p(D)$ is obtained by applying $p$ to the diagonal entries.
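
The definition translates directly into a small helper. This is a sketch only: the name `fun_of_matrix` is illustrative, and it assumes $A$ is diagonalizable and $f$ is defined on its spectrum.

```python
import numpy as np

def fun_of_matrix(A, f):
    """f(A) = P f(D) P^{-1} for a diagonalizable matrix A (sketch)."""
    eigvals, P = np.linalg.eig(A)
    fD = np.diag(f(eigvals))           # apply f to each eigenvalue
    return P @ fD @ np.linalg.inv(P)

A = np.array([[2.0, 1.0],
              [1.0, 2.0]])
print(fun_of_matrix(A, np.exp))        # e^A via the spectral definition
```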

72.5 Example: A Matrix Square

Let

$$A=\begin{bmatrix} 2 & 1 \\ 1 & 2 \end{bmatrix}.$$

This matrix has eigenvalues

$$3 \qquad \text{and} \qquad 1.$$

One diagonalization is

$$A=PDP^{-1},$$

where

$$P=\begin{bmatrix} 1 & 1 \\ 1 & -1 \end{bmatrix}, \qquad D=\begin{bmatrix} 3 & 0 \\ 0 & 1 \end{bmatrix}.$$

Then

$$A^k=PD^kP^{-1}.$$

Since

$$D^k=\begin{bmatrix} 3^k & 0 \\ 0 & 1 \end{bmatrix},$$

we get

$$A^k=P\begin{bmatrix} 3^k & 0 \\ 0 & 1 \end{bmatrix}P^{-1}.$$

Thus powers of $A$ reduce to powers of its eigenvalues.
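
A quick numerical confirmation of this worked example (NumPy sketch):

```python
import numpy as np

P = np.array([[1.0, 1.0],
              [1.0, -1.0]])
D = np.diag([3.0, 1.0])
A = P @ D @ np.linalg.inv(P)           # reconstructs [[2, 1], [1, 2]]

k = 4
Ak = P @ np.diag([3.0 ** k, 1.0]) @ np.linalg.inv(P)
print(np.allclose(Ak, np.linalg.matrix_power(A, k)))   # True
```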

72.6 Matrix Exponential

The matrix exponential is one of the most important matrix functions.

For a square matrix $A$, define

$$e^A = \sum_{k=0}^{\infty}\frac{A^k}{k!}.$$

That is,

$$e^A = I+A+\frac{A^2}{2!}+\frac{A^3}{3!}+\cdots.$$

This series converges for every real or complex square matrix, so the matrix exponential is always well-defined.

If $A$ is diagonalizable and

$$A=PDP^{-1},$$

then

$$e^A=Pe^DP^{-1}.$$

If

$$D=\operatorname{diag}(\lambda_1,\ldots,\lambda_n),$$

then

$$e^D=\operatorname{diag}(e^{\lambda_1},\ldots,e^{\lambda_n}).$$
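
In practice one does not sum the series naively; SciPy's `expm` implements a scaling-and-squaring algorithm. A sketch comparing it with a truncated Taylor series (the truncation is for illustration only, not a robust algorithm):

```python
import numpy as np
from scipy.linalg import expm

A = np.array([[0.0, 1.0],
              [-1.0, 0.0]])            # arbitrary example

# Truncated power series: I + A + A^2/2! + ...
series = np.zeros_like(A)
term = np.eye(2)
for k in range(1, 20):
    series += term                     # add A^{k-1}/(k-1)!
    term = term @ A / k                # next term A^k/k!

print(np.allclose(series, expm(A)))    # True to working precision
```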

72.7 Matrix Exponential and Differential Equations

The matrix exponential solves constant-coefficient linear systems.

Consider

$$x'(t)=Ax(t),$$

with initial condition

$$x(0)=x_0.$$

The solution is

$$x(t)=e^{tA}x_0.$$

This is the matrix analogue of the scalar equation

$$x'(t)=ax(t),$$

whose solution is

$$x(t)=e^{at}x(0).$$

The matrix exponential is therefore the natural evolution operator, or solution operator, for constant-coefficient linear systems.
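
A sketch of the exponential as evolution operator: propagate an arbitrary initial state and verify by finite differences that the trajectory satisfies $x'(t)=Ax(t)$. The matrix, state, and tolerances here are example values.

```python
import numpy as np
from scipy.linalg import expm

A = np.array([[0.0, 1.0],
              [-2.0, -3.0]])           # arbitrary stable example
x0 = np.array([1.0, 0.0])

t, h = 0.5, 1e-6
x = lambda s: expm(s * A) @ x0         # x(s) = e^{sA} x0

# Central-difference approximation of x'(t) should match A x(t).
deriv = (x(t + h) - x(t - h)) / (2 * h)
print(np.allclose(deriv, A @ x(t), atol=1e-6))   # True
```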

72.8 Exponential of a Diagonalizable Matrix

Let

$$A=PDP^{-1},$$

where

$$D=\operatorname{diag}(\lambda_1,\ldots,\lambda_n).$$

Then

$$e^{tA}=Pe^{tD}P^{-1}.$$

Since

$$e^{tD} = \operatorname{diag}(e^{t\lambda_1},\ldots,e^{t\lambda_n}),$$

we have

$$e^{tA} = P\,\operatorname{diag}(e^{t\lambda_1},\ldots,e^{t\lambda_n})\,P^{-1}.$$

This formula separates the solution into independent modes. Each eigenvalue contributes a scalar exponential factor.

If $\operatorname{Re}(\lambda_i)<0$, that mode decays.

If $\operatorname{Re}(\lambda_i)>0$, that mode grows.

If $\operatorname{Re}(\lambda_i)=0$, that mode persists or oscillates.

72.9 Exponential of a Nilpotent Matrix

A matrix $N$ is nilpotent if

$$N^m=0$$

for some positive integer $m$.

For such a matrix, the exponential series terminates:

$$e^N = I+N+\frac{N^2}{2!}+\cdots+\frac{N^{m-1}}{(m-1)!}.$$

All higher powers vanish.

For example, let

$$N=\begin{bmatrix} 0 & 1 \\ 0 & 0 \end{bmatrix}.$$

Then

$$N^2=0.$$

Therefore

$$e^N=I+N=\begin{bmatrix} 1 & 1 \\ 0 & 1 \end{bmatrix}.$$

Nilpotent matrices explain the polynomial factors that occur in exponentials of Jordan blocks.
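
A direct check of the terminating series (NumPy/SciPy sketch):

```python
import numpy as np
from scipy.linalg import expm

N = np.array([[0.0, 1.0],
              [0.0, 0.0]])

print(np.allclose(N @ N, 0))                   # True: N^2 = 0
print(np.allclose(expm(N), np.eye(2) + N))     # True: e^N = I + N
```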

72.10 Functions of Jordan Blocks

Let

$$J=\lambda I+N$$

be a Jordan block, where $N$ is nilpotent.

For a polynomial or analytic function $f$, the function of $J$ is obtained from the finite Taylor expansion

$$f(J) = f(\lambda)I + f'(\lambda)N + \frac{f''(\lambda)}{2!}N^2 + \cdots + \frac{f^{(k-1)}(\lambda)}{(k-1)!}N^{k-1}.$$

Here $k$ is the size of the Jordan block.

For the exponential function,

$$e^J=e^{\lambda I+N}.$$

Since $\lambda I$ commutes with $N$,

$$e^J=e^\lambda e^N.$$

This is the standard way to compute matrix functions through the Jordan form: for a Jordan block, $f(J)$ involves derivatives of the scalar function at the eigenvalue.
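
A sketch evaluating the Taylor formula for a $3\times 3$ Jordan block with $f=\exp$, where every derivative of $f$ at $\lambda$ is again $e^\lambda$, and comparing against `expm` (the eigenvalue and block size are example values):

```python
import numpy as np
from math import factorial
from scipy.linalg import expm

lam, k = 2.0, 3                                     # example eigenvalue and size
J = lam * np.eye(k) + np.diag(np.ones(k - 1), 1)    # J = lambda*I + N
N = J - lam * np.eye(k)

# f(J) = sum_{j<k} f^{(j)}(lam)/j! * N^j, with f = exp so f^{(j)}(lam) = e^lam.
fJ = sum(np.exp(lam) / factorial(j) * np.linalg.matrix_power(N, j)
         for j in range(k))
print(np.allclose(fJ, expm(J)))                     # True
```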

72.11 Matrix Functions and the Minimal Polynomial

The minimal polynomial controls matrix functions algebraically.

Suppose the minimal polynomial $m_A(t)$ has degree $r$. Every polynomial in $A$ can be reduced modulo $m_A(t)$ to a polynomial of degree less than $r$.

If

$$p(t)=q(t)m_A(t)+s(t),$$

where

$$\deg s<\deg m_A,$$

then

$$p(A)=s(A),$$

because

$$m_A(A)=0.$$

Thus polynomial functions of $A$ live in the finite-dimensional algebra

$$F[A]=\operatorname{span}\{I,A,A^2,\ldots,A^{r-1}\}.$$

The Cayley-Hamilton theorem, via the characteristic polynomial, gives the weaker but always available reduction to degree less than $n$.
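
For instance, the matrix from 72.5 has eigenvalues 3 and 1, so its minimal polynomial is $(t-3)(t-1)=t^2-4t+3$ and every power of $A$ collapses to a linear combination of $I$ and $A$. A quick check:

```python
import numpy as np

A = np.array([[2.0, 1.0],
              [1.0, 2.0]])              # eigenvalues 3 and 1

# Minimal polynomial m_A(t) = t^2 - 4t + 3, so A^2 = 4A - 3I.
print(np.allclose(A @ A, 4 * A - 3 * np.eye(2)))   # True
```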

72.12 Interpolation Definition

For many functions, $f(A)$ can be described by polynomial interpolation.

If $A$ is diagonalizable with distinct eigenvalues

$$\lambda_1,\ldots,\lambda_k,$$

choose a polynomial $p(t)$ such that

$$p(\lambda_i)=f(\lambda_i)$$

for every $i$.

Then define

$$f(A)=p(A).$$

This is well-defined: any two such polynomials differ by a polynomial that vanishes at all eigenvalues, and hence annihilates $A$ when $A$ is diagonalizable with those eigenvalues.

For non-diagonalizable matrices, interpolation must also match derivatives up to the sizes of the Jordan blocks. This is Hermite interpolation.
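
A sketch of the interpolation definition for distinct eigenvalues, building the Lagrange form $p(A)=\sum_i f(\lambda_i)\prod_{j\ne i}(A-\lambda_j I)/(\lambda_i-\lambda_j)$ with $f=\exp$ and comparing against `expm`:

```python
import numpy as np
from scipy.linalg import expm

A = np.array([[2.0, 1.0],
              [1.0, 2.0]])
lams = np.linalg.eigvals(A)            # assumed distinct (one per dimension here)
n = len(lams)

# Lagrange interpolation of f = exp at the eigenvalues, evaluated at A.
fA = np.zeros_like(A)
for i in range(n):
    term = np.exp(lams[i]) * np.eye(n)
    for j in range(n):
        if j != i:
            term = term @ (A - lams[j] * np.eye(n)) / (lams[i] - lams[j])
    fA += term

print(np.allclose(fA, expm(A)))        # True
```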

72.13 Matrix Square Roots

A matrix $B$ is a square root of $A$ if

$$B^2=A.$$

We write

$$B=A^{1/2}$$

when a particular square root is chosen.

If $A$ is diagonalizable and

$$A=PDP^{-1},$$

then a square root may be constructed by taking square roots of the eigenvalues:

$$A^{1/2} = P D^{1/2} P^{-1},$$

where

$$D^{1/2} = \operatorname{diag}(\sqrt{\lambda_1},\ldots,\sqrt{\lambda_n}).$$

This requires choosing a square root of each eigenvalue; different choices give different square roots of $A$.

For symmetric positive semidefinite matrices, there is a unique symmetric positive semidefinite square root. This case is especially important in statistics, covariance matrices, optimization, and numerical analysis.
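
For a symmetric positive semidefinite matrix, the unique symmetric PSD square root can be computed with an orthogonal eigendecomposition; `scipy.linalg.sqrtm` handles the general case. A sketch with an arbitrary positive definite example:

```python
import numpy as np
from scipy.linalg import sqrtm

A = np.array([[2.0, 1.0],
              [1.0, 2.0]])              # symmetric positive definite example

# Symmetric PSD square root via the spectral theorem.
w, Q = np.linalg.eigh(A)                # A = Q diag(w) Q^T, Q orthogonal
root = Q @ np.diag(np.sqrt(w)) @ Q.T

print(np.allclose(root @ root, A))      # True
print(np.allclose(root, sqrtm(A)))      # matches the general-purpose sqrtm
```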

72.14 Matrix Logarithm

A matrix logarithm of $A$ is a matrix $B$ such that

$$e^B=A.$$

We write

$$B=\log A$$

when a particular branch is chosen.

The matrix logarithm is more delicate than the exponential. It may not be unique, and existence depends on spectral conditions and the field; over $\mathbb{C}$, a logarithm exists precisely when $A$ is invertible.

If $A$ is diagonalizable and

$$A=PDP^{-1},$$

then one may define

$$\log A=P(\log D)P^{-1},$$

where

$$\log D= \operatorname{diag}(\log \lambda_1,\ldots,\log \lambda_n).$$

This requires choosing branches of the scalar logarithm for the eigenvalues.

The logarithm is important in Lie theory, differential equations, geometric integration, and matrix means.
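
A sketch using SciPy's `logm`, which computes a principal logarithm, together with the round trip $e^{\log A}=A$:

```python
import numpy as np
from scipy.linalg import expm, logm

A = np.array([[2.0, 1.0],
              [1.0, 2.0]])              # positive definite, so a real log exists

L = logm(A)                             # principal matrix logarithm
print(np.allclose(expm(L), A))          # True: e^{log A} = A
```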

72.15 Trigonometric Matrix Functions

Trigonometric functions can also be defined by power series.

For a square matrix $A$,

$$\cos A = I-\frac{A^2}{2!}+\frac{A^4}{4!}-\cdots,$$

and

$$\sin A = A-\frac{A^3}{3!}+\frac{A^5}{5!}-\cdots.$$

These definitions parallel the scalar power series.

Matrix trigonometric functions appear in oscillatory systems, wave equations, rotations, and second-order differential equations.

A general theory of matrix functions covers the exponential, logarithm, square root, and trigonometric functions in a single framework, through power series or spectral definitions.
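
SciPy provides `cosm` and `sinm`; a sketch verifying the identity $\cos^2 A+\sin^2 A=I$, which survives in matrix form because $A$ commutes with itself:

```python
import numpy as np
from scipy.linalg import cosm, sinm

A = np.array([[0.0, 1.0],
              [-1.0, 0.0]])             # arbitrary example

C, S = cosm(A), sinm(A)
print(np.allclose(C @ C + S @ S, np.eye(2)))   # True
```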

72.16 Commuting Matrices

For scalars,

$$e^{x+y}=e^xe^y.$$

For matrices, this identity requires commutation.

If

$$AB=BA,$$

then

$$e^{A+B}=e^Ae^B.$$

If $A$ and $B$ do not commute, this equality may fail. This is one of the main differences between scalar functions and matrix functions: the identity $e^{A+B}=e^Ae^B$ holds for commuting matrices, but not in general.

Similarly, many scalar identities become conditional in matrix algebra because multiplication is not commutative.
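
A sketch of a noncommuting counterexample:

```python
import numpy as np
from scipy.linalg import expm

A = np.array([[0.0, 1.0],
              [0.0, 0.0]])
B = np.array([[0.0, 0.0],
              [1.0, 0.0]])

print(np.allclose(A @ B, B @ A))                    # False: A, B do not commute
print(np.allclose(expm(A + B), expm(A) @ expm(B)))  # False: identity fails
```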

72.17 Functions Preserve Similarity

Matrix functions are compatible with similarity.

If

$$B=P^{-1}AP,$$

then for any polynomial $p$,

$$p(B)=P^{-1}p(A)P.$$

The same relation holds for matrix functions defined by power series or spectral calculus:

$$f(B)=P^{-1}f(A)P.$$

Thus similar matrices have similar matrix functions.

This is essential because a matrix function should describe the underlying linear transformation, not the accidental choice of basis.

72.18 Spectral Mapping

For many standard matrix functions, eigenvalues transform according to the scalar function.

If

$$Av=\lambda v,$$

then for a polynomial $p$,

$$p(A)v=p(\lambda)v.$$

Thus $v$ is also an eigenvector of $p(A)$, with eigenvalue $p(\lambda)$.

More generally, when $f(A)$ is defined through an appropriate functional calculus, the eigenvalues of $f(A)$ are

$$f(\lambda),$$

where $\lambda$ ranges over the eigenvalues of $A$, with multiplicities handled according to the chosen setting.

This principle is called spectral mapping.
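
A sketch of the spectral mapping check for $f=\exp$: the eigenvalues of $e^A$ should be the exponentials of the eigenvalues of $A$ (the matrix is an arbitrary example):

```python
import numpy as np
from scipy.linalg import expm

A = np.array([[2.0, 1.0],
              [0.0, 3.0]])              # example with eigenvalues 2 and 3

eigs_fA = np.sort(np.linalg.eigvals(expm(A)))
f_eigs = np.sort(np.exp(np.linalg.eigvals(A)))
print(np.allclose(eigs_fA, f_eigs))     # True
```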

72.19 Matrix Functions of Normal Matrices

If $A$ is normal over $\mathbb{C}$, then

$$A=UDU^*$$

with $U$ unitary and $D$ diagonal.

For such matrices,

$$f(A)=Uf(D)U^*.$$

This is the cleanest setting for matrix functions. The unitary matrix $U$ preserves norms and inner products, so the computation is stable and geometrically transparent.

Hermitian matrices, unitary matrices, and real symmetric matrices are important special cases.

For a Hermitian matrix, if $f$ is real-valued on the spectrum, then $f(A)$ is Hermitian.

For a positive definite Hermitian matrix, functions such as

$$A^{1/2}, \qquad \log A, \qquad A^\alpha$$

are especially well behaved.
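
For Hermitian (or real symmetric) matrices, `eigh` gives the unitary diagonalization directly, and the functional calculus is numerically stable. A sketch with an illustrative helper name, applying a real-valued $f$ and confirming the result is symmetric:

```python
import numpy as np

def fun_of_symmetric(A, f):
    """f(A) = U f(D) U^T for a real symmetric matrix A (sketch)."""
    w, U = np.linalg.eigh(A)            # orthogonal U, real eigenvalues w
    return U @ np.diag(f(w)) @ U.T

A = np.array([[2.0, 1.0],
              [1.0, 2.0]])
fA = fun_of_symmetric(A, np.exp)
print(np.allclose(fA, fA.T))            # True: f(A) is symmetric
```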

72.20 Numerical Computation

Computing matrix functions numerically is a separate problem from defining them.

For small diagonalizable matrices, an eigenvalue decomposition may be convenient. For symmetric or Hermitian matrices, unitary diagonalization is often stable.

For general matrices, eigenvalue methods may be unstable, especially near repeated eigenvalues or defective matrices. Practical algorithms often use Schur decomposition, scaling and squaring, Padé approximants, Krylov methods, or specialized iterative methods.

For the matrix exponential, common numerical approaches include scaling and squaring with Padé approximation, Taylor methods, and methods based on Schur decomposition. General-purpose methods vary in stability depending on the matrix class.
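
SciPy's `funm` implements a Schur-based general-purpose evaluation; a sketch comparing it with the specialized `expm` on an arbitrary example:

```python
import numpy as np
from scipy.linalg import expm, funm

A = np.array([[1.0, 2.0],
              [0.0, 3.0]])              # arbitrary example

# funm evaluates f(A) via a Schur decomposition; expm uses a specialized
# scaling-and-squaring algorithm. For well-behaved matrices they agree.
print(np.allclose(funm(A, np.exp), expm(A)))   # True
```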

72.21 Applications

Matrix functions appear whenever a scalar transformation must act on a linear operator.

| Function | Typical use |
| --- | --- |
| $A^k$ | Discrete dynamics, Markov chains |
| $e^{tA}$ | Linear differential equations |
| $A^{1/2}$ | Covariance matrices, positive definite geometry |
| $\log A$ | Lie groups, matrix means, geometric integration |
| $\sin A,\ \cos A$ | Oscillatory systems |
| $(A+\alpha I)^{-1}$ | Regularization, resolvents |
| $f(L)$ for graph Laplacian $L$ | Graph filters, diffusion, spectral methods |

The common idea is that the matrix represents a linear transformation, and the function changes its spectral behavior.

72.22 Common Errors

The first common error is to apply a scalar function entrywise to a general matrix. For example, $e^A$ does not mean exponentiating each entry of $A$; entrywise exponentiation is a different operation.
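
A sketch making the distinction concrete: `np.exp` is entrywise, while `scipy.linalg.expm` is the matrix exponential, and the two disagree already for a simple nilpotent matrix:

```python
import numpy as np
from scipy.linalg import expm

N = np.array([[0.0, 1.0],
              [0.0, 0.0]])

print(np.exp(N))   # entrywise: [[1, e], [1, 1]] -- not the matrix exponential
print(expm(N))     # matrix exponential: I + N = [[1, 1], [0, 1]]
```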

The second common error is to assume scalar identities always hold. Matrix multiplication is noncommutative, so identities such as

$$e^{A+B}=e^Ae^B$$

require that the matrices commute.

The third common error is to ignore the spectrum. Functions such as $\log A$ and $A^{1/2}$ depend on eigenvalues and branch choices.

The fourth common error is to diagonalize numerically without checking conditioning. A matrix may be diagonalizable in theory but poorly conditioned in practice.

The fifth common error is to confuse the characteristic polynomial with the minimal polynomial. The minimal polynomial gives the sharper algebraic reduction for functions of $A$.

72.23 Summary

A matrix function applies a scalar function to a square matrix.

For polynomials,

$$p(A)=a_0I+a_1A+\cdots+a_kA^k.$$

For diagonalizable matrices,

$$A=PDP^{-1}$$

gives

$$f(A)=Pf(D)P^{-1}.$$

For Jordan blocks, derivatives of $f$ enter through the nilpotent part.

The matrix exponential is defined by

$$e^A=\sum_{k=0}^{\infty}\frac{A^k}{k!},$$

and it solves linear systems of differential equations.

Matrix functions generalize powers, exponentials, square roots, logarithms, and trigonometric functions to linear transformations. They are controlled by eigenvalues, minimal polynomials, Jordan structure, and similarity.