# Chapter 79. Schur Decomposition

The Schur decomposition expresses a square matrix in triangular form through an orthogonal or unitary change of basis. It is one of the main tools behind modern eigenvalue algorithms.

For a complex square matrix \(A\), the Schur decomposition has the form

$$
A = QTQ^*,
$$

where \(Q\) is unitary and \(T\) is upper triangular. Equivalently,

$$
Q^*AQ = T.
$$

The diagonal entries of \(T\) are the eigenvalues of \(A\), because \(T\) is triangular and similar to \(A\). Every complex square matrix has such a decomposition. For a real square matrix, the real Schur form uses an orthogonal matrix and a real quasi-triangular matrix with \(1 \times 1\) and \(2 \times 2\) diagonal blocks.

## 79.1 Unitary Similarity

Two square matrices \(A\) and \(B\) are similar if there is an invertible matrix \(S\) such that

$$
B = S^{-1}AS.
$$

Similarity represents a change of basis. Similar matrices describe the same linear transformation in different coordinate systems. They have the same eigenvalues, determinant, trace, and characteristic polynomial.

Schur decomposition uses a special kind of similarity. It chooses \(S\) to be unitary. A matrix \(Q\) is unitary if

$$
Q^*Q = I.
$$

For real matrices, the analogous condition is orthogonality:

$$
Q^TQ = I.
$$

Unitary and orthogonal changes of basis preserve lengths and angles. This makes Schur decomposition especially important in numerical computation: it transforms a matrix without the instability that an arbitrary, possibly ill-conditioned, change of basis can introduce.

## 79.2 The Complex Schur Form

Let

$$
A \in \mathbb{C}^{n \times n}.
$$

The complex Schur theorem states that there exists a unitary matrix \(Q\) and an upper triangular matrix \(T\) such that

$$
Q^*AQ = T.
$$

Equivalently,

$$
A = QTQ^*.
$$

The matrix \(T\) has the form

$$
T =
\begin{bmatrix}
\lambda_1 & t_{12} & t_{13} & \cdots & t_{1n} \\
0 & \lambda_2 & t_{23} & \cdots & t_{2n} \\
0 & 0 & \lambda_3 & \cdots & t_{3n} \\
\vdots & \vdots & \vdots & \ddots & \vdots \\
0 & 0 & 0 & \cdots & \lambda_n
\end{bmatrix}.
$$

The diagonal entries

$$
\lambda_1,\lambda_2,\ldots,\lambda_n
$$

are the eigenvalues of \(A\), counted with algebraic multiplicity.

The entries above the diagonal describe additional coupling between Schur vectors. If \(T\) is diagonal, the matrix is unitarily diagonalizable. If \(T\) has nonzero entries above the diagonal, the matrix has a triangular representation but not necessarily an orthonormal eigenbasis.
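These properties are easy to check numerically. The sketch below uses SciPy's `scipy.linalg.schur` on a random complex matrix; the function and its `output='complex'` option are real SciPy features, but the matrix itself is just an arbitrary example:

```python
import numpy as np
from scipy.linalg import schur

rng = np.random.default_rng(0)
A = rng.standard_normal((4, 4)) + 1j * rng.standard_normal((4, 4))

# complex Schur form: A = Q T Q^*, Q unitary, T upper triangular
T, Q = schur(A, output='complex')

assert np.allclose(Q.conj().T @ Q, np.eye(4))   # Q is unitary
assert np.allclose(np.tril(T, -1), 0)           # T is upper triangular
assert np.allclose(Q @ T @ Q.conj().T, A)       # A = Q T Q^*

# the diagonal of T carries the eigenvalues of A (up to ordering)
assert np.allclose(np.sort_complex(np.diag(T)),
                   np.sort_complex(np.linalg.eigvals(A)))
```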

## 79.3 Schur Vectors

The columns of \(Q\) are called Schur vectors.

Write

$$
Q =
\begin{bmatrix}
q_1 & q_2 & \cdots & q_n
\end{bmatrix}.
$$

Because \(Q\) is unitary, these vectors form an orthonormal basis of \(\mathbb{C}^n\). The equation

$$
AQ = QT
$$

shows how \(A\) acts on this basis.

The first column gives

$$
Aq_1 = \lambda_1 q_1.
$$

Thus \(q_1\) is an eigenvector corresponding to \(\lambda_1\).

The second column gives

$$
Aq_2 = t_{12}q_1 + \lambda_2 q_2.
$$

The third column gives

$$
Aq_3 = t_{13}q_1 + t_{23}q_2 + \lambda_3 q_3.
$$

In general, \(Aq_j\) lies in the span of

$$
q_1,\ldots,q_j.
$$

Thus the Schur basis exposes a nested sequence of invariant subspaces.
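The column relations above can be verified directly: in the Schur basis, the coordinates of \(Aq_j\) are exactly column \(j\) of \(T\), so every coordinate below position \(j\) vanishes. A short check with `scipy.linalg.schur` on a random example matrix:

```python
import numpy as np
from scipy.linalg import schur

rng = np.random.default_rng(1)
A = rng.standard_normal((5, 5))
T, Q = schur(A, output='complex')

# the coordinates of A q_j in the Schur basis are Q^* A q_j,
# which is column j of T, so entries below position j are zero
for j in range(5):
    coords = Q.conj().T @ (A @ Q[:, j])
    assert np.allclose(coords, T[:, j])
    assert np.allclose(coords[j + 1:], 0)
```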

## 79.4 Invariant Subspaces

Let

$$
\mathcal{S}_k = \operatorname{span}(q_1,\ldots,q_k).
$$

Since \(T\) is upper triangular, the action of \(A\) maps each \(\mathcal{S}_k\) into itself:

$$
A\mathcal{S}_k \subseteq \mathcal{S}_k.
$$

Thus Schur decomposition produces a chain

$$
\{0\}
\subset
\mathcal{S}_1
\subset
\mathcal{S}_2
\subset
\cdots
\subset
\mathcal{S}_n =
\mathbb{C}^n.
$$

Each subspace in this chain is invariant under \(A\).

This is an important structural fact. It says that even when a matrix lacks a full set of orthogonal eigenvectors, it still admits an orthonormal basis in which its action is triangular.

## 79.5 Proof Idea

The proof of Schur decomposition uses induction on the dimension.

Every complex square matrix has at least one eigenvalue. Let \(\lambda_1\) be an eigenvalue of \(A\), and choose a corresponding unit eigenvector \(q_1\). Extend \(q_1\) to an orthonormal basis of \(\mathbb{C}^n\). Let

$$
Q_1 =
\begin{bmatrix}
q_1 & q_2 & \cdots & q_n
\end{bmatrix}
$$

be the unitary matrix whose columns are this basis.

Then

$$
Q_1^*AQ_1
$$

has the block form

$$
\begin{bmatrix}
\lambda_1 & * \\
0 & A_1
\end{bmatrix}.
$$

The lower-left block is zero because \(Aq_1=\lambda_1q_1\). Now apply the same argument to the smaller matrix \(A_1\). Repeating this process produces an upper triangular matrix.

The key reason the argument works over \(\mathbb{C}\) is that every complex polynomial of positive degree has a root. Therefore every complex square matrix has an eigenvalue.
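The induction can be written as a short recursive program. The sketch below is numerically naive (it calls `np.linalg.eig` at every level, and it completes \(q_1\) to an orthonormal basis by running QR on random fill-in columns), but it mirrors the proof step for step:

```python
import numpy as np

def schur_by_induction(A, rng=None):
    """Unitary triangularization following the inductive proof."""
    if rng is None:
        rng = np.random.default_rng(0)
    A = np.asarray(A, dtype=complex)
    n = A.shape[0]
    if n == 1:
        return np.eye(1, dtype=complex), A
    # pick one eigenpair (lambda_1, q_1)
    _, V = np.linalg.eig(A)
    q1 = V[:, 0]
    # complete q_1 to an orthonormal basis: QR on [q_1 | random columns]
    M = np.column_stack([q1, rng.standard_normal((n, n - 1))])
    Q1, _ = np.linalg.qr(M)
    # block form [[lambda_1, *], [0, A_1]]; lower-left ~ 0 by construction
    B = Q1.conj().T @ A @ Q1
    # recurse on the trailing (n-1) x (n-1) block
    Q_sub, T_sub = schur_by_induction(B[1:, 1:], rng)
    U = np.eye(n, dtype=complex)
    U[1:, 1:] = Q_sub
    T = np.zeros((n, n), dtype=complex)
    T[0, 0] = B[0, 0]
    T[0, 1:] = B[0, 1:] @ Q_sub
    T[1:, 1:] = T_sub
    return Q1 @ U, T

A = np.array([[1., 3.], [-3., 7.]])
Q, T = schur_by_induction(A)
assert np.allclose(Q @ T @ Q.conj().T, A)   # A = Q T Q^*
assert np.allclose(np.tril(T, -1), 0)       # T is upper triangular
```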

## 79.6 A Two by Two Example

Let

$$
A =
\begin{bmatrix}
1 & 3 \\
-3 & 7
\end{bmatrix}.
$$

This matrix has characteristic polynomial

$$
\det(A-\lambda I) =
\det
\begin{bmatrix}
1-\lambda & 3 \\
-3 & 7-\lambda
\end{bmatrix}.
$$

Thus

$$
\det(A-\lambda I) =
(1-\lambda)(7-\lambda)+9.
$$

Expanding,

$$
(1-\lambda)(7-\lambda)+9 =
\lambda^2 - 8\lambda + 16 =
(\lambda-4)^2.
$$

The only eigenvalue is

$$
\lambda = 4.
$$

An eigenvector satisfies

$$
(A-4I)x=0.
$$

Since

$$
A-4I =
\begin{bmatrix}
-3 & 3 \\
-3 & 3
\end{bmatrix},
$$

we may take

$$
q_1 =
\frac{1}{\sqrt{2}}
\begin{bmatrix}
1 \\
1
\end{bmatrix}.
$$

Choose a unit vector orthogonal to \(q_1\):

$$
q_2 =
\frac{1}{\sqrt{2}}
\begin{bmatrix}
-1 \\
1
\end{bmatrix}.
$$

Then

$$
Q =
\frac{1}{\sqrt{2}}
\begin{bmatrix}
1 & -1 \\
1 & 1
\end{bmatrix}.
$$

Compute

$$
T = Q^TAQ.
$$

First,

$$
AQ =
\frac{1}{\sqrt{2}}
\begin{bmatrix}
1 & 3 \\
-3 & 7
\end{bmatrix}
\begin{bmatrix}
1 & -1 \\
1 & 1
\end{bmatrix} =
\frac{1}{\sqrt{2}}
\begin{bmatrix}
4 & 2 \\
4 & 10
\end{bmatrix}.
$$

Then

$$
Q^TAQ =
\frac{1}{2}
\begin{bmatrix}
1 & 1 \\
-1 & 1
\end{bmatrix}
\begin{bmatrix}
4 & 2 \\
4 & 10
\end{bmatrix} =
\begin{bmatrix}
4 & 6 \\
0 & 4
\end{bmatrix}.
$$

Thus

$$
A = QTQ^T,
$$

where

$$
T =
\begin{bmatrix}
4 & 6 \\
0 & 4
\end{bmatrix}.
$$

This is a Schur decomposition. The matrix is triangular but not diagonal in this orthonormal basis.
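The computation above can be replayed in a few lines as a plain numerical check of the worked example:

```python
import numpy as np

A = np.array([[1., 3.], [-3., 7.]])
Q = np.array([[1., -1.], [1., 1.]]) / np.sqrt(2)

T = Q.T @ A @ Q
assert np.allclose(T, [[4., 6.], [0., 4.]])   # the triangular factor found above
assert np.allclose(Q @ T @ Q.T, A)            # A = Q T Q^T
assert np.allclose(Q.T @ Q, np.eye(2))        # Q is orthogonal
```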

## 79.7 Normal Matrices

A complex matrix \(A\) is normal if

$$
A^*A = AA^*.
$$

Normal matrices include Hermitian, real symmetric, skew-Hermitian, unitary, and orthogonal matrices.

Schur decomposition gives a simple route to the spectral theorem for normal matrices. Suppose

$$
A = QTQ^*.
$$

If \(A\) is normal, then \(T\) is also normal, because unitary similarity preserves normality. But an upper triangular normal matrix must be diagonal. Therefore

$$
T =
\operatorname{diag}(\lambda_1,\ldots,\lambda_n).
$$

Hence

$$
A =
Q
\operatorname{diag}(\lambda_1,\ldots,\lambda_n)
Q^*.
$$

Thus every normal matrix is unitarily diagonalizable. Schur decomposition therefore extends the spectral theorem: all complex square matrices can be unitarily triangularized, while normal matrices can be unitarily diagonalized.
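A numerical illustration, using a random Hermitian matrix as the normal example (the diagonality of \(T\) holds up to roundoff, since `scipy.linalg.schur` does not exploit symmetry):

```python
import numpy as np
from scipy.linalg import schur

rng = np.random.default_rng(3)
B = rng.standard_normal((4, 4)) + 1j * rng.standard_normal((4, 4))
A = B + B.conj().T                  # Hermitian, hence normal

T, Q = schur(A, output='complex')

# for a normal matrix the triangular factor is (numerically) diagonal ...
assert np.allclose(T - np.diag(np.diag(T)), 0, atol=1e-8)
# ... so the Schur form is a unitary diagonalization (spectral theorem)
assert np.allclose(Q @ np.diag(np.diag(T)) @ Q.conj().T, A)
```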

## 79.8 Relation to Diagonalization

Diagonalization seeks

$$
A = X\Lambda X^{-1},
$$

where \(\Lambda\) is diagonal.

Schur decomposition seeks

$$
A = QTQ^*,
$$

where \(T\) is upper triangular.

The differences are important.

| Feature | Diagonalization | Schur decomposition |
|---|---|---|
| Form | \(A=X\Lambda X^{-1}\) | \(A=QTQ^*\) |
| Middle factor | Diagonal | Upper triangular |
| Basis | Eigenbasis | Orthonormal Schur basis |
| Exists for every complex square matrix | No | Yes |
| Change of basis | Invertible | Unitary |
| Numerical behavior | May be ill-conditioned | Usually stable |

Diagonalization gives a simpler middle factor, but it may not exist. Schur decomposition always exists over \(\mathbb{C}\), and it uses a well-conditioned change of basis.

## 79.9 Defective Matrices

A matrix is defective if it does not have enough linearly independent eigenvectors to be diagonalized.

For example,

$$
A =
\begin{bmatrix}
1 & 1 \\
0 & 1
\end{bmatrix}
$$

has \(1\) as its only eigenvalue. The corresponding eigenspace is one-dimensional, so the matrix cannot be diagonalized.

However, it is already in Schur form:

$$
A =
\begin{bmatrix}
1 & 1 \\
0 & 1
\end{bmatrix}.
$$

Taking

$$
Q=I,
\qquad
T=A,
$$

we have

$$
A = QTQ^*.
$$

Thus Schur decomposition still applies to defective matrices. It replaces the impossible goal of diagonalization by the always possible goal of triangularization.
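This contrast is easy to see numerically. In the sketch below, `np.linalg.eig` returns two nearly parallel eigenvectors for the defective matrix, so the would-be eigenvector matrix is numerically singular, while the Schur factorization goes through without trouble:

```python
import numpy as np
from scipy.linalg import schur

A = np.array([[1., 1.], [0., 1.]])   # defective: eigenspace for 1 is 1-D

# the "eigenvector matrix" is numerically singular, so A = V D V^{-1} fails
_, V = np.linalg.eig(A)
assert abs(np.linalg.det(V)) < 1e-12

# the Schur form exists anyway; here T = A and Q = I already work
T, Q = schur(A)
assert np.allclose(Q @ T @ Q.T, A)
assert np.allclose(np.tril(T, -1), 0)
```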

## 79.10 Real Schur Decomposition

For a real matrix \(A\), complex eigenvalues may occur. Since the complex eigenvalues of a real matrix come in conjugate pairs, a real orthogonal similarity cannot in general reach a fully triangular form: a real triangular matrix would have to carry those complex eigenvalues on its real diagonal.

To stay within real arithmetic, one uses the real Schur decomposition:

$$
A = QTQ^T,
$$

where \(Q\) is real orthogonal and \(T\) is real quasi-upper triangular.

The matrix \(T\) has \(1 \times 1\) and \(2 \times 2\) blocks on its diagonal:

$$
T =
\begin{bmatrix}
T_{11} & T_{12} & \cdots & T_{1k} \\
0 & T_{22} & \cdots & T_{2k} \\
\vdots & \vdots & \ddots & \vdots \\
0 & 0 & \cdots & T_{kk}
\end{bmatrix}.
$$

Each diagonal block is either \(1 \times 1\) or \(2 \times 2\). A \(1 \times 1\) block corresponds to a real eigenvalue. A \(2 \times 2\) block corresponds to a pair of complex conjugate eigenvalues.

## 79.11 A Real Two by Two Rotation

Consider the real rotation matrix

$$
A =
\begin{bmatrix}
0 & -1 \\
1 & 0
\end{bmatrix}.
$$

Its characteristic polynomial is

$$
\lambda^2 + 1,
$$

so its eigenvalues are

$$
i
\quad \text{and} \quad
-i.
$$

Over \(\mathbb{C}\), the matrix has a triangular Schur form with diagonal entries \(i\) and \(-i\). Over \(\mathbb{R}\), it cannot be triangularized into a real upper triangular matrix, because a real triangular matrix has real diagonal entries.

The real Schur form keeps the \(2 \times 2\) block:

$$
T =
\begin{bmatrix}
0 & -1 \\
1 & 0
\end{bmatrix}.
$$

Here the whole matrix is one \(2 \times 2\) block representing the conjugate eigenvalue pair.

## 79.12 Ordering Eigenvalues

The eigenvalues on the diagonal of a Schur form can often be ordered. By applying additional unitary transformations, one can arrange selected eigenvalues in chosen positions along the diagonal.

This is useful when separating stable and unstable modes, slow and fast modes, or eigenvalues inside and outside a region of the complex plane.

For example, in numerical control theory, one may want eigenvalues with negative real part to appear in the leading block. Then the corresponding leading Schur vectors span an invariant subspace associated with stable dynamics.

The ordered Schur form is usually more useful than raw eigenvectors when eigenvalues are clustered or eigenvectors are poorly conditioned.
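`scipy.linalg.schur` exposes this through its `sort` argument; for example `sort='lhp'` moves left-half-plane eigenvalues to the leading block and additionally returns their count. A sketch on a small upper triangular example with two stable eigenvalues:

```python
import numpy as np
from scipy.linalg import schur

# upper triangular example: eigenvalues 2, -1, -3, 5 on the diagonal
A = np.diag([2., -1., -3., 5.]) + np.triu(np.ones((4, 4)), 1)

# sort='lhp' puts eigenvalues with negative real part first;
# sdim counts how many were selected
T, Q, sdim = schur(A, output='real', sort='lhp')
assert sdim == 2
assert np.all(np.diag(T)[:sdim] < 0)     # stable eigenvalues lead
assert np.allclose(Q @ T @ Q.T, A)       # still a similarity of A
```

The leading `sdim` columns of `Q` then span the invariant subspace of the stable modes.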

## 79.13 Schur Form and Eigenvalues

Since \(A\) and \(T\) are similar, they have the same characteristic polynomial:

$$
\det(\lambda I - A) =
\det(\lambda I - T).
$$

Because \(T\) is upper triangular,

$$
\det(\lambda I - T) =
\prod_{i=1}^{n}(\lambda - t_{ii}).
$$

Therefore the eigenvalues of \(A\) are exactly the diagonal entries of \(T\):

$$
\lambda_i = t_{ii}.
$$

This is one of the main reasons Schur decomposition is used in eigenvalue algorithms. It reduces eigenvalue extraction to reading the diagonal of a triangular or quasi-triangular matrix.

## 79.14 Schur Decomposition and the QR Algorithm

The QR algorithm computes eigenvalues by producing a sequence of orthogonally similar matrices.

Starting with

$$
A_0 = A,
$$

one computes

$$
A_k = Q_kR_k
$$

and then forms

$$
A_{k+1} = R_kQ_k.
$$

Since

$$
A_{k+1} =
Q_k^TA_kQ_k,
$$

each step preserves the eigenvalues.

With shifts and practical refinements, the QR algorithm tends toward Schur form. For real matrices, it tends toward real Schur form. For complex matrices, it tends toward complex triangular Schur form. The Schur decomposition is therefore both a theoretical target and a practical endpoint for eigenvalue computation.
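A minimal, unshifted version of the iteration (no Hessenberg reduction, no shifts, so far from production quality) already shows the drift toward triangular form; on a symmetric example with distinct eigenvalues it converges to a diagonal matrix:

```python
import numpy as np

def qr_iterate(A, steps=500):
    """Plain unshifted QR iteration: A_{k+1} = R_k Q_k = Q_k^T A_k Q_k."""
    Ak = A.copy()
    for _ in range(steps):
        Qk, Rk = np.linalg.qr(Ak)
        Ak = Rk @ Qk          # orthogonally similar, same eigenvalues
    return Ak

A = np.array([[4., 1., 0.],
              [1., 3., 1.],
              [0., 1., 2.]])   # symmetric, distinct eigenvalues

Ak = qr_iterate(A)
assert np.allclose(np.tril(Ak, -1), 0, atol=1e-8)   # (here) diagonal limit
assert np.allclose(np.sort(np.diag(Ak)), np.sort(np.linalg.eigvals(A)))
```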

## 79.15 Matrix Functions from Schur Form

Schur decomposition helps define and compute functions of matrices.

Suppose

$$
A = QTQ^*.
$$

For a suitable function \(f\), define

$$
f(A) = Qf(T)Q^*.
$$

Since \(T\) is triangular, computing \(f(T)\) is often easier than computing \(f(A)\) directly. The diagonal entries of \(f(T)\) are

$$
f(t_{11}), f(t_{22}), \ldots, f(t_{nn}).
$$

The entries above the diagonal are determined by recurrence relations involving divided differences or by specialized algorithms.

This approach is used for matrix exponentials, logarithms, square roots, and other matrix functions.
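SciPy's `scipy.linalg.funm` follows this route (a Schur-Parlett style scheme). The sketch below compares it with the dedicated `expm`, and checks the formula \(f(A)=Qf(T)Q^*\) on a symmetric matrix, where \(T\) is diagonal:

```python
import numpy as np
from scipy.linalg import funm, expm, schur

rng = np.random.default_rng(4)
A = rng.standard_normal((4, 4))

# Schur-based general matrix function vs. the dedicated exponential
assert np.allclose(funm(A, np.exp), expm(A), atol=1e-8)

# f(A) = Q f(T) Q^*, checked on a symmetric matrix where T is diagonal
S = A + A.T
T, Q = schur(S)
assert np.allclose(expm(S), Q @ np.diag(np.exp(np.diag(T))) @ Q.T)
```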

## 79.16 Schur Form and Stability

Schur decomposition is valuable because the factor \(Q\) is unitary. Unitary transformations preserve the Euclidean norm:

$$
\|Qx\|_2 = \|x\|_2.
$$

They also preserve the Frobenius norm:

$$
\|Q^*AQ\|_F = \|A\|_F.
$$

This matters in floating point arithmetic. When a matrix is transformed by poorly conditioned eigenvector matrices, small perturbations can be amplified. Schur decomposition avoids this by using orthonormal bases.

Thus many numerical algorithms prefer Schur vectors over eigenvectors, especially when eigenvectors are nearly linearly dependent.

## 79.17 Comparison with Jordan Form

Jordan canonical form writes a complex matrix as

$$
A = XJX^{-1},
$$

where \(J\) is block diagonal with Jordan blocks.

Jordan form gives detailed algebraic structure, but it is numerically unstable. Small perturbations can change the sizes of Jordan blocks. The eigenvector matrix \(X\) may also be ill-conditioned.

Schur decomposition writes

$$
A = QTQ^*.
$$

It gives less explicit algebraic information than Jordan form, but it is much more stable numerically.

| Feature | Jordan form | Schur form |
|---|---|---|
| Middle matrix | Jordan block matrix | Upper triangular matrix |
| Change of basis | General invertible matrix | Unitary matrix |
| Exists over \(\mathbb{C}\) | Yes | Yes |
| Numerically stable | No | Yes |
| Used in computation | Rarely | Commonly |
| Shows eigenvalues | Diagonal | Diagonal |

Jordan form is mainly theoretical. Schur form is both theoretical and computational.

## 79.18 Generalized Schur Decomposition

There is also a generalized Schur decomposition for a pair of square matrices \(A\) and \(B\).

It has the form

$$
A = QSZ^*,
\qquad
B = QTZ^*,
$$

where \(Q\) and \(Z\) are unitary, while \(S\) and \(T\) are upper triangular.

This decomposition is also called the QZ decomposition. It is used for generalized eigenvalue problems of the form

$$
Ax = \lambda Bx.
$$

The generalized eigenvalues are obtained from diagonal ratios:

$$
\lambda_i = \frac{s_{ii}}{t_{ii}},
$$

when \(t_{ii}\ne 0\).
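SciPy implements this as `scipy.linalg.qz`. With `output='complex'`, both factors are genuinely triangular and the diagonal ratios are the generalized eigenvalues (the random matrices below are just an example pair):

```python
import numpy as np
from scipy.linalg import qz

rng = np.random.default_rng(5)
A = rng.standard_normal((3, 3))
B = rng.standard_normal((3, 3))

# A = Q S Z^*, B = Q T Z^*, with S and T upper triangular
S, T, Q, Z = qz(A, B, output='complex')
assert np.allclose(Q @ S @ Z.conj().T, A)
assert np.allclose(Q @ T @ Z.conj().T, B)
assert np.allclose(np.tril(S, -1), 0)
assert np.allclose(np.tril(T, -1), 0)

# each ratio lambda_i = s_ii / t_ii solves det(A - lambda B) = 0
lam = np.diag(S) / np.diag(T)
for l in lam:
    assert abs(np.linalg.det(A - l * B)) < 1e-8
```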

## 79.19 Algorithmic Form

A full production algorithm for Schur decomposition is more involved than the elementary QR iteration. A typical dense eigenvalue pipeline has the following shape:

```text
schur_decomposition(A):
    reduce A to Hessenberg form H by orthogonal or unitary transformations

    repeat until H is in Schur or quasi-Schur form:
        choose shifts
        perform implicit shifted QR steps
        deflate converged eigenvalues or blocks

    accumulate the orthogonal or unitary transformations into Q

    return Q, T
```

The Hessenberg reduction is used because it preserves eigenvalues and greatly reduces the cost of subsequent QR iterations. A Hessenberg matrix is nearly triangular, having zeros below the first subdiagonal.
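The first stage of that pipeline is available on its own as `scipy.linalg.hessenberg`; since the reduction is an orthogonal similarity, the eigenvalues survive it:

```python
import numpy as np
from scipy.linalg import hessenberg

rng = np.random.default_rng(6)
A = rng.standard_normal((5, 5))

H, Q = hessenberg(A, calc_q=True)       # A = Q H Q^T
assert np.allclose(Q @ H @ Q.T, A)
assert np.allclose(np.tril(H, -2), 0)   # zeros below the first subdiagonal

# the orthogonal similarity preserves the spectrum
assert np.allclose(np.sort_complex(np.linalg.eigvals(H)),
                   np.sort_complex(np.linalg.eigvals(A)))
```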

For symmetric or Hermitian matrices, this process simplifies further, because the Schur form becomes diagonal and the Hessenberg form becomes tridiagonal.

## 79.20 Summary

Schur decomposition expresses a square matrix by unitary or orthogonal similarity.

For a complex matrix,

$$
A = QTQ^*,
$$

where \(Q\) is unitary and \(T\) is upper triangular.

For a real matrix,

$$
A = QTQ^T,
$$

where \(Q\) is orthogonal and \(T\) is real quasi-upper triangular.

The diagonal entries of the complex Schur form are the eigenvalues of \(A\). In the real Schur form, real eigenvalues appear as \(1 \times 1\) blocks, while complex conjugate pairs appear as \(2 \times 2\) blocks.

Schur decomposition is less explicit than Jordan form but far more useful in computation. It is stable, always exists over the complex numbers, supports eigenvalue algorithms, exposes invariant subspaces, and provides a reliable basis for computing matrix functions.
