# Chapter 82. Hessenberg Form

Hessenberg form is a nearly triangular matrix form used mainly in eigenvalue computation. It is less restrictive than triangular form, but structured enough to make algorithms faster.

A square matrix is in upper Hessenberg form if all entries below the first subdiagonal are zero. Thus an \(n \times n\) matrix \(H\) is upper Hessenberg when

$$
h_{ij}=0
\qquad
\text{whenever}
\qquad
i>j+1.
$$

It has the pattern

$$
H =
\begin{bmatrix}
h_{11} & h_{12} & h_{13} & \cdots & h_{1n} \\
h_{21} & h_{22} & h_{23} & \cdots & h_{2n} \\
0 & h_{32} & h_{33} & \cdots & h_{3n} \\
0 & 0 & h_{43} & \cdots & h_{4n} \\
\vdots & \vdots & \vdots & \ddots & \vdots \\
0 & 0 & 0 & h_{n,n-1} & h_{nn}
\end{bmatrix}.
$$

The upper Hessenberg form is important because the QR algorithm preserves it, and a QR step on a Hessenberg matrix costs much less than a QR step on a general dense matrix. Standard eigenvalue algorithms therefore first reduce a general matrix to Hessenberg form, then apply shifted QR iteration.

## 82.1 Definition

An upper Hessenberg matrix is a square matrix whose entries are zero below the first subdiagonal.

For example,

$$
\begin{bmatrix}
2 & 1 & 4 & 3 \\
5 & 6 & 7 & 8 \\
0 & 9 & 1 & 2 \\
0 & 0 & 3 & 4
\end{bmatrix}
$$

is upper Hessenberg.

The entries immediately below the diagonal may be nonzero:

$$
h_{21}, h_{32}, \ldots, h_{n,n-1}.
$$

All entries below those positions must be zero.
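The definition translates directly into a structural check. The following is a minimal NumPy sketch (the function name `is_upper_hessenberg` is illustrative, not a library routine): `np.tril(A, -2)` extracts exactly the entries with \(i > j + 1\), which must all vanish.

```python
import numpy as np

def is_upper_hessenberg(A, tol=0.0):
    """Return True if every entry below the first subdiagonal is zero.

    Checks h[i, j] == 0 for all i > j + 1, within an absolute tolerance.
    """
    A = np.asarray(A)
    # np.tril(A, -2) keeps exactly the entries with i > j + 1.
    return bool(np.all(np.abs(np.tril(A, -2)) <= tol))

H = np.array([[2, 1, 4, 3],
              [5, 6, 7, 8],
              [0, 9, 1, 2],
              [0, 0, 3, 4]])
print(is_upper_hessenberg(H))     # True: the example above
print(is_upper_hessenberg(H.T))   # False: the transpose is lower Hessenberg
```

The transpose of an upper Hessenberg matrix is lower Hessenberg, which the second call illustrates.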

A lower Hessenberg matrix is defined similarly, with all entries above the first superdiagonal equal to zero. In numerical eigenvalue algorithms, upper Hessenberg form is the more common convention.

## 82.2 Relation to Triangular Form

An upper triangular matrix has the form

$$
u_{ij}=0
\qquad
\text{whenever}
\qquad
i>j.
$$

An upper Hessenberg matrix relaxes this by allowing one extra nonzero diagonal below the main diagonal:

$$
h_{ij}=0
\qquad
\text{whenever}
\qquad
i>j+1.
$$

Thus every upper triangular matrix is upper Hessenberg, but not every upper Hessenberg matrix is upper triangular.

The difference is small in structure but large in algorithmic value. Hessenberg form is easier to reach by orthogonal similarity than triangular form, and it keeps enough sparsity to accelerate QR iteration.

## 82.3 Why Hessenberg Form Appears

Eigenvalue algorithms use similarity transformations because similarity preserves eigenvalues.

If

$$
H = Q^TAQ
$$

where \(Q\) is orthogonal, then \(H\) and \(A\) have the same eigenvalues. Hessenberg reduction seeks such a matrix \(Q\) so that \(H\) is upper Hessenberg:

$$
Q^TAQ = H.
$$

Equivalently,

$$
A = QHQ^T.
$$

The purpose is not to solve a linear system. The purpose is to prepare the matrix for eigenvalue iteration. Hessenberg form is the standard intermediate form between a dense matrix and Schur form.

## 82.4 Orthogonal Similarity

Hessenberg reduction is usually done by orthogonal similarity transformations.

A transformation

$$
A \mapsto Q^TAQ
$$

is called an orthogonal similarity transformation when

$$
Q^TQ = I.
$$

It preserves eigenvalues because

$$
Q^TAQ
$$

is similar to \(A\), with change-of-basis matrix \(Q\).

It also preserves numerical stability because orthogonal matrices preserve Euclidean length:

$$
\|Qx\|_2 = \|x\|_2.
$$

This matters in floating point arithmetic. A general similarity transformation can amplify error if the change-of-basis matrix is ill-conditioned. Orthogonal similarity avoids that source of instability.
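Both invariance properties can be checked numerically. The sketch below uses a triangular test matrix so the true spectrum is known in advance (its diagonal), and builds an orthogonal \(Q\) from a QR factorization; the setup is illustrative, not a standard recipe.

```python
import numpy as np

rng = np.random.default_rng(0)
# Upper triangular A with diagonal 1..5, so its eigenvalues are exactly 1..5.
A = np.triu(rng.standard_normal((5, 5)), 1) + np.diag([1.0, 2.0, 3.0, 4.0, 5.0])

# An orthogonal Q, taken from the QR factorization of a random matrix.
Q, _ = np.linalg.qr(rng.standard_normal((5, 5)))
B = Q.T @ A @ Q                    # orthogonal similarity transformation

# The spectrum is unchanged: still 1, 2, 3, 4, 5 up to rounding.
print(np.allclose(np.sort(np.linalg.eigvals(B).real), [1, 2, 3, 4, 5]))  # True

# Orthogonal matrices preserve Euclidean length.
x = rng.standard_normal(5)
print(np.isclose(np.linalg.norm(Q @ x), np.linalg.norm(x)))  # True
```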

## 82.5 Reduction by Householder Reflections

The standard dense method for reducing a general matrix to Hessenberg form uses Householder reflections.

A Householder reflection has the form

$$
P = I - 2vv^T,
$$

where

$$
\|v\|_2 = 1.
$$

It is symmetric and orthogonal:

$$
P^T=P,
\qquad
P^TP=I.
$$

At step \(k\), a Householder reflection is chosen to zero entries below the first subdiagonal in column \(k\). That is, it zeros the entries

$$
a_{k+2,k}, a_{k+3,k}, \ldots, a_{n,k}.
$$

The transformation is applied from both sides:

$$
A \leftarrow P_k A P_k.
$$

Applying it from the left creates zeros in a column. Applying it from the right preserves similarity and therefore preserves eigenvalues.

After \(n-2\) such steps, the matrix is upper Hessenberg.

## 82.6 A Four by Four Pattern

For a \(4 \times 4\) matrix,

$$
A =
\begin{bmatrix}
* & * & * & * \\
* & * & * & * \\
* & * & * & * \\
* & * & * & *
\end{bmatrix},
$$

Hessenberg reduction aims for

$$
H =
\begin{bmatrix}
* & * & * & * \\
* & * & * & * \\
0 & * & * & * \\
0 & 0 & * & *
\end{bmatrix}.
$$

Only three entries are forced to zero below the first subdiagonal:

$$
h_{31}=0,
\qquad
h_{41}=0,
\qquad
h_{42}=0.
$$

For larger matrices, the same pattern continues. Column \(1\) has zeros below row \(2\). Column \(2\) has zeros below row \(3\). Column \(3\) has zeros below row \(4\), and so on.

## 82.7 Example of Hessenberg Structure

The matrix

$$
H =
\begin{bmatrix}
4 & 1 & -2 & 3 \\
2 & 5 & 0 & 1 \\
0 & 7 & 6 & 8 \\
0 & 0 & -3 & 2
\end{bmatrix}
$$

is upper Hessenberg.

The entries below the first subdiagonal are

$$
h_{31}, h_{41}, h_{42}.
$$

They are all zero. The entries

$$
h_{21}, h_{32}, h_{43}
$$

are allowed to be nonzero.

By contrast,

$$
\begin{bmatrix}
4 & 1 & -2 & 3 \\
2 & 5 & 0 & 1 \\
9 & 7 & 6 & 8 \\
0 & 0 & -3 & 2
\end{bmatrix}
$$

is not upper Hessenberg because the entry in position \((3,1)\) is nonzero.

## 82.8 Symmetric Matrices Become Tridiagonal

If \(A\) is symmetric and \(H=Q^TAQ\), then \(H\) is also symmetric:

$$
H^T =
(Q^TAQ)^T =
Q^TA^TQ =
Q^TAQ =
H.
$$

If \(H\) is both symmetric and upper Hessenberg, then it must be tridiagonal.

A tridiagonal matrix has nonzero entries only on the main diagonal, the first subdiagonal, and the first superdiagonal:

$$
T =
\begin{bmatrix}
d_1 & e_1 & 0 & \cdots & 0 \\
e_1 & d_2 & e_2 & \cdots & 0 \\
0 & e_2 & d_3 & \cdots & 0 \\
\vdots & \vdots & \vdots & \ddots & e_{n-1} \\
0 & 0 & 0 & e_{n-1} & d_n
\end{bmatrix}.
$$

Thus Hessenberg reduction specializes to tridiagonal reduction for symmetric matrices. This is a major simplification in symmetric eigenvalue problems.
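The specialization can be observed directly. The sketch below applies a dense Hessenberg reduction, here via `scipy.linalg.hessenberg` (assuming SciPy is available), to a symmetric matrix and checks that the result is tridiagonal up to rounding.

```python
import numpy as np
from scipy.linalg import hessenberg   # SciPy's dense Hessenberg reduction

rng = np.random.default_rng(1)
S = rng.standard_normal((6, 6))
A = S + S.T                           # symmetric test matrix

H, Q = hessenberg(A, calc_q=True)

# Symmetric + upper Hessenberg => tridiagonal. The entries above the first
# superdiagonal are not explicitly zeroed, so they vanish only up to rounding.
print(np.allclose(np.tril(H, -2), 0))   # zeros below the first subdiagonal
print(np.allclose(np.triu(H, 2), 0))    # zeros above the first superdiagonal
print(np.allclose(Q.T @ A @ Q, H))      # similarity holds
```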

## 82.9 Preservation under QR Iteration

The QR algorithm repeatedly factors a matrix as

$$
A_k = Q_kR_k
$$

and forms

$$
A_{k+1} = R_kQ_k.
$$

Since

$$
A_{k+1}=Q_k^TA_kQ_k,
$$

each step is an orthogonal similarity transformation and preserves eigenvalues.

If \(A_k\) is upper Hessenberg, then \(A_{k+1}\) is also upper Hessenberg. This preservation property is the main reason Hessenberg form is used before QR iteration.

Without Hessenberg form, each QR step on a dense matrix is expensive. With Hessenberg form, the structure limits the amount of fill-in and reduces the work per iteration.
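The preservation property can be verified empirically. This minimal sketch starts from a random upper Hessenberg matrix, runs a few unshifted QR steps with `np.linalg.qr`, and checks that each iterate keeps the Hessenberg pattern up to rounding.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 6
# Random upper Hessenberg start: dense from the first subdiagonal upward.
Hk = np.triu(rng.standard_normal((n, n)), -1)

for step in range(5):
    Qk, Rk = np.linalg.qr(Hk)
    Hk = Rk @ Qk                      # A_{k+1} = R_k Q_k = Q_k^T A_k Q_k
    # Each iterate stays upper Hessenberg up to rounding.
    print(step, np.allclose(np.tril(Hk, -2), 0))
```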

## 82.10 Cost Advantage

For a dense \(n \times n\) matrix, reducing the matrix to upper Hessenberg form costs on the order of

$$
n^3
$$

operations.

This is a one-time preprocessing cost.

After reduction, each QR iteration can be performed in

$$
O(n^2)
$$

operations rather than

$$
O(n^3)
$$

operations for a general dense matrix. This is the central computational advantage. The QR algorithm becomes practical because it works on the Hessenberg form instead of the original dense matrix.

## 82.11 Unreduced Hessenberg Matrices

An upper Hessenberg matrix is called unreduced if all its subdiagonal entries are nonzero:

$$
h_{21}, h_{32}, \ldots, h_{n,n-1} \ne 0.
$$

If some subdiagonal entry is zero, say

$$
h_{k+1,k}=0,
$$

then the matrix splits into block upper triangular form:

$$
H =
\begin{bmatrix}
H_{11} & H_{12} \\
0 & H_{22}
\end{bmatrix}.
$$

The eigenvalues of \(H\) are then the eigenvalues of \(H_{11}\) together with the eigenvalues of \(H_{22}\). The eigenvalue problem separates into smaller problems.

This splitting is called deflation in eigenvalue algorithms.
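The splitting of the spectrum can be checked on a small example. Below, a \(4 \times 4\) upper Hessenberg matrix with \(h_{32} = 0\) is cut into its two diagonal blocks, and the eigenvalues of the blocks together reproduce the eigenvalues of the whole matrix; the particular entries are illustrative.

```python
import numpy as np

# Upper Hessenberg with h_{32} = 0: block upper triangular, split at k = 2.
H = np.array([[4.0, 1.0, -2.0, 3.0],
              [2.0, 5.0,  0.0, 1.0],
              [0.0, 0.0,  6.0, 8.0],
              [0.0, 0.0, -3.0, 2.0]])

H11 = H[:2, :2]
H22 = H[2:, 2:]

eig_full = np.sort_complex(np.linalg.eigvals(H))
eig_blocks = np.sort_complex(np.concatenate([np.linalg.eigvals(H11),
                                             np.linalg.eigvals(H22)]))
print(np.allclose(eig_full, eig_blocks))  # True: spectra agree
```

Here \(H_{11}\) contributes two real eigenvalues and \(H_{22}\) a complex conjugate pair, so the example also shows that the blocks need not have real spectra.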

## 82.12 Deflation

Deflation occurs when a subdiagonal entry becomes zero or small enough to treat as zero.

Suppose

$$
h_{k+1,k} \approx 0.
$$

Then the matrix is nearly block upper triangular:

$$
H \approx
\begin{bmatrix}
H_{11} & H_{12} \\
0 & H_{22}
\end{bmatrix}.
$$

The QR algorithm can then continue separately on the diagonal blocks. This reduces the active problem size.

In practice, deflation is one of the mechanisms that makes QR eigenvalue computation efficient. The algorithm gradually drives subdiagonal entries toward zero, allowing eigenvalues or small blocks to split off.

## 82.13 Real Hessenberg Form and Complex Eigenvalues

A real upper Hessenberg matrix can have complex eigenvalues. Since the matrix entries remain real, complex eigenvalues appear in conjugate pairs.

In the real QR algorithm, convergence usually produces a real Schur form rather than a fully diagonal matrix. The final form contains \(1 \times 1\) blocks for real eigenvalues and \(2 \times 2\) blocks for complex conjugate pairs.

Hessenberg form is therefore an intermediate form. It is not the final eigenvalue form. It is the structured form from which the QR algorithm efficiently approaches Schur form.
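A tiny example of a complex conjugate pair arising from a real matrix (here a \(2 \times 2\) rotation, which is the kind of block that survives in real Schur form):

```python
import numpy as np

# A real 2x2 rotation block: no real eigenvalues.
H = np.array([[0.0, -1.0],
              [1.0,  0.0]])

lam = np.sort_complex(np.linalg.eigvals(H))
print(lam)   # conjugate pair: -1j and +1j (up to rounding)
```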

## 82.14 Algorithm

The following high-level algorithm reduces a real square matrix to upper Hessenberg form.

```text
hessenberg_reduction(A):
    n = number of rows of A
    Q = identity matrix of size n

    for k = 1 to n - 2:
        choose a Householder reflector Pk
        that zeros A[k+2:n, k]

        A = Pk A Pk
        Q = Q Pk

    H = A
    return Q, H
```

The result satisfies

$$
H = Q^TAQ.
$$

Equivalently,

$$
A = QHQ^T.
$$

In practical implementations, the full Householder matrices \(P_k\) are not formed explicitly. Their defining vectors are stored compactly and applied to the relevant submatrices.
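As a concrete sketch, the pseudocode above can be translated into NumPy as follows. This is an unblocked teaching implementation, not a production routine (production codes store the reflectors compactly, as noted above); the reflectors are applied as rank-one updates rather than as explicit matrices.

```python
import numpy as np

def hessenberg_reduction(A):
    """Reduce a real square matrix to upper Hessenberg form.

    Returns (Q, H) with H = Q.T @ A @ Q, Q orthogonal, H upper Hessenberg.
    A direct, unblocked translation of the pseudocode above.
    """
    A = np.array(A, dtype=float)   # work on a copy
    n = A.shape[0]
    Q = np.eye(n)
    for k in range(n - 2):
        x = A[k + 1:, k]
        v = x.copy()
        v[0] += np.copysign(np.linalg.norm(x), x[0])  # avoid cancellation
        norm_v = np.linalg.norm(v)
        if norm_v == 0.0:          # column already reduced
            continue
        v /= norm_v                # unit Householder vector
        # Apply P = I - 2 v v^T from the left, then from the right
        # (the two-sided application keeps the transform a similarity).
        A[k + 1:, :] -= 2.0 * np.outer(v, v @ A[k + 1:, :])
        A[:, k + 1:] -= 2.0 * np.outer(A[:, k + 1:] @ v, v)
        Q[:, k + 1:] -= 2.0 * np.outer(Q[:, k + 1:] @ v, v)
        A[k + 2:, k] = 0.0         # set exact zeros below the subdiagonal
    return Q, A

rng = np.random.default_rng(3)
A = rng.standard_normal((6, 6))
Q, H = hessenberg_reduction(A)
print(np.allclose(Q.T @ Q, np.eye(6)))   # Q is orthogonal
print(np.allclose(np.tril(H, -2), 0))    # H is upper Hessenberg
print(np.allclose(Q.T @ A @ Q, H))       # H = Q^T A Q
```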

## 82.15 Storage in Practice

Dense Hessenberg reduction commonly overwrites the original matrix.

The upper Hessenberg part stores \(H\). The Householder vectors used to construct \(Q\) may be stored in the entries below the first subdiagonal, since those entries are zero in \(H\).

| Object | Practical storage |
|---|---|
| \(H\) | Upper Hessenberg part of the array |
| Householder vectors | Strictly lower unused region |
| \(Q\) | Stored implicitly |
| Eigenvalues | Later extracted from Schur form |

This is similar in spirit to Householder QR storage. The orthogonal transformations are stored implicitly unless the explicit matrix \(Q\) is needed.

## 82.16 Hessenberg Form and Krylov Subspaces

Hessenberg matrices also appear in Krylov subspace methods.

Given a matrix \(A\) and a vector \(b\), the Krylov subspace of order \(k\) is

$$
\mathcal{K}_k(A,b) =
\operatorname{span}
\{b, Ab, A^2b, \ldots, A^{k-1}b\}.
$$

Arnoldi iteration constructs an orthonormal basis

$$
q_1,\ldots,q_k
$$

for this subspace. In that basis, the projected action of \(A\) is represented by an upper Hessenberg matrix:

$$
AQ_k = Q_{k+1}\overline{H}_k.
$$

The Hessenberg matrix \(\overline{H}_k\) is small compared with \(A\). It captures how \(A\) acts on the Krylov basis. This is the foundation for many iterative eigenvalue and linear system methods.

## 82.17 Arnoldi Hessenberg Matrix

In Arnoldi iteration, each new vector is obtained by applying \(A\) to the current basis vector and then orthogonalizing against previous basis vectors.

The recurrence has the form

$$
Aq_j =
h_{1j}q_1
+
h_{2j}q_2
+
\cdots
+
h_{j+1,j}q_{j+1}.
$$

Only basis vectors up to \(q_{j+1}\) appear. This is why the coefficient matrix is upper Hessenberg.

The Hessenberg structure encodes this cutoff in the recurrence. It records that the image \(Aq_j\) may create one new direction, \(q_{j+1}\), but no more than one.
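The recurrence can be sketched as a short Arnoldi routine. The sketch below uses modified Gram-Schmidt and assumes no breakdown (no subdiagonal entry of \(\overline{H}_k\) vanishes); it verifies the relation \(AQ_k = Q_{k+1}\overline{H}_k\) from the previous section.

```python
import numpy as np

def arnoldi(A, b, k):
    """Run k steps of Arnoldi iteration.

    Returns Q with k+1 orthonormal columns and the (k+1) x k upper
    Hessenberg matrix Hbar satisfying A @ Q[:, :k] = Q @ Hbar.
    """
    n = A.shape[0]
    Q = np.zeros((n, k + 1))
    Hbar = np.zeros((k + 1, k))
    Q[:, 0] = b / np.linalg.norm(b)
    for j in range(k):
        w = A @ Q[:, j]
        # Orthogonalize against q_1, ..., q_{j+1} (modified Gram-Schmidt).
        for i in range(j + 1):
            Hbar[i, j] = Q[:, i] @ w
            w -= Hbar[i, j] * Q[:, i]
        Hbar[j + 1, j] = np.linalg.norm(w)
        Q[:, j + 1] = w / Hbar[j + 1, j]   # assumes no breakdown
    return Q, Hbar

rng = np.random.default_rng(4)
A = rng.standard_normal((8, 8))
b = rng.standard_normal(8)
Q, Hbar = arnoldi(A, b, 4)
print(np.allclose(A @ Q[:, :4], Q @ Hbar))        # Arnoldi relation
print(np.allclose(Q.T @ Q, np.eye(5)))            # orthonormal basis
print(np.allclose(np.tril(Hbar[:4, :4], -2), 0))  # Hessenberg structure
```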

## 82.18 Comparison with Other Forms

| Form | Pattern | Main use |
|---|---|---|
| Upper triangular | Zeros below diagonal | Schur form, triangular solves |
| Upper Hessenberg | Zeros below first subdiagonal | General eigenvalue computation |
| Tridiagonal | Zeros outside three diagonals | Symmetric eigenvalue computation |
| Bidiagonal | Nonzeros on main and one adjacent diagonal | Singular value computation |
| Diagonal | Nonzeros only on main diagonal | Fully decoupled action |

Hessenberg form is a compromise. It is more structured than a dense matrix but less restrictive than triangular form. That compromise makes it reachable by stable transformations and useful for iterative eigenvalue algorithms.

## 82.19 Common Pitfalls

A Hessenberg matrix should not be confused with a triangular matrix. The first subdiagonal is allowed to contain nonzero entries.

Hessenberg reduction should also not be confused with Gaussian elimination. Gaussian elimination uses row operations and changes the eigenvalues in general. Hessenberg reduction uses similarity transformations and preserves eigenvalues.

Finally, Hessenberg form does not usually reveal eigenvalues directly. It prepares the matrix for QR iteration. The eigenvalues are obtained after further reduction toward Schur form.

## 82.20 Summary

An upper Hessenberg matrix has zeros below the first subdiagonal:

$$
h_{ij}=0
\qquad
\text{for}
\qquad
i>j+1.
$$

Every square matrix can be reduced to upper Hessenberg form by orthogonal similarity transformations:

$$
H = Q^TAQ.
$$

This reduction preserves eigenvalues and improves numerical behavior. Its main use is as a preprocessing step for the QR algorithm. Since QR iteration preserves Hessenberg structure, each iteration becomes much cheaper than it would be on a dense matrix.

For symmetric matrices, Hessenberg form becomes tridiagonal form. In Krylov methods, Hessenberg matrices arise naturally from Arnoldi iteration. Thus Hessenberg form connects direct eigenvalue algorithms, iterative methods, and the structural theory of matrix computation.
