
Chapter 82. Hessenberg Form

Hessenberg form is a nearly triangular matrix form used mainly in eigenvalue computation. It is less restrictive than triangular form, but structured enough to make algorithms faster.

A square matrix is in upper Hessenberg form if all entries below the first subdiagonal are zero. Thus an $n \times n$ matrix $H$ is upper Hessenberg when

$$ h_{ij}=0 \qquad \text{whenever} \qquad i>j+1. $$

It has the pattern

$$ H = \begin{bmatrix} h_{11} & h_{12} & h_{13} & \cdots & h_{1n} \\ h_{21} & h_{22} & h_{23} & \cdots & h_{2n} \\ 0 & h_{32} & h_{33} & \cdots & h_{3n} \\ 0 & 0 & h_{43} & \cdots & h_{4n} \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & 0 & h_{n,n-1} & h_{nn} \end{bmatrix}. $$

The upper Hessenberg form is important because the QR algorithm preserves it, and a QR step on a Hessenberg matrix costs much less than a QR step on a general dense matrix. Standard eigenvalue algorithms therefore first reduce a general matrix to Hessenberg form, then apply shifted QR iteration.

82.1 Definition

An upper Hessenberg matrix is a square matrix whose entries are zero below the first subdiagonal.

For example,

$$ \begin{bmatrix} 2 & 1 & 4 & 3 \\ 5 & 6 & 7 & 8 \\ 0 & 9 & 1 & 2 \\ 0 & 0 & 3 & 4 \end{bmatrix} $$

is upper Hessenberg.

The entries immediately below the diagonal may be nonzero:

$$ h_{21}, h_{32}, \ldots, h_{n,n-1}. $$

All entries below those positions must be zero.

A lower Hessenberg matrix is defined similarly, with all entries above the first superdiagonal equal to zero. In numerical eigenvalue algorithms, upper Hessenberg form is the more common convention.
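The definition translates directly into code. A small NumPy sketch (the helper name `is_upper_hessenberg` is ours) checks the example matrix above:

```python
import numpy as np

def is_upper_hessenberg(M, tol=0.0):
    """True when every entry below the first subdiagonal is (within tol of) zero."""
    n = M.shape[0]
    # The definition: M[i, j] must vanish whenever i > j + 1.
    return all(abs(M[i, j]) <= tol
               for i in range(n) for j in range(n) if i > j + 1)

# The 4x4 example from the text.
H = np.array([[2., 1., 4., 3.],
              [5., 6., 7., 8.],
              [0., 9., 1., 2.],
              [0., 0., 3., 4.]])
print(is_upper_hessenberg(H))   # True
```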

82.2 Relation to Triangular Form

An upper triangular matrix has the form

$$ u_{ij}=0 \qquad \text{whenever} \qquad i>j. $$

An upper Hessenberg matrix relaxes this by allowing one extra nonzero diagonal below the main diagonal:

$$ h_{ij}=0 \qquad \text{whenever} \qquad i>j+1. $$

Thus every upper triangular matrix is upper Hessenberg, but not every upper Hessenberg matrix is upper triangular.

The difference is small in structure but large in algorithmic value. Hessenberg form is easier to reach by orthogonal similarity than triangular form, and it keeps enough sparsity to accelerate QR iteration.

82.3 Why Hessenberg Form Appears

Eigenvalue algorithms use similarity transformations because similarity preserves eigenvalues.

If

$$ H = Q^TAQ $$

where $Q$ is orthogonal, then $H$ and $A$ have the same eigenvalues. Hessenberg reduction seeks such a matrix $Q$ so that $H$ is upper Hessenberg:

$$ Q^TAQ = H. $$

Equivalently,

$$ A = QHQ^T. $$

The purpose is not to solve a linear system. The purpose is to prepare the matrix for eigenvalue iteration. Hessenberg form is the standard intermediate form between a dense matrix and Schur form.
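This preprocessing step is available directly in SciPy. A sketch using `scipy.linalg.hessenberg` with `calc_q=True` on random data verifies that the similarity leaves the spectrum unchanged:

```python
import numpy as np
from scipy.linalg import hessenberg

rng = np.random.default_rng(0)
A = rng.standard_normal((6, 6))

# Reduce to upper Hessenberg form: A = Q H Q^T with Q orthogonal.
H, Q = hessenberg(A, calc_q=True)

# Similarity preserves the spectrum.
eigs_A = np.sort_complex(np.linalg.eigvals(A))
eigs_H = np.sort_complex(np.linalg.eigvals(H))
print(np.allclose(eigs_A, eigs_H))   # True
```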

82.4 Orthogonal Similarity

Hessenberg reduction is usually done by orthogonal similarity transformations.

A transformation

$$ A \mapsto Q^TAQ $$

is called an orthogonal similarity transformation when

$$ Q^TQ = I. $$

It preserves eigenvalues because $Q^TAQ$ is similar to $A$, with change-of-basis matrix $Q$.

It also preserves numerical stability because orthogonal matrices preserve Euclidean length:

$$ \|Qx\|_2 = \|x\|_2. $$

This matters in floating point arithmetic. A general similarity transformation can amplify error if the change-of-basis matrix is ill-conditioned. Orthogonal similarity avoids that source of instability.
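A small numerical illustration (NumPy, random data): an orthogonal change of basis preserves both vector lengths and the spectrum.

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.standard_normal((5, 5))
x = rng.standard_normal(5)

# An orthogonal Q from the QR factorization of a random matrix.
Q, _ = np.linalg.qr(rng.standard_normal((5, 5)))
B = Q.T @ A @ Q                          # orthogonal similarity

len_preserved = np.isclose(np.linalg.norm(Q @ x), np.linalg.norm(x))
spec_preserved = np.allclose(np.sort_complex(np.linalg.eigvals(A)),
                             np.sort_complex(np.linalg.eigvals(B)))
print(len_preserved, spec_preserved)     # True True
```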

82.5 Reduction by Householder Reflections

The standard dense method for reducing a general matrix to Hessenberg form uses Householder reflections.

A Householder reflection has the form

$$ P = I - 2vv^T, $$

where

$$ \|v\|_2 = 1. $$

It is symmetric and orthogonal:

$$ P^T=P, \qquad P^TP=I. $$

At step $k$, a Householder reflection is chosen to zero the entries below the first subdiagonal in column $k$. That is, it zeros the entries

$$ a_{k+2,k}, a_{k+3,k}, \ldots, a_{n,k}. $$

The transformation is applied from both sides:

$$ A \leftarrow P_k A P_k. $$

Applying it from the left creates zeros in a column. Applying it from the right preserves similarity and therefore preserves eigenvalues.

After $n-2$ such steps, the matrix is upper Hessenberg.
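One such step can be sketched in NumPy. The helper `householder_vector` is our illustrative name, and the input values are made up; the reflector maps the subcolumn to a multiple of $e_1$, zeroing everything below its first entry:

```python
import numpy as np

def householder_vector(x):
    """Unit v with (I - 2 v v^T) x = alpha * e1 for some scalar alpha."""
    v = x.astype(float).copy()
    # Match signs on the first entry to avoid cancellation.
    v[0] += np.copysign(np.linalg.norm(x), x[0])
    return v / np.linalg.norm(v)

# Step k = 1 on a 4x4 matrix: the reflector acts on (a21, a31, a41)
# and zeros the last two of them (a31 and a41).
x = np.array([5., 3., 4.])               # assumed example values
v = householder_vector(x)
P = np.eye(3) - 2.0 * np.outer(v, v)

y = P @ x                                # multiple of e1: entries 2 and 3 are zero
```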

82.6 A Four by Four Pattern

For a $4 \times 4$ matrix,

$$ A = \begin{bmatrix} * & * & * & * \\ * & * & * & * \\ * & * & * & * \\ * & * & * & * \end{bmatrix}, $$

Hessenberg reduction aims for

$$ H = \begin{bmatrix} * & * & * & * \\ * & * & * & * \\ 0 & * & * & * \\ 0 & 0 & * & * \end{bmatrix}. $$

Only three entries are forced to zero below the first subdiagonal:

$$ h_{31}=0, \qquad h_{41}=0, \qquad h_{42}=0. $$

For larger matrices, the same pattern continues. Column 1 has zeros below row 2. Column 2 has zeros below row 3. Column 3 has zeros below row 4, and so on.

82.7 Example of Hessenberg Structure

The matrix

$$ H = \begin{bmatrix} 4 & 1 & -2 & 3 \\ 2 & 5 & 0 & 1 \\ 0 & 7 & 6 & 8 \\ 0 & 0 & -3 & 2 \end{bmatrix} $$

is upper Hessenberg.

The entries below the first subdiagonal are

$$ h_{31}, h_{41}, h_{42}. $$

They are all zero. The entries

$$ h_{21}, h_{32}, h_{43} $$

are allowed to be nonzero.

By contrast,

$$ \begin{bmatrix} 4 & 1 & -2 & 3 \\ 2 & 5 & 0 & 1 \\ 9 & 7 & 6 & 8 \\ 0 & 0 & -3 & 2 \end{bmatrix} $$

is not upper Hessenberg because the entry in position $(3,1)$ is nonzero.

82.8 Symmetric Matrices Become Tridiagonal

If $A$ is symmetric and $H=Q^TAQ$, then $H$ is also symmetric:

$$ H^T = (Q^TAQ)^T = Q^TA^TQ = Q^TAQ = H. $$

If $H$ is both symmetric and upper Hessenberg, then it must be tridiagonal.

A tridiagonal matrix has nonzero entries only on the main diagonal, the first subdiagonal, and the first superdiagonal:

$$ T = \begin{bmatrix} d_1 & e_1 & 0 & \cdots & 0 \\ e_1 & d_2 & e_2 & \cdots & 0 \\ 0 & e_2 & d_3 & \cdots & 0 \\ \vdots & \vdots & \vdots & \ddots & e_{n-1} \\ 0 & 0 & 0 & e_{n-1} & d_n \end{bmatrix}. $$

Thus Hessenberg reduction specializes to tridiagonal reduction for symmetric matrices. This is a major simplification in symmetric eigenvalue problems.
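This specialization is easy to observe numerically. Applying `scipy.linalg.hessenberg` to a symmetric matrix leaves nothing outside the three central diagonals (up to rounding):

```python
import numpy as np
from scipy.linalg import hessenberg

rng = np.random.default_rng(2)
S = rng.standard_normal((6, 6))
A = S + S.T                              # symmetric test matrix

T, Q = hessenberg(A, calc_q=True)

# Symmetric + upper Hessenberg => tridiagonal: keep only the three-band part.
band = np.triu(np.tril(T, 1), -1)
print(np.allclose(T, band))              # True
```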

82.9 Preservation under QR Iteration

The QR algorithm repeatedly factors a matrix as

$$ A_k = Q_kR_k $$

and forms

$$ A_{k+1} = R_kQ_k. $$

Since

$$ A_{k+1} = Q_k^TA_kQ_k, $$

each step is an orthogonal similarity transformation and preserves eigenvalues.

If $A_k$ is upper Hessenberg, then $A_{k+1}$ is also upper Hessenberg. This preservation property is the main reason Hessenberg form is used before QR iteration.

Without Hessenberg form, each QR step on a dense matrix is expensive. With Hessenberg form, the structure limits the amount of fill-in and reduces the work per iteration.
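The preservation is visible in a direct experiment: one unshifted QR step on a Hessenberg matrix returns a Hessenberg matrix with the same spectrum.

```python
import numpy as np
from scipy.linalg import hessenberg

rng = np.random.default_rng(3)
H = hessenberg(rng.standard_normal((6, 6)))   # a random upper Hessenberg matrix

Qk, Rk = np.linalg.qr(H)
H_next = Rk @ Qk                              # one unshifted QR step

# Entries below the first subdiagonal stay (numerically) zero.
print(np.allclose(np.tril(H_next, -2), 0))    # True
```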

82.10 Cost Advantage

For a dense $n \times n$ matrix, reducing the matrix to upper Hessenberg form costs on the order of $n^3$ operations.

This is a one-time preprocessing cost.

After reduction, each QR iteration can be performed in $O(n^2)$ operations rather than the $O(n^3)$ operations needed for a general dense matrix. This is the central computational advantage. The QR algorithm becomes practical because it works on the Hessenberg form instead of the original dense matrix.

82.11 Unreduced Hessenberg Matrices

An upper Hessenberg matrix is called unreduced if all its subdiagonal entries are nonzero:

$$ h_{21}, h_{32}, \ldots, h_{n,n-1} \ne 0. $$

If some subdiagonal entry is zero, say

$$ h_{k+1,k}=0, $$

then the matrix splits into block upper triangular form:

$$ H = \begin{bmatrix} H_{11} & H_{12} \\ 0 & H_{22} \end{bmatrix}. $$

The eigenvalues of $H$ are then the eigenvalues of $H_{11}$ together with the eigenvalues of $H_{22}$. The eigenvalue problem separates into smaller problems.

This splitting is called deflation in eigenvalue algorithms.
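The splitting is easy to verify on a made-up example with $h_{32}=0$: the eigenvalues of the full matrix are exactly the eigenvalues of the two diagonal blocks.

```python
import numpy as np

# Upper Hessenberg with h32 = 0: the matrix is block upper triangular.
H = np.array([[4., 1., 7., 2.],
              [3., 5., 1., 6.],
              [0., 0., 2., 8.],      # zero subdiagonal entry h32
              [0., 0., 1., 9.]])

H11, H22 = H[:2, :2], H[2:, 2:]
eigs_blocks = np.concatenate([np.linalg.eigvals(H11), np.linalg.eigvals(H22)])
eigs_full = np.linalg.eigvals(H)
print(np.allclose(np.sort_complex(eigs_full), np.sort_complex(eigs_blocks)))  # True
```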

82.12 Deflation

Deflation occurs when a subdiagonal entry becomes zero or small enough to treat as zero.

Suppose

$$ h_{k+1,k} \approx 0. $$

Then the matrix is nearly block upper triangular:

$$ H \approx \begin{bmatrix} H_{11} & H_{12} \\ 0 & H_{22} \end{bmatrix}. $$

The QR algorithm can then continue separately on the diagonal blocks. This reduces the active problem size.

In practice, deflation is one of the mechanisms that makes QR eigenvalue computation efficient. The algorithm gradually drives subdiagonal entries toward zero, allowing eigenvalues or small blocks to split off.

82.13 Real Hessenberg Form and Complex Eigenvalues

A real upper Hessenberg matrix can have complex eigenvalues. Since the matrix entries remain real, complex eigenvalues appear in conjugate pairs.

In the real QR algorithm, convergence usually produces a real Schur form rather than a fully diagonal matrix. The final form contains $1 \times 1$ blocks for real eigenvalues and $2 \times 2$ blocks for complex conjugate pairs.

Hessenberg form is therefore an intermediate form. It is not the final eigenvalue form. It is the structured form from which the QR algorithm efficiently approaches Schur form.

82.14 Algorithm

The following high-level algorithm reduces a real square matrix to upper Hessenberg form.

hessenberg_reduction(A):
    n = number of rows of A
    Q = identity matrix of size n

    for k = 1 to n - 2:
        choose a Householder reflector Pk
        that zeros A[k+2:n, k]

        A = Pk A Pk
        Q = Q Pk

    H = A
    return Q, H

The result satisfies

$$ H = Q^TAQ. $$

Equivalently,

$$ A = QHQ^T. $$

In practical implementations, the full Householder matrices $P_k$ are not formed explicitly. Their defining vectors are stored compactly and applied to the relevant submatrices.
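The pseudocode above translates to NumPy as follows. This is a minimal dense sketch that applies each reflector as a rank-one update rather than forming $P_k$, not the packed-storage routine a production library would use:

```python
import numpy as np

def hessenberg_reduction(A):
    """Householder reduction to upper Hessenberg form: returns Q, H with H = Q^T A Q."""
    H = A.astype(float).copy()
    n = H.shape[0]
    Q = np.eye(n)
    for k in range(n - 2):
        x = H[k + 1:, k]
        norm_x = np.linalg.norm(x)
        if norm_x == 0.0:
            continue                          # column already in Hessenberg form
        v = x.copy()
        v[0] += np.copysign(norm_x, x[0])     # sign choice avoids cancellation
        v /= np.linalg.norm(v)
        # Similarity with P = I - 2 v v^T embedded in rows/cols k+1..n-1:
        H[k + 1:, k:] -= 2.0 * np.outer(v, v @ H[k + 1:, k:])   # left:  P A
        H[:, k + 1:] -= 2.0 * np.outer(H[:, k + 1:] @ v, v)     # right: A P
        Q[:, k + 1:] -= 2.0 * np.outer(Q[:, k + 1:] @ v, v)     # accumulate Q
        H[k + 2:, k] = 0.0                    # clean out rounding residue
    return Q, H

rng = np.random.default_rng(5)
A = rng.standard_normal((6, 6))
Q, H = hessenberg_reduction(A)                # H = Q^T A Q, upper Hessenberg
```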

82.15 Storage in Practice

Dense Hessenberg reduction commonly overwrites the original matrix.

The upper Hessenberg part stores $H$. The Householder vectors used to construct $Q$ may be stored in the entries below the first subdiagonal, since those entries are zero in $H$.

| Object | Practical storage |
| --- | --- |
| $H$ | Upper Hessenberg part of the array |
| Householder vectors | Strictly lower unused region |
| $Q$ | Stored implicitly |
| Eigenvalues | Later extracted from Schur form |

This is similar in spirit to Householder QR storage. The orthogonal transformations are stored implicitly unless the explicit matrix $Q$ is needed.

82.16 Hessenberg Form and Krylov Subspaces

Hessenberg matrices also appear in Krylov subspace methods.

Given a matrix $A$ and a vector $b$, the Krylov subspace of order $k$ is

$$ \mathcal{K}_k(A,b) = \operatorname{span} \{b, Ab, A^2b, \ldots, A^{k-1}b\}. $$

Arnoldi iteration constructs an orthonormal basis $q_1, \ldots, q_k$ for this subspace. In that basis, the projected action of $A$ is represented by an upper Hessenberg matrix:

$$ AQ_k = Q_{k+1}\overline{H}_k. $$

The Hessenberg matrix $\overline{H}_k$ is small compared with $A$. It captures how $A$ acts on the Krylov basis. This is the foundation for many iterative eigenvalue and linear system methods.
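A compact Arnoldi sketch (NumPy, classical Gram–Schmidt orthogonalization applied step by step, assuming no breakdown for the random data shown; the function name `arnoldi` is ours):

```python
import numpy as np

def arnoldi(A, b, k):
    """k Arnoldi steps: A @ Q[:, :k] = Q @ Hbar with Hbar (k+1) x k upper Hessenberg."""
    n = A.shape[0]
    Q = np.zeros((n, k + 1))
    Hbar = np.zeros((k + 1, k))
    Q[:, 0] = b / np.linalg.norm(b)
    for j in range(k):
        w = A @ Q[:, j]
        for i in range(j + 1):               # orthogonalize against earlier vectors
            Hbar[i, j] = Q[:, i] @ w
            w -= Hbar[i, j] * Q[:, i]
        Hbar[j + 1, j] = np.linalg.norm(w)   # assumes no breakdown (norm > 0)
        Q[:, j + 1] = w / Hbar[j + 1, j]
    return Q, Hbar

rng = np.random.default_rng(6)
A = rng.standard_normal((8, 8))
b = rng.standard_normal(8)
Q, Hbar = arnoldi(A, b, 4)
print(np.allclose(A @ Q[:, :4], Q @ Hbar))   # the Arnoldi relation holds: True
```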

82.17 Arnoldi Hessenberg Matrix

In Arnoldi iteration, each new vector is obtained by applying $A$ to the current basis vector and then orthogonalizing against previous basis vectors.

The recurrence has the form

$$ Aq_j = h_{1j}q_1 + h_{2j}q_2 + \cdots + h_{j+1,j}q_{j+1}. $$

Only basis vectors up to $q_{j+1}$ appear. This is why the coefficient matrix is upper Hessenberg.

The Hessenberg structure encodes this limit in the recurrence: the $j$-th image $Aq_j$ may create one new direction, but no more than one.

82.18 Comparison with Other Forms

| Form | Pattern | Main use |
| --- | --- | --- |
| Upper triangular | Zeros below diagonal | Schur form, triangular solves |
| Upper Hessenberg | Zeros below first subdiagonal | General eigenvalue computation |
| Tridiagonal | Zeros outside three diagonals | Symmetric eigenvalue computation |
| Bidiagonal | Nonzeros on main and one adjacent diagonal | Singular value computation |
| Diagonal | Nonzeros only on main diagonal | Fully decoupled action |

Hessenberg form is a compromise. It is more structured than a dense matrix but less restrictive than triangular form. That compromise makes it reachable by stable transformations and useful for iterative eigenvalue algorithms.

82.19 Common Pitfalls

A Hessenberg matrix should not be confused with a triangular matrix. The first subdiagonal is allowed to contain nonzero entries.

Hessenberg reduction should also not be confused with Gaussian elimination. Gaussian elimination uses row operations and changes the eigenvalues in general. Hessenberg reduction uses similarity transformations and preserves eigenvalues.

Finally, Hessenberg form does not usually reveal eigenvalues directly. It prepares the matrix for QR iteration. The eigenvalues are obtained after further reduction toward Schur form.
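The second pitfall is easy to see numerically. A one-sided row operation changes the spectrum; only the two-sided similarity preserves it (a made-up $2 \times 2$ example):

```python
import numpy as np

A = np.array([[2., 1.],
              [4., 3.]])

# Gaussian elimination step: subtract 2 * (row 1) from row 2 (one-sided).
E = np.array([[1., 0.],
              [-2., 1.]])

one_sided = np.linalg.eigvals(E @ A)                       # spectrum changes
similarity = np.linalg.eigvals(E @ A @ np.linalg.inv(E))   # spectrum preserved
original = np.linalg.eigvals(A)
```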

82.20 Summary

An upper Hessenberg matrix has zeros below the first subdiagonal:

$$ h_{ij}=0 \qquad \text{for} \qquad i>j+1. $$

Every square matrix can be reduced to upper Hessenberg form by orthogonal similarity transformations:

$$ H = Q^TAQ. $$

This reduction preserves eigenvalues and improves numerical behavior. Its main use is as a preprocessing step for the QR algorithm. Since QR iteration preserves Hessenberg structure, each iteration becomes much cheaper than it would be on a dense matrix.

For symmetric matrices, Hessenberg form becomes tridiagonal form. In Krylov methods, Hessenberg matrices arise naturally from Arnoldi iteration. Thus Hessenberg form connects direct eigenvalue algorithms, iterative methods, and the structural theory of matrix computation.