# Chapter 78. QR Decomposition

QR decomposition is a factorization of a matrix into an orthogonal factor and an upper triangular factor. It is one of the central decompositions in numerical linear algebra, especially for least squares problems, orthogonalization, eigenvalue algorithms, and stable solution methods.

For a real matrix \(A\), a QR decomposition has the form

$$
A = QR,
$$

where \(Q\) has orthonormal columns and \(R\) is upper triangular. If \(A\) is a square \(n \times n\) matrix and has full rank, then \(Q\) is an orthogonal matrix and \(R\) is an \(n \times n\) upper triangular matrix. The factorization is commonly computed by Gram-Schmidt orthogonalization, Householder reflections, or Givens rotations.

## 78.1 Orthogonal Matrices

A real square matrix \(Q\) is orthogonal if

$$
Q^TQ = I.
$$

This condition means that the columns of \(Q\) form an orthonormal set. Each column has length \(1\), and distinct columns are perpendicular.

If

$$
Q =
\begin{bmatrix}
q_1 & q_2 & \cdots & q_n
\end{bmatrix},
$$

then orthogonality means

$$
q_i^T q_j =
\begin{cases}
1, & i=j, \\
0, & i \ne j.
\end{cases}
$$

An orthogonal matrix preserves lengths and angles. For any vector \(x\),

$$
\|Qx\| = \|x\|.
$$

This follows from

$$
\|Qx\|^2 =
(Qx)^T(Qx) =
x^TQ^TQx =
x^Tx =
\|x\|^2.
$$

This property makes orthogonal transformations numerically stable. They do not magnify errors merely by changing coordinates.
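A quick numerical check of norm preservation, sketched with NumPy (the seed and matrix size are arbitrary choices; an orthogonal \(Q\) is obtained by factoring a random matrix):

```python
import numpy as np

rng = np.random.default_rng(0)

# Obtain an orthogonal matrix as the Q factor of a random square matrix.
Q, _ = np.linalg.qr(rng.standard_normal((5, 5)))
x = rng.standard_normal(5)

# ||Qx|| matches ||x|| up to rounding error.
print(np.linalg.norm(Q @ x), np.linalg.norm(x))
print(np.allclose(Q.T @ Q, np.eye(5)))  # True: columns are orthonormal
```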

## 78.2 Upper Triangular Matrices

An upper triangular matrix has zero entries below the diagonal:

$$
R =
\begin{bmatrix}
r_{11} & r_{12} & r_{13} & \cdots & r_{1n} \\
0 & r_{22} & r_{23} & \cdots & r_{2n} \\
0 & 0 & r_{33} & \cdots & r_{3n} \\
\vdots & \vdots & \vdots & \ddots & \vdots \\
0 & 0 & 0 & \cdots & r_{nn}
\end{bmatrix}.
$$

Triangular systems are easy to solve by substitution. This is one reason QR decomposition is useful. It converts many problems involving \(A\) into problems involving \(R\), while \(Q\) handles an orthogonal change of coordinates.
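Back substitution is short enough to write out. The following teaching sketch solves \(Rx = b\) for upper triangular \(R\); in practice a library routine such as scipy.linalg.solve_triangular does this work:

```python
import numpy as np

def back_substitution(R, b):
    """Solve R x = b for upper triangular, nonsingular R, last row first."""
    n = len(b)
    x = np.zeros(n)
    for i in range(n - 1, -1, -1):
        # Remove the contribution of already-computed unknowns,
        # then divide by the diagonal entry.
        x[i] = (b[i] - R[i, i + 1:] @ x[i + 1:]) / R[i, i]
    return x

R = np.array([[2.0, 1.0],
              [0.0, 3.0]])
b = np.array([4.0, 6.0])
print(back_substitution(R, b))  # [1. 2.]
```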

## 78.3 Full QR Decomposition

Let \(A\) be an \(m \times n\) real matrix with \(m \ge n\). A full QR decomposition writes

$$
A = QR,
$$

where \(Q\) is an \(m \times m\) orthogonal matrix and \(R\) is an \(m \times n\) upper trapezoidal matrix.

The matrix \(R\) has the form

$$
R =
\begin{bmatrix}
R_1 \\
0
\end{bmatrix},
$$

where \(R_1\) is an \(n \times n\) upper triangular matrix.

Thus

$$
A =
Q
\begin{bmatrix}
R_1 \\
0
\end{bmatrix}.
$$

The full form is useful theoretically, but it stores more information than is often needed.

## 78.4 Reduced QR Decomposition

For many applications, the reduced QR decomposition is preferred.

If \(A\) is an \(m \times n\) matrix with \(m \ge n\) and linearly independent columns, then

$$
A = \widehat{Q}R,
$$

where \(\widehat{Q}\) is \(m \times n\) with orthonormal columns and \(R\) is \(n \times n\) upper triangular.

The matrix \(\widehat{Q}\) satisfies

$$
\widehat{Q}^T\widehat{Q} = I_n.
$$

However,

$$
\widehat{Q}\widehat{Q}^T
$$

is generally not the \(m \times m\) identity matrix. Instead, it is the orthogonal projection onto the column space of \(A\).

The reduced decomposition stores only the orthonormal basis needed for the column space of \(A\).
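NumPy computes both forms; mode='reduced' is the default and mode='complete' gives the full factorization. A short sketch using an arbitrary test matrix:

```python
import numpy as np

A = np.array([[1.0, 1.0],
              [1.0, 0.0],
              [0.0, 1.0]])                         # 3 x 2, full column rank

Q_hat, R = np.linalg.qr(A)                         # reduced: 3 x 2 and 2 x 2
Q_full, R_full = np.linalg.qr(A, mode='complete')  # full: 3 x 3 and 3 x 2

print(Q_hat.shape, R.shape)            # (3, 2) (2, 2)
print(Q_full.shape, R_full.shape)      # (3, 3) (3, 2)

# Q_hat^T Q_hat is the identity, but Q_hat Q_hat^T is only a projection.
print(np.allclose(Q_hat.T @ Q_hat, np.eye(2)))   # True
print(np.allclose(Q_hat @ Q_hat.T, np.eye(3)))   # False
```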

## 78.5 Geometric Meaning

The QR decomposition separates a matrix into two parts:

$$
A = QR.
$$

The factor \(Q\) gives an orthonormal coordinate system for the column space of \(A\). The factor \(R\) gives the coordinates of the columns of \(A\) in that orthonormal system.

Let

$$
A =
\begin{bmatrix}
a_1 & a_2 & \cdots & a_n
\end{bmatrix}
$$

and

$$
Q =
\begin{bmatrix}
q_1 & q_2 & \cdots & q_n
\end{bmatrix}.
$$

The columns \(q_1,\ldots,q_n\) form an orthonormal basis for the same column space as \(a_1,\ldots,a_n\). Each original column \(a_j\) can be written as a combination of the \(q_i\):

$$
a_j = r_{1j}q_1 + r_{2j}q_2 + \cdots + r_{jj}q_j.
$$

Only the first \(j\) vectors appear because \(R\) is upper triangular.

## 78.6 QR from Gram-Schmidt

QR decomposition is closely related to Gram-Schmidt orthogonalization.

Let the columns of \(A\) be

$$
a_1, a_2, \ldots, a_n.
$$

The Gram-Schmidt process constructs orthonormal vectors

$$
q_1, q_2, \ldots, q_n
$$

from these columns.

First,

$$
r_{11} = \|a_1\|,
\qquad
q_1 = \frac{a_1}{r_{11}}.
$$

For the second column, subtract its projection onto \(q_1\):

$$
r_{12} = q_1^Ta_2,
$$

$$
v_2 = a_2 - r_{12}q_1.
$$

Then

$$
r_{22} = \|v_2\|,
\qquad
q_2 = \frac{v_2}{r_{22}}.
$$

In general, for column \(a_j\),

$$
r_{ij} = q_i^Ta_j,
\qquad
1 \le i < j,
$$

$$
v_j = a_j - \sum_{i=1}^{j-1} r_{ij}q_i,
$$

$$
r_{jj} = \|v_j\|,
\qquad
q_j = \frac{v_j}{r_{jj}}.
$$

These formulas produce

$$
A = QR.
$$

The entries of \(R\) are the projection coefficients and normalizing constants from the orthogonalization process.

## 78.7 A Two Column Example

Let

$$
A =
\begin{bmatrix}
1 & 1 \\
1 & 0 \\
0 & 1
\end{bmatrix}.
$$

Its columns are

$$
a_1 =
\begin{bmatrix}
1 \\
1 \\
0
\end{bmatrix},
\qquad
a_2 =
\begin{bmatrix}
1 \\
0 \\
1
\end{bmatrix}.
$$

First compute

$$
r_{11} = \|a_1\| = \sqrt{2}.
$$

Thus

$$
q_1 =
\frac{1}{\sqrt{2}}
\begin{bmatrix}
1 \\
1 \\
0
\end{bmatrix}.
$$

Next,

$$
r_{12} = q_1^Ta_2 =
\frac{1}{\sqrt{2}}.
$$

Subtract the projection:

$$
v_2 = a_2 - r_{12}q_1.
$$

Since

$$
r_{12}q_1 =
\frac{1}{\sqrt{2}}
\cdot
\frac{1}{\sqrt{2}}
\begin{bmatrix}
1 \\
1 \\
0
\end{bmatrix} =
\frac{1}{2}
\begin{bmatrix}
1 \\
1 \\
0
\end{bmatrix},
$$

we get

$$
v_2 =
\begin{bmatrix}
1 \\
0 \\
1
\end{bmatrix} -
\begin{bmatrix}
1/2 \\
1/2 \\
0
\end{bmatrix} =
\begin{bmatrix}
1/2 \\
-1/2 \\
1
\end{bmatrix}.
$$

Now

$$
r_{22} =
\|v_2\| =
\sqrt{
\frac{1}{4}
+
\frac{1}{4}
+
1
} =
\sqrt{\frac{3}{2}}.
$$

Therefore

$$
q_2 =
\frac{v_2}{r_{22}} =
\sqrt{\frac{2}{3}}
\begin{bmatrix}
1/2 \\
-1/2 \\
1
\end{bmatrix}.
$$

Thus the reduced QR decomposition is

$$
A = QR,
$$

where

$$
Q =
\begin{bmatrix}
1/\sqrt{2} & 1/\sqrt{6} \\
1/\sqrt{2} & -1/\sqrt{6} \\
0 & 2/\sqrt{6}
\end{bmatrix}
$$

and

$$
R =
\begin{bmatrix}
\sqrt{2} & 1/\sqrt{2} \\
0 & \sqrt{3/2}
\end{bmatrix}.
$$

The columns of \(Q\) are orthonormal, and \(R\) is upper triangular.
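The hand computation can be verified numerically. This check builds \(Q\) and \(R\) exactly as derived above and confirms both defining properties:

```python
import numpy as np

s2, s6 = np.sqrt(2.0), np.sqrt(6.0)

A = np.array([[1.0, 1.0],
              [1.0, 0.0],
              [0.0, 1.0]])
Q = np.array([[1/s2,  1/s6],
              [1/s2, -1/s6],
              [0.0,   2/s6]])
R = np.array([[s2, 1/s2],
              [0.0, np.sqrt(3.0/2.0)]])

print(np.allclose(Q @ R, A))            # True: factorization reproduces A
print(np.allclose(Q.T @ Q, np.eye(2)))  # True: orthonormal columns
```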

## 78.8 Solving Square Linear Systems

Suppose \(A\) is a square invertible matrix and

$$
A = QR.
$$

To solve

$$
Ax = b,
$$

substitute the factorization:

$$
QRx = b.
$$

Multiply both sides by \(Q^T\):

$$
Rx = Q^Tb.
$$

Since \(R\) is upper triangular, the system

$$
Rx = Q^Tb
$$

can be solved by backward substitution.

This method is usually more expensive than LU for a general square dense system, but it has strong stability properties because \(Q\) is orthogonal.
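A sketch of the QR solve in code, using SciPy's triangular solver for the backward substitution step (the matrix and right-hand side are arbitrary test data):

```python
import numpy as np
from scipy.linalg import solve_triangular

A = np.array([[4.0, 3.0],
              [6.0, 3.0]])
b = np.array([10.0, 12.0])

Q, R = np.linalg.qr(A)
x = solve_triangular(R, Q.T @ b)  # backward substitution on R x = Q^T b

print(np.allclose(A @ x, b))      # True
```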

## 78.9 Least Squares Problems

QR decomposition is one of the standard methods for solving least squares problems.

Suppose \(A\) is an \(m \times n\) matrix with \(m \ge n\), and suppose the system

$$
Ax \approx b
$$

has more equations than unknowns. The least squares problem asks for \(x\) minimizing

$$
\|Ax-b\|_2.
$$

Let

$$
A = QR
$$

be a full QR decomposition, with

$$
R =
\begin{bmatrix}
R_1 \\
0
\end{bmatrix}
$$

as in Section 78.3. Since multiplication by the orthogonal matrix \(Q^T\) preserves length,

$$
\|Ax-b\|_2 =
\|Q^T(Ax-b)\|_2 =
\|Rx-Q^Tb\|_2.
$$

Partition \(Q^Tb\) into its first \(n\) entries \(c\) and its remaining \(m-n\) entries \(d\). Then

$$
\|Rx-Q^Tb\|_2^2 =
\|R_1x-c\|_2^2 + \|d\|_2^2.
$$

The term \(\|d\|_2^2\) does not depend on \(x\), so the least squares solution satisfies the triangular system

$$
R_1x = c.
$$

In terms of the reduced factorization \(A = \widehat{Q}R_1\), the right-hand side is \(c = \widehat{Q}^Tb\), so the solution can be computed entirely from the reduced decomposition.

This gives a stable method that avoids forming the normal equations

$$
A^TAx = A^Tb.
$$

Avoiding \(A^TA\) is important because forming it can worsen conditioning.
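A minimal least squares sketch on synthetic data, comparing the QR route against NumPy's library solver:

```python
import numpy as np
from scipy.linalg import solve_triangular

rng = np.random.default_rng(0)
A = rng.standard_normal((50, 3))   # 50 equations, 3 unknowns
b = rng.standard_normal(50)

# QR route: solve the triangular system R x = Q^T b with reduced factors.
Q, R = np.linalg.qr(A)
x_qr = solve_triangular(R, Q.T @ b)

# Reference solution from the library least squares solver.
x_ref, *_ = np.linalg.lstsq(A, b, rcond=None)
print(np.allclose(x_qr, x_ref))    # True
```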

## 78.10 Projection Interpretation

The least squares problem seeks the vector in the column space of \(A\) closest to \(b\).

If

$$
A = \widehat{Q}R
$$

is the reduced QR decomposition, then the columns of \(\widehat{Q}\) form an orthonormal basis for the column space of \(A\).

The orthogonal projection of \(b\) onto this column space is

$$
\widehat{Q}\widehat{Q}^Tb.
$$

The least squares fitted vector is therefore

$$
Ax = \widehat{Q}\widehat{Q}^Tb.
$$

Since

$$
Ax = \widehat{Q}Rx,
$$

we obtain

$$
\widehat{Q}Rx =
\widehat{Q}\widehat{Q}^Tb.
$$

Multiplying by \(\widehat{Q}^T\) gives

$$
Rx = \widehat{Q}^Tb.
$$

This is the same triangular system obtained by the norm-preservation argument.
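The identity can be observed numerically: the fitted vector \(Ax\) coincides with the projection \(\widehat{Q}\widehat{Q}^Tb\). A short check on arbitrary data:

```python
import numpy as np
from scipy.linalg import solve_triangular

rng = np.random.default_rng(1)
A = rng.standard_normal((20, 4))
b = rng.standard_normal(20)

Q, R = np.linalg.qr(A)             # reduced factors
x = solve_triangular(R, Q.T @ b)   # least squares solution

# Fitted vector equals the orthogonal projection of b onto range(A).
print(np.allclose(A @ x, Q @ (Q.T @ b)))  # True
```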

## 78.11 Classical and Modified Gram-Schmidt

There are two common Gram-Schmidt algorithms.

Classical Gram-Schmidt computes all projection coefficients against the original vector and then subtracts the combined projection.

Modified Gram-Schmidt subtracts projections one at a time, updating the vector after each subtraction.

In exact arithmetic, the two methods produce identical results. In floating point arithmetic, modified Gram-Schmidt is usually more accurate because it reduces the loss of orthogonality caused by rounding error; the sketch after the table below demonstrates the difference.

| Method | Main idea | Numerical behavior |
|---|---|---|
| Classical Gram-Schmidt | Subtract combined projection | Less stable |
| Modified Gram-Schmidt | Subtract projections sequentially | More stable |
| Householder QR | Use reflections | More stable for dense matrices |
| Givens QR | Use plane rotations | Useful for sparse or structured updates |
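The stability difference is easy to demonstrate on a matrix with nearly dependent columns. The sketch below implements both variants and measures how far \(Q^TQ\) drifts from the identity; the Läuchli-type test matrix and the value of \(\varepsilon\) are arbitrary choices:

```python
import numpy as np

def classical_gs(A):
    """Classical Gram-Schmidt: all coefficients taken against the original column."""
    m, n = A.shape
    Q = np.zeros((m, n))
    for j in range(n):
        # Subtract the combined projection onto the previous q's.
        v = A[:, j] - Q[:, :j] @ (Q[:, :j].T @ A[:, j])
        Q[:, j] = v / np.linalg.norm(v)
    return Q

def modified_gs(A):
    """Modified Gram-Schmidt: projections subtracted one at a time."""
    m, n = A.shape
    Q = np.zeros((m, n))
    for j in range(n):
        v = A[:, j].copy()
        for i in range(j):
            v -= (Q[:, i] @ v) * Q[:, i]   # coefficient uses the updated v
        Q[:, j] = v / np.linalg.norm(v)
    return Q

# Columns are nearly linearly dependent, which stresses orthogonalization.
eps = 1e-8
A = np.vstack([np.ones((1, 3)), eps * np.eye(3)])

for name, gs in [("classical", classical_gs), ("modified", modified_gs)]:
    Q = gs(A)
    err = np.linalg.norm(Q.T @ Q - np.eye(3))
    print(f"{name:9s} loss of orthogonality: {err:.2e}")
```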

## 78.12 Householder Reflections

A Householder reflection is an orthogonal transformation that reflects vectors across a hyperplane.

It has the form

$$
H = I - 2vv^T,
$$

where

$$
\|v\| = 1.
$$

The matrix \(H\) is symmetric and orthogonal:

$$
H^T = H,
\qquad
H^TH = I.
$$

Householder QR uses a sequence of reflections to introduce zeros below the diagonal. At each step, a reflection is chosen to map part of a column to a multiple of a coordinate vector.

After applying these reflections, the matrix becomes upper triangular:

$$
H_k \cdots H_2H_1A = R.
$$

Since each \(H_i\) is orthogonal,

$$
A = H_1H_2\cdots H_kR.
$$

Thus

$$
Q = H_1H_2\cdots H_k.
$$

Householder QR is a standard dense QR method because it is stable and efficient.
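A compact Householder QR sketch (a teaching version; production codes use the LAPACK routines behind numpy.linalg.qr). The reflector's sign is chosen to avoid cancellation when shifting the leading entry:

```python
import numpy as np

def householder_qr(A):
    """Return Q (m x m) and R (m x n) with A = Q R via Householder reflections."""
    m, n = A.shape
    R = A.astype(float)
    Q = np.eye(m)
    for k in range(min(m, n)):
        x = R[k:, k]
        # Unit vector v defining H = I - 2 v v^T that maps x to -sign(x_1)||x|| e_1.
        v = x.copy()
        v[0] += np.copysign(np.linalg.norm(x), x[0])
        v /= np.linalg.norm(v)
        # Apply H to the trailing block of R and accumulate Q = H_1 H_2 ... H_k.
        R[k:, k:] -= 2.0 * np.outer(v, v @ R[k:, k:])
        Q[:, k:] -= 2.0 * np.outer(Q[:, k:] @ v, v)
    return Q, R

A = np.array([[1.0, 1.0],
              [1.0, 0.0],
              [0.0, 1.0]])
Q, R = householder_qr(A)
print(np.allclose(Q @ R, A), np.allclose(np.tril(R, -1), 0))  # True True
```

Note that \(H\) is never formed explicitly: each reflection is applied as a rank-one update, which is what makes the method efficient.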

## 78.13 Givens Rotations

A Givens rotation is an orthogonal transformation that acts on only two coordinates at a time.

In two dimensions, it has the form

$$
G =
\begin{bmatrix}
c & s \\
-s & c
\end{bmatrix},
$$

where

$$
c^2+s^2=1.
$$

A Givens rotation can be chosen to zero one selected entry of a vector. Applied repeatedly, these rotations can transform a matrix into upper triangular form.

Givens rotations are especially useful when the matrix is sparse. Since each rotation affects only two rows, the method can preserve more structure than a dense Householder transformation.

They are also useful for updating QR decompositions when rows or columns are added or changed.
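A sketch of one Givens rotation, chosen to zero the second component of a vector (the formula for \(c\) and \(s\) is the naive one; library versions add guards for the case \(a = b = 0\)):

```python
import numpy as np

def givens(a, b):
    """Return c, s with [[c, s], [-s, c]] @ [a, b] = [r, 0]."""
    r = np.hypot(a, b)        # sqrt(a^2 + b^2) without intermediate overflow
    return a / r, b / r

a, b = 3.0, 4.0
c, s = givens(a, b)
G = np.array([[c, s],
              [-s, c]])
print(G @ np.array([a, b]))   # [5. 0.]
```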

## 78.14 Uniqueness

QR decomposition is not completely unique unless a sign convention is imposed.

If

$$
A = QR,
$$

then one may multiply a column of \(Q\) by \(-1\) and the corresponding row of \(R\) by \(-1\). The product remains unchanged.

To make the reduced QR decomposition unique for a full column rank matrix, one usually requires the diagonal entries of \(R\) to be positive:

$$
r_{ii} > 0.
$$

Under this convention, the reduced QR decomposition is unique.
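The convention can be enforced after the fact: flip the sign of each column of \(Q\) whose diagonal entry of \(R\) is negative, and flip the matching row of \(R\). A short sketch (full column rank is assumed, so no diagonal entry is zero):

```python
import numpy as np

A = np.array([[1.0, 1.0],
              [1.0, 0.0],
              [0.0, 1.0]])
Q, R = np.linalg.qr(A)

d = np.sign(np.diag(R))        # +1 or -1 for each diagonal entry
Q, R = Q * d, d[:, None] * R   # (Q D)(D R) = Q R since D^2 = I

print(np.allclose(Q @ R, A), np.all(np.diag(R) > 0))  # True True
```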

## 78.15 Rank Deficiency

If the columns of \(A\) are linearly dependent, then some diagonal entry of \(R\) becomes zero in exact arithmetic.

For a full column rank matrix, \(R\) is nonsingular. For a rank deficient matrix, \(R\) is singular.

Rank deficiency creates difficulties in solving least squares problems because the minimizer may fail to be unique. In such cases, pivoted QR decomposition is often used.

A column-pivoted QR decomposition has the form

$$
AP = QR,
$$

where \(P\) is a permutation matrix that reorders the columns of \(A\). The pivoting helps reveal numerical rank.
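SciPy implements column pivoting; with pivoting=True, scipy.linalg.qr returns the permutation as an index array P with A[:, P] = QR. A sketch on a rank deficient matrix:

```python
import numpy as np
from scipy.linalg import qr

# Rank 2: the third column is the sum of the first two.
A = np.array([[1.0, 0.0, 1.0],
              [0.0, 1.0, 1.0],
              [1.0, 1.0, 2.0]])

Q, R, P = qr(A, pivoting=True)

print(np.allclose(A[:, P], Q @ R))  # True
print(np.abs(np.diag(R)))           # trailing entry near zero reveals the rank
```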

## 78.16 Determinants from QR

For a square matrix

$$
A = QR,
$$

the determinant satisfies

$$
\det(A) = \det(Q)\det(R).
$$

For an orthogonal matrix,

$$
\det(Q)=1
\quad \text{or} \quad
\det(Q)=-1.
$$

Since \(R\) is triangular,

$$
\det(R)=\prod_{i=1}^{n} r_{ii}.
$$

Therefore

$$
\det(A) =
\det(Q)
\prod_{i=1}^{n} r_{ii}.
$$

If only the absolute value is needed,

$$
|\det(A)| =
\prod_{i=1}^{n} |r_{ii}|.
$$

For large matrices, one often computes

$$
\log |\det(A)| =
\sum_{i=1}^{n} \log |r_{ii}|.
$$
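A sketch of the log-determinant route, checked against NumPy's slogdet:

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((6, 6))

Q, R = np.linalg.qr(A)
log_abs_det = np.sum(np.log(np.abs(np.diag(R))))

sign, ref = np.linalg.slogdet(A)     # library reference value
print(np.isclose(log_abs_det, ref))  # True
```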

## 78.17 QR Algorithm for Eigenvalues

QR decomposition also appears in the QR algorithm for computing eigenvalues.

The basic iteration starts with

$$
A_0 = A.
$$

At step \(k\), compute

$$
A_k = Q_kR_k.
$$

Then form

$$
A_{k+1} = R_kQ_k.
$$

Since

$$
A_{k+1} =
R_kQ_k =
Q_k^TA_kQ_k,
$$

the matrices \(A_k\) are orthogonally similar. Therefore they have the same eigenvalues.

Under suitable conditions and with practical enhancements, the sequence tends toward a form from which eigenvalues can be read. This is one of the major uses of QR decomposition in numerical linear algebra.
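A bare-bones, unshifted iteration on a symmetric test matrix, a sketch only (practical implementations first reduce to Hessenberg form and add shifts and deflation; the iteration count below is arbitrary):

```python
import numpy as np

rng = np.random.default_rng(0)
S = rng.standard_normal((4, 4))
A = S + S.T                    # symmetric, so the eigenvalues are real

Ak = A.copy()
for _ in range(200):
    Q, R = np.linalg.qr(Ak)
    Ak = R @ Q                 # orthogonally similar to A at every step

# The diagonal approaches the eigenvalues (in some order).
print(np.sort(np.diag(Ak)))
print(np.sort(np.linalg.eigvalsh(A)))  # reference eigenvalues
```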

## 78.18 Algorithm: Modified Gram-Schmidt QR

The following NumPy implementation computes a reduced QR decomposition using modified Gram-Schmidt.

```python
import numpy as np

def modified_gram_schmidt_qr(A):
    """Reduced QR decomposition of A (m x n, m >= n) by modified Gram-Schmidt."""
    m, n = A.shape
    Q = np.zeros((m, n))
    R = np.zeros((n, n))

    for j in range(n):
        v = A[:, j].astype(float)   # work on a copy of column j

        # Subtract projections one at a time, using the updated v each step.
        for i in range(j):
            R[i, j] = Q[:, i] @ v
            v -= R[i, j] * Q[:, i]

        R[j, j] = np.linalg.norm(v)
        if R[j, j] == 0:
            raise ValueError("columns of A are linearly dependent")

        Q[:, j] = v / R[j, j]

    return Q, R
```

This algorithm is easy to understand and useful for teaching. For high-quality dense numerical software, Householder QR is often preferred.
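A quick check of the function on a random test matrix (the sizes are arbitrary; modified_gram_schmidt_qr is the function defined above):

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((5, 3))

Q, R = modified_gram_schmidt_qr(A)
print(np.allclose(Q @ R, A))             # factorization reproduces A
print(np.allclose(Q.T @ Q, np.eye(3)))   # orthonormal columns
print(np.allclose(np.tril(R, -1), 0))    # R is upper triangular
```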

## 78.19 Storage in Practice

QR decomposition is often stored implicitly.

A dense implementation using Householder reflections may store the upper triangular part of \(R\) in the upper triangular part of the original matrix. The Householder vectors are stored below the diagonal. The matrix \(Q\) is not always formed explicitly.

| Object | Practical storage |
|---|---|
| \(R\) | Upper triangular part of the array |
| \(Q\) | Product of stored Householder reflectors |
| Householder vectors | Lower triangular part of the array |
| Reduced \(Q\) | Formed only when needed |

This storage model saves memory and improves performance. Many algorithms only need to multiply by \(Q\) or \(Q^T\), so forming \(Q\) explicitly would be unnecessary work.
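NumPy exposes this compact layout through mode='raw', which returns the LAPACK-style array plus the reflector scaling factors tau. The layout conventions below (Fortran-order transposition, \(R\) on and above the diagonal) follow LAPACK, so treat this as an illustrative sketch:

```python
import numpy as np

A = np.array([[1.0, 1.0],
              [1.0, 0.0],
              [0.0, 1.0]])
m, n = A.shape

h, tau = np.linalg.qr(A, mode='raw')

# h is returned transposed (Fortran order). In h.T, the entries on and
# above the diagonal hold R; the entries below it encode the Householder
# vectors, which together with tau represent Q implicitly.
R = np.triu(h.T[:n])
print(np.allclose(np.abs(R), np.abs(np.linalg.qr(A)[1])))  # True
```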

## 78.20 Comparison with LU and Cholesky

| Decomposition | Form | Applies to | Main advantage |
|---|---|---|---|
| LU | \(PA = LU\) | General square matrices | Fast direct solves |
| Cholesky | \(A = LL^T\) | Symmetric positive definite matrices | Faster SPD solves |
| QR | \(A = QR\) | Square or rectangular matrices | Orthogonality and least squares |
| SVD | \(A = U\Sigma V^T\) | General matrices | Rank and conditioning information |

QR is usually more expensive than LU for a square dense solve, but it is more stable in important settings and naturally handles rectangular matrices. It is the standard decomposition behind many reliable least squares methods.

## 78.21 Summary

QR decomposition factors a matrix as

$$
A = QR.
$$

The factor \(Q\) has orthonormal columns. The factor \(R\) is upper triangular. Geometrically, \(Q\) gives an orthonormal basis for the column space of \(A\), while \(R\) gives the coordinates of the original columns in that basis.

QR decomposition is used to solve least squares problems, orthogonalize vectors, solve square systems, update factorizations, estimate rank, and compute eigenvalues. Its strength comes from orthogonality. Since orthogonal transformations preserve lengths and angles, QR-based algorithms tend to have good numerical behavior.

The main computational forms are Gram-Schmidt, Householder reflections, and Givens rotations. Gram-Schmidt explains the idea. Householder reflections are standard for dense matrices. Givens rotations are useful when sparsity or incremental updates matter.
