# Chapter 53. QR Factorization

QR factorization writes a matrix as a product

$$
A = QR,
$$

where \(Q\) has orthonormal columns and \(R\) is upper triangular. It is the matrix form of Gram-Schmidt orthogonalization. The columns of \(Q\) give an orthonormal basis for the column space of \(A\), while \(R\) records the coefficients needed to reconstruct the original columns. In standard terminology, the QR decomposition factors a matrix into a factor with orthonormal columns (orthogonal, when square) and an upper triangular factor.

QR factorization is important because it replaces a possibly awkward basis by an orthonormal basis. This improves geometric clarity and numerical stability. It is used for least squares, eigenvalue algorithms, rank estimation, and many stable matrix computations.

## 53.1 The Factorization

Let

$$
A\in\mathbb{R}^{m\times n},
\qquad
m\ge n.
$$

Assume the columns of \(A\) are linearly independent. A reduced QR factorization of \(A\) is

$$
A=QR,
$$

where

$$
Q\in\mathbb{R}^{m\times n},
\qquad
R\in\mathbb{R}^{n\times n}.
$$

The columns of \(Q\) are orthonormal:

$$
Q^TQ=I_n.
$$

The matrix \(R\) is upper triangular:

$$
R=
\begin{bmatrix}
r_{11} & r_{12} & \cdots & r_{1n}\\
0 & r_{22} & \cdots & r_{2n}\\
\vdots & \vdots & \ddots & \vdots\\
0 & 0 & \cdots & r_{nn}
\end{bmatrix}.
$$

When the diagonal entries of \(R\) are chosen positive, the reduced QR factorization is unique for a full-column-rank matrix.

## 53.2 Full and Reduced QR

There are two common forms.

The reduced QR factorization is

$$
A=Q_1R_1,
$$

where

$$
Q_1\in\mathbb{R}^{m\times n},
\qquad
R_1\in\mathbb{R}^{n\times n}.
$$

The columns of \(Q_1\) form an orthonormal basis for \(\operatorname{Col}(A)\).

The full QR factorization is

$$
A=QR,
$$

where

$$
Q\in\mathbb{R}^{m\times m}
$$

is orthogonal and

$$
R\in\mathbb{R}^{m\times n}
$$

has the block form

$$
R=
\begin{bmatrix}
R_1\\
0
\end{bmatrix}.
$$

Thus

$$
A =
\begin{bmatrix}
Q_1 & Q_2
\end{bmatrix}
\begin{bmatrix}
R_1\\
0
\end{bmatrix} =
Q_1R_1.
$$

The reduced factorization is usually enough for least squares and column-space computations. The full factorization is useful when a complete orthonormal basis of \(\mathbb{R}^m\) is needed.
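
As a concrete illustration, NumPy computes both forms through its `mode` argument; the sketch below assumes NumPy is available, and the test matrix is arbitrary.

```python
import numpy as np

A = np.array([[1.0, 1.0],
              [1.0, 0.0],
              [0.0, 1.0]])        # 3x2 with linearly independent columns

# Reduced QR: Q1 is 3x2 with orthonormal columns, R1 is 2x2 upper triangular.
Q1, R1 = np.linalg.qr(A, mode="reduced")

# Full QR: Q is 3x3 orthogonal, R is 3x2 with a zero bottom row.
Q, R = np.linalg.qr(A, mode="complete")

assert np.allclose(Q1 @ R1, A)
assert np.allclose(Q @ R, A)
assert np.allclose(Q1.T @ Q1, np.eye(2))
```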

## 53.3 Column Interpretation

Write the columns of \(A\) as

$$
A=
\begin{bmatrix}
| & | & & |\\
a_1 & a_2 & \cdots & a_n\\
| & | & & |
\end{bmatrix}.
$$

Write the columns of \(Q\) as

$$
Q=
\begin{bmatrix}
| & | & & |\\
q_1 & q_2 & \cdots & q_n\\
| & | & & |
\end{bmatrix}.
$$

The equation

$$
A=QR
$$

means that each column \(a_j\) is a linear combination of the first \(j\) columns of \(Q\):

$$
a_j =
r_{1j}q_1+r_{2j}q_2+\cdots+r_{jj}q_j.
$$

There are no terms involving \(q_{j+1},\ldots,q_n\) because \(R\) is upper triangular.

Thus \(R\) records how the original columns of \(A\) are expressed in the orthonormal basis \(q_1,\ldots,q_n\).

## 53.4 Relation to Gram-Schmidt

QR factorization is obtained by applying Gram-Schmidt to the columns of \(A\).

Let

$$
a_1,\ldots,a_n
$$

be the columns of \(A\). Gram-Schmidt produces orthonormal vectors

$$
q_1,\ldots,q_n.
$$

For each \(j\), the vector \(a_j\) lies in

$$
\operatorname{span}(q_1,\ldots,q_j).
$$

Therefore

$$
a_j =
r_{1j}q_1+\cdots+r_{jj}q_j.
$$

The coefficient

$$
r_{ij}
$$

is given by

$$
r_{ij}=q_i^Ta_j
$$

for \(i\le j\). The diagonal coefficient is

$$
r_{jj}=\|u_j\|_2,
$$

where \(u_j\) is the orthogonal residual produced by Gram-Schmidt at step \(j\).

Putting these column equations together gives

$$
A=QR.
$$

## 53.5 Computing \(R\) from \(Q\)

If

$$
A=QR
$$

and

$$
Q^TQ=I,
$$

then multiplying on the left by \(Q^T\) gives

$$
Q^TA=Q^TQR.
$$

Since

$$
Q^TQ=I,
$$

we obtain

$$
R=Q^TA.
$$

Thus, once \(Q\) is known, \(R\) is obtained by taking inner products between the columns of \(Q\) and the columns of \(A\).

Entrywise,

$$
r_{ij}=q_i^Ta_j.
$$

For a proper QR factorization, these entries vanish below the diagonal:

$$
r_{ij}=0
\qquad
\text{when } i>j.
$$

This expresses the same orthogonality created by Gram-Schmidt.
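
This relation is easy to verify numerically. The snippet below is a sketch assuming NumPy, with an arbitrary test matrix.

```python
import numpy as np

A = np.array([[1.0, 1.0],
              [1.0, 0.0]])

Q, _ = np.linalg.qr(A)       # compute Q by a library routine
R = Q.T @ A                  # recover R from inner products q_i^T a_j

# The strictly lower triangular part vanishes up to rounding error.
assert np.allclose(np.tril(R, k=-1), 0.0, atol=1e-14)
assert np.allclose(Q @ R, A)
```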

## 53.6 Example

Let

$$
A=
\begin{bmatrix}
1 & 1\\
1 & 0
\end{bmatrix}.
$$

The columns are

$$
a_1=
\begin{bmatrix}
1\\
1
\end{bmatrix},
\qquad
a_2=
\begin{bmatrix}
1\\
0
\end{bmatrix}.
$$

First,

$$
r_{11}=\|a_1\|_2=\sqrt{2},
$$

so

$$
q_1=
\frac{a_1}{r_{11}} =
\frac{1}{\sqrt{2}}
\begin{bmatrix}
1\\
1
\end{bmatrix}.
$$

Next,

$$
r_{12}=q_1^Ta_2 =
\frac{1}{\sqrt{2}}.
$$

Remove the component of \(a_2\) in the \(q_1\) direction:

$$
u_2=a_2-r_{12}q_1.
$$

Thus

$$
u_2=
\begin{bmatrix}
1\\
0
\end{bmatrix} -
\frac{1}{2}
\begin{bmatrix}
1\\
1
\end{bmatrix} =
\begin{bmatrix}
1/2\\
-1/2
\end{bmatrix}.
$$

Then

$$
r_{22}=\|u_2\|_2 =
\frac{1}{\sqrt{2}},
$$

and

$$
q_2=
\frac{u_2}{r_{22}} =
\frac{1}{\sqrt{2}}
\begin{bmatrix}
1\\
-1
\end{bmatrix}.
$$

Therefore

$$
Q=
\frac{1}{\sqrt{2}}
\begin{bmatrix}
1 & 1\\
1 & -1
\end{bmatrix},
$$

and

$$
R=
\begin{bmatrix}
\sqrt{2} & 1/\sqrt{2}\\
0 & 1/\sqrt{2}
\end{bmatrix}.
$$

Indeed,

$$
QR =
\frac{1}{\sqrt{2}}
\begin{bmatrix}
1 & 1\\
1 & -1
\end{bmatrix}
\begin{bmatrix}
\sqrt{2} & 1/\sqrt{2}\\
0 & 1/\sqrt{2}
\end{bmatrix} =
\begin{bmatrix}
1 & 1\\
1 & 0
\end{bmatrix}.
$$
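
The arithmetic above can be checked mechanically; this sketch assumes NumPy.

```python
import numpy as np

s = 1.0 / np.sqrt(2.0)
Q = np.array([[s,  s],
              [s, -s]])
R = np.array([[np.sqrt(2.0), s],
              [0.0,          s]])

A = np.array([[1.0, 1.0],
              [1.0, 0.0]])

assert np.allclose(Q @ R, A)            # QR reproduces A
assert np.allclose(Q.T @ Q, np.eye(2))  # columns of Q are orthonormal
```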

## 53.7 QR and Least Squares

QR factorization gives a stable way to solve least squares problems.

Consider

$$
\min_x \|Ax-b\|_2.
$$

Assume

$$
A=QR,
$$

where \(Q\) has orthonormal columns and \(R\) is invertible upper triangular.

Then

$$
Ax=QRx.
$$

The least squares solution satisfies

$$
R\hat{x}=Q^Tb.
$$

Thus

$$
\hat{x}=R^{-1}Q^Tb.
$$

In computation, one solves

$$
R\hat{x}=Q^Tb
$$

by back substitution. One does not form \(R^{-1}\) explicitly.

This method avoids solving the normal equations

$$
A^TA\hat{x}=A^Tb.
$$

That matters because forming \(A^TA\) squares the condition number, \(\kappa_2(A^TA)=\kappa_2(A)^2\), while the QR approach works with \(A\) directly and avoids this loss of accuracy.
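
The following sketch carries out this recipe, assuming NumPy and SciPy are available; the random problem data is purely illustrative.

```python
import numpy as np
from scipy.linalg import solve_triangular

rng = np.random.default_rng(0)
A = rng.standard_normal((100, 3))      # tall matrix, full column rank
b = rng.standard_normal(100)

Q, R = np.linalg.qr(A)                 # reduced QR
x = solve_triangular(R, Q.T @ b)       # back substitution on R x = Q^T b

# Agrees with the library least squares solver up to rounding error.
assert np.allclose(x, np.linalg.lstsq(A, b, rcond=None)[0])
```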

## 53.8 Projection Using QR

If \(A=QR\) and \(A\) has full column rank, then

$$
\operatorname{Col}(A)=\operatorname{Col}(Q).
$$

The orthogonal projection of \(b\) onto \(\operatorname{Col}(A)\) is therefore

$$
p=QQ^Tb.
$$

The residual is

$$
r=b-p=b-QQ^Tb.
$$

The least squares fitted vector is

$$
A\hat{x}=p.
$$

Because \(Q\) has orthonormal columns, this projection formula is simpler than

$$
p=A(A^TA)^{-1}A^Tb.
$$

QR factorization replaces the original column basis by an orthonormal basis, making projection direct.
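
A short sketch of the projection formula, assuming NumPy; the data is arbitrary.

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.standard_normal((50, 4))
b = rng.standard_normal(50)

Q, _ = np.linalg.qr(A)     # orthonormal basis for Col(A)
p = Q @ (Q.T @ b)          # orthogonal projection of b onto Col(A)
r = b - p                  # residual

# The residual is orthogonal to every column of Q, hence to Col(A).
assert np.allclose(Q.T @ r, 0.0, atol=1e-12)
```

Note the parentheses in `Q @ (Q.T @ b)`: applying \(Q^T\) to \(b\) first costs two matrix-vector products rather than forming the \(m\times m\) matrix \(QQ^T\).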

## 53.9 Solving Square Linear Systems

If \(A\in\mathbb{R}^{n\times n}\) is nonsingular and

$$
A=QR,
$$

then the system

$$
Ax=b
$$

becomes

$$
QRx=b.
$$

Multiply by \(Q^T\):

$$
Rx=Q^Tb.
$$

Since \(R\) is upper triangular, \(x\) is found by back substitution.

This gives an alternative to Gaussian elimination. Householder QR costs roughly twice as many floating point operations as LU factorization for square systems, but it uses only orthogonal transformations, which makes it attractive when stability is the main concern.

## 53.10 Classical Gram-Schmidt QR

Classical Gram-Schmidt computes \(Q\) and \(R\) by projecting each column of \(A\) onto all previously computed \(q_i\).

For \(j=1,\ldots,n\), set

$$
u_j=a_j.
$$

For \(i=1,\ldots,j-1\), compute

$$
r_{ij}=q_i^Ta_j.
$$

Then subtract

$$
u_j =
a_j-\sum_{i=1}^{j-1}r_{ij}q_i.
$$

Finally,

$$
r_{jj}=\|u_j\|_2,
\qquad
q_j=\frac{u_j}{r_{jj}}.
$$

This method is conceptually simple. In exact arithmetic it produces a valid QR factorization. In floating point arithmetic it can lose orthogonality when the columns of \(A\) are nearly linearly dependent.
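
A direct transcription of these formulas into code, as a teaching sketch assuming NumPy and a full-column-rank input:

```python
import numpy as np

def classical_gram_schmidt(A):
    """Classical Gram-Schmidt QR; assumes A has full column rank."""
    A = np.asarray(A, dtype=float)
    m, n = A.shape
    Q = np.zeros((m, n))
    R = np.zeros((n, n))
    for j in range(n):
        # All projection coefficients use the ORIGINAL column a_j.
        R[:j, j] = Q[:, :j].T @ A[:, j]
        u = A[:, j] - Q[:, :j] @ R[:j, j]
        R[j, j] = np.linalg.norm(u)
        Q[:, j] = u / R[j, j]
    return Q, R
```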

## 53.11 Modified Gram-Schmidt QR

Modified Gram-Schmidt improves numerical behavior by subtracting projections from the current residual rather than from the original column.

For each column \(a_j\), begin with

$$
u=a_j.
$$

For \(i=1,\ldots,j-1\), compute

$$
r_{ij}=q_i^Tu,
$$

then update

$$
u \leftarrow u-r_{ij}q_i.
$$

After all previous directions have been removed,

$$
r_{jj}=\|u\|_2,
\qquad
q_j=\frac{u}{r_{jj}}.
$$

Classical and modified Gram-Schmidt are equivalent in exact arithmetic. In finite precision arithmetic, modified Gram-Schmidt usually preserves orthogonality better.
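
The corresponding sketch for the modified version, under the same assumptions; the only change is that each coefficient is computed against the current residual `u`:

```python
import numpy as np

def modified_gram_schmidt(A):
    """Modified Gram-Schmidt QR; assumes A has full column rank."""
    A = np.asarray(A, dtype=float)
    m, n = A.shape
    Q = np.zeros((m, n))
    R = np.zeros((n, n))
    for j in range(n):
        u = A[:, j].copy()
        for i in range(j):
            R[i, j] = Q[:, i] @ u     # coefficient against the UPDATED residual
            u -= R[i, j] * Q[:, i]
        R[j, j] = np.linalg.norm(u)
        Q[:, j] = u / R[j, j]
    return Q, R
```

Running both functions on an ill-conditioned matrix (for example, `scipy.linalg.hilbert(10)`) and comparing \(\|Q^TQ-I\|\) makes the stability difference visible.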

## 53.12 Householder QR

Householder QR uses orthogonal reflections to zero out subdiagonal entries of a matrix.

A Householder reflector has the form

$$
H=I-2vv^T,
$$

where

$$
\|v\|_2=1.
$$

It is orthogonal:

$$
H^TH=I.
$$

It is also symmetric:

$$
H^T=H.
$$

By choosing \(v\) appropriately, a Householder reflector transforms a vector into a multiple of a coordinate vector. Applied successively to columns of \(A\), these reflectors produce an upper triangular matrix \(R\). The product of the reflectors gives the orthogonal factor \(Q\). Householder transformations are a standard stable method for computing QR factorization.
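
A compact sketch of this process, assuming NumPy and a full-column-rank input; the sign choice in the reflector avoids cancellation.

```python
import numpy as np

def householder_qr(A):
    """Householder QR returning full Q (m x m) and R (m x n); a sketch."""
    R = np.asarray(A, dtype=float).copy()
    m, n = R.shape
    Q = np.eye(m)
    for k in range(n):
        x = R[k:, k]
        v = x.copy()
        # Reflect x onto a multiple of e_1; copysign avoids cancellation.
        v[0] += np.copysign(np.linalg.norm(x), x[0])
        v /= np.linalg.norm(v)
        # Apply H = I - 2 v v^T to the trailing block, accumulate Q <- Q H.
        R[k:, k:] -= 2.0 * np.outer(v, v @ R[k:, k:])
        Q[:, k:] -= 2.0 * np.outer(Q[:, k:] @ v, v)
    return Q, R
```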

## 53.13 Givens QR

Givens rotations compute QR factorization by zeroing individual subdiagonal entries.

A Givens rotation acts on two coordinates at a time. In the plane of those two coordinates, it has the form

$$
\begin{bmatrix}
c & s\\
-s & c
\end{bmatrix},
$$

where

$$
c^2+s^2=1.
$$

By choosing \(c\) and \(s\) correctly, one can eliminate a chosen entry while preserving the Euclidean norm.

Givens rotations are especially useful for sparse matrices and for problems where entries are introduced or removed incrementally. They can zero selected entries without applying a dense transformation to the whole matrix. QR factorization can be computed by a sequence of Givens rotations.
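
A sketch of a Givens-based QR, assuming NumPy; subdiagonal entries are zeroed column by column, bottom to top.

```python
import numpy as np

def givens_qr(A):
    """QR via Givens rotations; a sketch for small dense matrices."""
    R = np.asarray(A, dtype=float).copy()
    m, n = R.shape
    Q = np.eye(m)
    for j in range(n):
        for i in range(m - 1, j, -1):
            a, b = R[i - 1, j], R[i, j]
            r = np.hypot(a, b)
            if r == 0.0:
                continue                        # entry already zero
            c, s = a / r, b / r
            G = np.array([[c, s], [-s, c]])     # rotation in coordinates i-1, i
            R[[i - 1, i], :] = G @ R[[i - 1, i], :]    # zeroes R[i, j]
            Q[:, [i - 1, i]] = Q[:, [i - 1, i]] @ G.T  # accumulate Q
    return Q, R
```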

## 53.14 Comparison of QR Methods

| Method | Main idea | Strength | Weakness |
|---|---|---|---|
| Classical Gram-Schmidt | Subtract projections from original columns | Simple theory | Can lose orthogonality |
| Modified Gram-Schmidt | Subtract projections from current residuals | Better numerical behavior | Still less stable than Householder |
| Householder reflections | Zero whole column tails by reflections | Stable dense QR | Less convenient for selective updates |
| Givens rotations | Zero one entry at a time | Good for sparse and incremental problems | More rotations for dense matrices |

All methods produce the same mathematical kind of factorization. Their differences matter in floating point arithmetic and large-scale computation.

## 53.15 Rank-Deficient Matrices

If the columns of \(A\) are linearly dependent, then the reduced QR factorization with invertible \(R\) does not exist.

Gram-Schmidt will produce a zero residual at some step:

$$
r_{jj}=0.
$$

This means the current column adds no new direction beyond the previous columns.

For rank-deficient matrices, one may use QR factorization with column pivoting:

$$
AP=QR,
$$

where \(P\) is a permutation matrix. Pivoting reorders the columns so that the more independent columns appear earlier.

This form is useful for detecting numerical rank and constructing rank-revealing factorizations. QR methods can be used for numerical rank estimation at lower cost than a full singular value decomposition in some settings.
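
SciPy exposes column-pivoted QR directly; the sketch below assumes SciPy and uses a deliberately rank-deficient example.

```python
import numpy as np
from scipy.linalg import qr

A = np.array([[1.0, 0.0, 1.0],
              [0.0, 1.0, 1.0],
              [0.0, 0.0, 0.0]])    # rank 2: third column = first + second

Q, R, piv = qr(A, pivoting=True)   # A[:, piv] = Q R
assert np.allclose(A[:, piv], Q @ R)

# The diagonal of R decreases in magnitude; tiny entries reveal the rank.
numerical_rank = int(np.sum(np.abs(np.diag(R)) > 1e-12))
print(numerical_rank)              # 2
```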

## 53.16 Positive Diagonal Convention

The signs of the columns of \(Q\) are not fixed by orthonormality alone.

If

$$
A=QR,
$$

then changing the sign of a column \(q_j\) and the sign of the corresponding row of \(R\) gives another valid factorization.

To remove this ambiguity, one often requires

$$
r_{jj}>0
$$

for all \(j\). Under full column rank, this convention gives uniqueness for the reduced QR factorization.

## 53.17 Complex QR Factorization

For a complex matrix

$$
A\in\mathbb{C}^{m\times n},
$$

QR factorization has the form

$$
A=QR,
$$

where \(Q\) has orthonormal columns in the complex inner product and \(R\) is upper triangular.

The orthonormality condition becomes

$$
Q^*Q=I,
$$

where \(Q^*\) is the conjugate transpose.

The coefficients are

$$
r_{ij}=q_i^*a_j.
$$

If \(Q\) is square, it is unitary:

$$
Q^*Q=QQ^*=I.
$$

Complex QR appears in spectral theory, signal processing, quantum mechanics, and numerical algorithms for complex matrices.
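
NumPy handles the complex case with the same routine; a quick sketch with arbitrary random data:

```python
import numpy as np

rng = np.random.default_rng(2)
A = rng.standard_normal((3, 3)) + 1j * rng.standard_normal((3, 3))

Q, R = np.linalg.qr(A)
assert np.allclose(Q.conj().T @ Q, np.eye(3))  # Q* Q = I
assert np.allclose(Q @ R, A)
```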

## 53.18 QR and Orthogonal Changes of Coordinates

The equation

$$
A=QR
$$

can be rewritten as

$$
Q^TA=R.
$$

Thus \(Q^T\) applies an orthogonal change of coordinates to the columns of \(A\), turning the matrix into upper triangular form.

Because \(Q\) is orthogonal, this change of coordinates preserves lengths and angles:

$$
\|Q^Tx\|_2=\|x\|_2.
$$

This explains why QR is so useful numerically: it transforms the problem while preserving Euclidean geometry.

## 53.19 QR Algorithm Preview

QR factorization is also the basis of the QR algorithm for eigenvalues.

Given a square matrix \(A_0\), one forms

$$
A_k=Q_kR_k,
$$

then reverses the factors:

$$
A_{k+1}=R_kQ_k.
$$

Under suitable hypotheses and with shifts, this iteration reveals eigenvalue information. The QR algorithm is one of the central methods for computing eigenvalues of dense matrices. QR decomposition is widely used both for least squares and as the basis of the QR eigenvalue algorithm.
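
A bare unshifted iteration illustrates the idea; this sketch assumes NumPy and a small symmetric example, and practical implementations add shifts and deflation.

```python
import numpy as np

A = np.array([[2.0, 1.0],
              [1.0, 3.0]])         # symmetric; eigenvalues (5 ± sqrt(5))/2

Ak = A.copy()
for _ in range(50):
    Qk, Rk = np.linalg.qr(Ak)
    Ak = Rk @ Qk                   # similar to A: A_{k+1} = Q_k^T A_k Q_k

print(np.diag(Ak))                 # approaches the eigenvalues of A
print(np.linalg.eigvalsh(A))       # reference values
```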

## 53.20 Summary

QR factorization writes a matrix as

$$
A=QR.
$$

The matrix \(Q\) has orthonormal columns:

$$
Q^TQ=I.
$$

The matrix \(R\) is upper triangular.

The columns of \(Q\) form an orthonormal basis for the column space of \(A\). The matrix \(R\) records the coordinates of the original columns of \(A\) in that basis.

QR factorization is the matrix form of Gram-Schmidt. It gives stable methods for least squares, projections, and linear systems. It can be computed by Gram-Schmidt, Householder reflections, or Givens rotations. Its central advantage is that it uses orthogonal transformations, which preserve Euclidean geometry and avoid the conditioning problems caused by forming \(A^TA\).
