
Chapter 53. QR Factorization

QR factorization writes a matrix as a product

A = QR,

where Q has orthonormal columns and R is upper triangular. It is the matrix form of Gram-Schmidt orthogonalization. The columns of Q give an orthonormal basis for the column space of A, while R records the coefficients needed to reconstruct the original columns. Standard references define QR decomposition as a factorization of a matrix into an orthogonal or orthonormal factor and an upper triangular factor.

QR factorization is important because it replaces a possibly awkward basis by an orthonormal basis. This improves geometric clarity and numerical stability. It is used for least squares, eigenvalue algorithms, rank estimation, and many stable matrix computations.

53.1 The Factorization

Let

A\in\mathbb{R}^{m\times n}, \qquad m\ge n.

Assume the columns of A are linearly independent. A reduced QR factorization of A is

A=QR,

where

Q\in\mathbb{R}^{m\times n}, \qquad R\in\mathbb{R}^{n\times n}.

The columns of Q are orthonormal:

Q^TQ=I_n.

The matrix R is upper triangular:

R=
\begin{bmatrix}
r_{11} & r_{12} & \cdots & r_{1n}\\
0 & r_{22} & \cdots & r_{2n}\\
\vdots & \vdots & \ddots & \vdots\\
0 & 0 & \cdots & r_{nn}
\end{bmatrix}.

When the diagonal entries of R are chosen positive, the reduced QR factorization is unique for a full-column-rank matrix.
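These defining properties are easy to check numerically. A minimal sketch using NumPy's built-in factorization (the example matrix is arbitrary):

```python
import numpy as np

# Reduced QR of a full-column-rank 3x2 matrix.
A = np.array([[1.0, 2.0],
              [3.0, 4.0],
              [5.0, 6.0]])
Q, R = np.linalg.qr(A)   # reduced form is the default

print(Q.shape, R.shape)                   # (3, 2) (2, 2)
print(np.allclose(Q.T @ Q, np.eye(2)))    # orthonormal columns: Q^T Q = I
print(np.allclose(Q @ R, A))              # reconstruction: A = QR
print(np.allclose(R, np.triu(R)))         # R is upper triangular
```

Note that library routines do not enforce the positive-diagonal convention, so the signs of corresponding columns of Q and rows of R may differ from a hand computation.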

53.2 Full and Reduced QR

There are two common forms.

The reduced QR factorization is

A=Q_1R_1,

where

Q_1\in\mathbb{R}^{m\times n}, \qquad R_1\in\mathbb{R}^{n\times n}.

The columns of Q_1 form an orthonormal basis for \operatorname{Col}(A).

The full QR factorization is

A=QR,

where

Q\in\mathbb{R}^{m\times m}

is orthogonal and

R\in\mathbb{R}^{m\times n}

has the block form

R=
\begin{bmatrix}
R_1\\
0
\end{bmatrix}.

Thus

A =
\begin{bmatrix}
Q_1 & Q_2
\end{bmatrix}
\begin{bmatrix}
R_1\\
0
\end{bmatrix}
= Q_1R_1.

The reduced factorization is usually enough for least squares and column-space computations. The full factorization is useful when a complete orthonormal basis of \mathbb{R}^m is needed.
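NumPy exposes both forms through the `mode` argument of `numpy.linalg.qr`. A sketch with an arbitrary 4x2 matrix:

```python
import numpy as np

A = np.random.default_rng(0).standard_normal((4, 2))

Q1, R1 = np.linalg.qr(A, mode='reduced')   # Q1: 4x2, R1: 2x2
Q, R = np.linalg.qr(A, mode='complete')    # Q: 4x4 orthogonal, R: 4x2

print(Q1.shape, R1.shape)         # (4, 2) (2, 2)
print(Q.shape, R.shape)           # (4, 4) (4, 2)
print(np.allclose(Q1 @ R1, A))    # reduced reconstruction
print(np.allclose(Q @ R, A))      # full reconstruction
# The bottom block of the full R is zero, matching R = [R1; 0].
print(np.allclose(R[2:], 0))
```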

53.3 Column Interpretation

Write the columns of A as

A=
\begin{bmatrix}
| & | & & |\\
a_1 & a_2 & \cdots & a_n\\
| & | & & |
\end{bmatrix}.

Write the columns of Q as

Q=
\begin{bmatrix}
| & | & & |\\
q_1 & q_2 & \cdots & q_n\\
| & | & & |
\end{bmatrix}.

The equation

A=QR

means that each column a_j is a linear combination of the first j columns of Q:

a_j = r_{1j}q_1+r_{2j}q_2+\cdots+r_{jj}q_j.

There are no terms involving q_{j+1},\ldots,q_n because R is upper triangular.

Thus R records how the original columns of A are expressed in the orthonormal basis q_1,\ldots,q_n.

53.4 Relation to Gram-Schmidt

QR factorization is obtained by applying Gram-Schmidt to the columns of A.

Let

a_1,\ldots,a_n

be the columns of A. Gram-Schmidt produces orthonormal vectors

q_1,\ldots,q_n.

For each j, the vector a_j lies in

\operatorname{span}(q_1,\ldots,q_j).

Therefore

a_j = r_{1j}q_1+\cdots+r_{jj}q_j.

The coefficient

r_{ij}

is given by

r_{ij}=q_i^Ta_j

for i\le j. The diagonal coefficient is

r_{jj}=\|u_j\|_2,

where u_j is the orthogonal residual produced by Gram-Schmidt at step j.

Putting these column equations together gives

A=QR.

53.5 Computing R from Q

If

A=QR

and

Q^TQ=I,

then multiplying on the left by Q^T gives

Q^TA=Q^TQR.

Since

Q^TQ=I,

we obtain

R=Q^TA.

Thus, once Q is known, R is obtained by taking inner products between the columns of Q and the columns of A.

Entrywise,

r_{ij}=q_i^Ta_j.

For a proper QR factorization, these entries vanish below the diagonal:

r_{ij}=0 \qquad \text{when } i>j.

This expresses the same orthogonality created by Gram-Schmidt.
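The identity R = Q^T A is easy to verify numerically; a sketch with an arbitrary full-rank matrix:

```python
import numpy as np

A = np.random.default_rng(1).standard_normal((5, 3))
Q, R = np.linalg.qr(A)

# Recover R from Q by taking inner products with the columns of A.
R_from_Q = Q.T @ A

print(np.allclose(R_from_Q, R))               # matches the factor R
print(np.allclose(np.tril(R_from_Q, -1), 0))  # vanishes below the diagonal
```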

53.6 Example

Let

A=
\begin{bmatrix}
1 & 1\\
1 & 0
\end{bmatrix}.

The columns are

a_1=
\begin{bmatrix}
1\\
1
\end{bmatrix},
\qquad
a_2=
\begin{bmatrix}
1\\
0
\end{bmatrix}.

First,

r_{11}=\|a_1\|_2=\sqrt{2},

so

q_1=\frac{a_1}{r_{11}}=\frac{1}{\sqrt{2}}
\begin{bmatrix}
1\\
1
\end{bmatrix}.

Next,

r_{12}=q_1^Ta_2=\frac{1}{\sqrt{2}}.

Remove the component of a_2 in the q_1 direction:

u_2=a_2-r_{12}q_1.

Thus

u_2=
\begin{bmatrix}
1\\
0
\end{bmatrix}
-\frac{1}{2}
\begin{bmatrix}
1\\
1
\end{bmatrix}
=
\begin{bmatrix}
1/2\\
-1/2
\end{bmatrix}.

Then

r_{22}=\|u_2\|_2=\frac{1}{\sqrt{2}},

and

q_2=\frac{u_2}{r_{22}}=\frac{1}{\sqrt{2}}
\begin{bmatrix}
1\\
-1
\end{bmatrix}.

Therefore

Q=\frac{1}{\sqrt{2}}
\begin{bmatrix}
1 & 1\\
1 & -1
\end{bmatrix},

and

R=
\begin{bmatrix}
\sqrt{2} & 1/\sqrt{2}\\
0 & 1/\sqrt{2}
\end{bmatrix}.

Indeed,

QR=\frac{1}{\sqrt{2}}
\begin{bmatrix}
1 & 1\\
1 & -1
\end{bmatrix}
\begin{bmatrix}
\sqrt{2} & 1/\sqrt{2}\\
0 & 1/\sqrt{2}
\end{bmatrix}
=
\begin{bmatrix}
1 & 1\\
1 & 0
\end{bmatrix}.
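The worked example can be confirmed numerically. A small sketch (library routines may flip the signs of corresponding columns of Q and rows of R, so only absolute values are compared against the built-in factorization):

```python
import numpy as np

A = np.array([[1.0, 1.0],
              [1.0, 0.0]])

# The hand-computed factors from the example above.
Q_expected = np.array([[1.0,  1.0],
                       [1.0, -1.0]]) / np.sqrt(2)
R_expected = np.array([[np.sqrt(2), 1 / np.sqrt(2)],
                       [0.0,        1 / np.sqrt(2)]])

print(np.allclose(Q_expected @ R_expected, A))   # QR reconstructs A

# NumPy's factors agree up to per-column sign flips.
Q, R = np.linalg.qr(A)
print(np.allclose(np.abs(Q), np.abs(Q_expected)))
print(np.allclose(np.abs(R), np.abs(R_expected)))
```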

53.7 QR and Least Squares

QR factorization gives a stable way to solve least squares problems.

Consider

\min_x \|Ax-b\|_2.

Assume

A=QR,

where Q has orthonormal columns and R is invertible upper triangular.

Then

Ax=QRx.

The least squares solution satisfies

R\hat{x}=Q^Tb.

Thus

\hat{x}=R^{-1}Q^Tb.

In computation, one solves

R\hat{x}=Q^Tb

by back substitution. One does not form R^{-1} explicitly.

This method avoids solving the normal equations

A^TA\hat{x}=A^Tb.

That matters because forming A^TA squares the condition number in the Euclidean norm, while QR methods avoid this condition-number-squaring effect.
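The recipe above can be sketched in a few lines. The `back_substitute` helper is written here for illustration, and the data is synthetic:

```python
import numpy as np

def back_substitute(R, y):
    """Solve R x = y for an invertible upper triangular R."""
    n = R.shape[0]
    x = np.zeros(n)
    for i in range(n - 1, -1, -1):
        x[i] = (y[i] - R[i, i + 1:] @ x[i + 1:]) / R[i, i]
    return x

rng = np.random.default_rng(2)
A = rng.standard_normal((6, 3))
b = rng.standard_normal(6)

# Least squares via reduced QR: solve R x = Q^T b, never forming R^{-1}.
Q, R = np.linalg.qr(A)
x_hat = back_substitute(R, Q.T @ b)

# Agrees with the reference least squares solver.
x_ref, *_ = np.linalg.lstsq(A, b, rcond=None)
print(np.allclose(x_hat, x_ref))
```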

53.8 Projection Using QR

If A=QR and A has full column rank, then

\operatorname{Col}(A)=\operatorname{Col}(Q).

The orthogonal projection of b onto \operatorname{Col}(A) is therefore

p=QQ^Tb.

The residual is

r=b-p=b-QQ^Tb.

The least squares fitted vector is

A\hat{x}=p.

Because Q has orthonormal columns, this projection formula is simpler than

p=A(A^TA)^{-1}A^Tb.

QR factorization replaces the original column basis by an orthonormal basis, making projection direct.
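A short sketch of the projection formula, using synthetic data, checking both that the residual is orthogonal to the column space and that the result matches the normal-equations projection:

```python
import numpy as np

rng = np.random.default_rng(3)
A = rng.standard_normal((5, 2))
b = rng.standard_normal(5)

Q, _ = np.linalg.qr(A)
p = Q @ (Q.T @ b)   # projection of b onto Col(A)
r = b - p           # residual

print(np.allclose(A.T @ r, 0))   # residual is orthogonal to Col(A)

# Matches the projection A (A^T A)^{-1} A^T b.
p_ref = A @ np.linalg.solve(A.T @ A, A.T @ b)
print(np.allclose(p, p_ref))
```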

53.9 Solving Square Linear Systems

If ARn×nA\in\mathbb{R}^{n\times n} is nonsingular and

A=QR, A=QR,

then the system

Ax=b Ax=b

becomes

QRx=b. QRx=b.

Multiply by QTQ^T:

Rx=QTb. Rx=Q^Tb.

Since RR is upper triangular, xx is found by back substitution.

This gives an alternative to Gaussian elimination. QR costs roughly twice as much as LU factorization for square systems, but it uses orthogonal transformations and is attractive when numerical stability is a priority.
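The same two-step recipe works for a square system; a sketch with a random nonsingular matrix:

```python
import numpy as np

rng = np.random.default_rng(4)
A = rng.standard_normal((4, 4))
b = rng.standard_normal(4)

# Solve Ax = b via QR: multiply by Q^T, then solve the triangular system.
Q, R = np.linalg.qr(A)
x = np.linalg.solve(R, Q.T @ b)   # R is triangular, so a dedicated
                                  # back substitution would also work

print(np.allclose(A @ x, b))
```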

53.10 Classical Gram-Schmidt QR

Classical Gram-Schmidt computes Q and R by projecting each column of A onto all previously computed q_i.

For j=1,\ldots,n, set

u_j=a_j.

For i=1,\ldots,j-1, compute

r_{ij}=q_i^Ta_j.

Then subtract

u_j = a_j-\sum_{i=1}^{j-1}r_{ij}q_i.

Finally,

r_{jj}=\|u_j\|_2, \qquad q_j=\frac{u_j}{r_{jj}}.

This method is conceptually simple. In exact arithmetic it produces a valid QR factorization. In floating point arithmetic it can lose orthogonality when the columns of A are nearly linearly dependent.
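The steps above translate directly into code. A minimal sketch (the function name `cgs_qr` is chosen here for illustration; full column rank is assumed):

```python
import numpy as np

def cgs_qr(A):
    """Classical Gram-Schmidt QR; assumes A has full column rank."""
    m, n = A.shape
    Q = np.zeros((m, n))
    R = np.zeros((n, n))
    for j in range(n):
        # All coefficients come from the ORIGINAL column a_j.
        R[:j, j] = Q[:, :j].T @ A[:, j]
        u = A[:, j] - Q[:, :j] @ R[:j, j]   # subtract all projections at once
        R[j, j] = np.linalg.norm(u)
        Q[:, j] = u / R[j, j]
    return Q, R

A = np.random.default_rng(5).standard_normal((5, 3))
Q, R = cgs_qr(A)
print(np.allclose(Q @ R, A))
print(np.allclose(Q.T @ Q, np.eye(3)))
```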

53.11 Modified Gram-Schmidt QR

Modified Gram-Schmidt improves numerical behavior by subtracting projections from the current residual rather than from the original column.

For each column a_j, begin with

u=a_j.

For i=1,\ldots,j-1, compute

r_{ij}=q_i^Tu,

then update

u \leftarrow u-r_{ij}q_i.

After all previous directions have been removed,

r_{jj}=\|u\|_2, \qquad q_j=\frac{u}{r_{jj}}.

Classical and modified Gram-Schmidt are equivalent in exact arithmetic. In finite precision arithmetic, modified Gram-Schmidt usually preserves orthogonality better.
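A sketch of the modified variant (the function name `mgs_qr` is illustrative; full column rank is assumed). The only change from classical Gram-Schmidt is that each coefficient is computed from the current residual u, not from the original column:

```python
import numpy as np

def mgs_qr(A):
    """Modified Gram-Schmidt QR; assumes A has full column rank."""
    m, n = A.shape
    Q = np.zeros((m, n))
    R = np.zeros((n, n))
    for j in range(n):
        u = A[:, j].copy()
        for i in range(j):
            R[i, j] = Q[:, i] @ u     # coefficient from the CURRENT residual
            u -= R[i, j] * Q[:, i]    # remove that direction immediately
        R[j, j] = np.linalg.norm(u)
        Q[:, j] = u / R[j, j]
    return Q, R

A = np.random.default_rng(6).standard_normal((5, 3))
Q, R = mgs_qr(A)
print(np.allclose(Q @ R, A))
print(np.allclose(Q.T @ Q, np.eye(3)))
```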

53.12 Householder QR

Householder QR uses orthogonal reflections to zero out subdiagonal entries of a matrix.

A Householder reflector has the form

H=I-2vv^T,

where

\|v\|_2=1.

It is orthogonal:

H^TH=I.

It is also symmetric:

H^T=H.

By choosing v appropriately, a Householder reflector transforms a vector into a multiple of a coordinate vector. Applied successively to columns of A, these reflectors produce an upper triangular matrix R. The product of the reflectors gives the orthogonal factor Q. Householder transformations are a standard stable method for computing QR factorization.
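A compact sketch of Householder QR under the assumptions above (full column rank; the explicit accumulation of Q is for clarity, while production codes store the reflectors implicitly). Each step reflects the tail of one column onto a coordinate axis:

```python
import numpy as np

def householder_qr(A):
    """Householder QR; returns full Q (m x m) and R (m x n)."""
    m, n = A.shape
    R = A.astype(float).copy()
    Q = np.eye(m)
    for k in range(n):
        x = R[k:, k]
        v = x.copy()
        # Add sign(x[0]) * ||x|| to the first entry to avoid cancellation.
        v[0] += np.copysign(np.linalg.norm(x), x[0])
        v /= np.linalg.norm(v)
        H = np.eye(m - k) - 2.0 * np.outer(v, v)   # reflector on rows k..m-1
        R[k:, :] = H @ R[k:, :]                    # zero out below R[k, k]
        Q[:, k:] = Q[:, k:] @ H                    # accumulate Q = H_1 ... H_n
    return Q, R

A = np.random.default_rng(7).standard_normal((4, 3))
Q, R = householder_qr(A)
print(np.allclose(Q @ R, A))
print(np.allclose(Q.T @ Q, np.eye(4)))
print(np.allclose(np.tril(R, -1), 0))
```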

53.13 Givens QR

Givens rotations compute QR factorization by zeroing individual subdiagonal entries.

A Givens rotation acts on two coordinates at a time. In the plane of those two coordinates, it has the form

\begin{bmatrix}
c & s\\
-s & c
\end{bmatrix},

where

c^2+s^2=1.

By choosing c and s correctly, one can eliminate a chosen entry while preserving the Euclidean norm.

Givens rotations are especially useful for sparse matrices and for problems where entries are introduced or removed incrementally. They can zero selected entries without applying a dense transformation to the whole matrix. QR factorization can be computed by a sequence of Givens rotations.
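A sketch of Givens QR (dense and unoptimized, purely to show the mechanics): each rotation mixes two rows to zero one subdiagonal entry, working up each column:

```python
import numpy as np

def givens_qr(A):
    """QR via Givens rotations; returns full Q (m x m) and R (m x n)."""
    m, n = A.shape
    R = A.astype(float).copy()
    Q = np.eye(m)
    for j in range(n):
        for i in range(m - 1, j, -1):       # eliminate R[i, j] using row i-1
            a, b = R[i - 1, j], R[i, j]
            r = np.hypot(a, b)
            if r == 0.0:
                continue                    # entry already zero
            c, s = a / r, b / r
            G = np.array([[c, s],
                          [-s, c]])         # rotation in the (i-1, i) plane
            R[[i - 1, i], :] = G @ R[[i - 1, i], :]
            Q[:, [i - 1, i]] = Q[:, [i - 1, i]] @ G.T
    return Q, R

A = np.random.default_rng(8).standard_normal((4, 3))
Q, R = givens_qr(A)
print(np.allclose(Q @ R, A))
print(np.allclose(Q.T @ Q, np.eye(4)))
print(np.allclose(np.tril(R, -1), 0))
```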

53.14 Comparison of QR Methods

| Method | Main idea | Strength | Weakness |
|---|---|---|---|
| Classical Gram-Schmidt | Subtract projections from original columns | Simple theory | Can lose orthogonality |
| Modified Gram-Schmidt | Subtract projections from current residuals | Better numerical behavior | Still less stable than Householder |
| Householder reflections | Zero whole column tails by reflections | Stable dense QR | Less convenient for selective updates |
| Givens rotations | Zero one entry at a time | Good for sparse and incremental problems | More rotations for dense matrices |

All methods produce the same mathematical kind of factorization. Their differences matter in floating point arithmetic and large-scale computation.

53.15 Rank-Deficient Matrices

If the columns of A are linearly dependent, then the reduced QR factorization with invertible R does not exist.

Gram-Schmidt will produce a zero residual at some step:

r_{jj}=0.

This means the current column adds no new direction beyond the previous columns.

For rank-deficient matrices, one may use QR factorization with column pivoting:

AP=QR,

where P is a permutation matrix. Pivoting reorders the columns so that the more independent columns appear earlier.

This form is useful for detecting numerical rank and constructing rank-revealing factorizations. QR methods can be used for numerical rank estimation at lower cost than a full singular value decomposition in some settings.
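A small sketch of QR-based rank detection for a matrix with a dependent column, inspecting the diagonal of R. (SciPy's `scipy.linalg.qr(A, pivoting=True)` computes the pivoted form AP = QR directly; plain NumPy is used here, which works for this example but, without pivoting, is not reliable in general.)

```python
import numpy as np

A = np.array([[1.0, 2.0, 3.0],
              [4.0, 5.0, 9.0],
              [7.0, 8.0, 15.0]])   # third column = first + second

Q, R = np.linalg.qr(A)
diag = np.abs(np.diag(R))
# Count diagonal entries that are not negligibly small.
rank = int(np.sum(diag > 1e-10 * diag.max()))
print(rank)   # 2
```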

53.16 Positive Diagonal Convention

The signs of the columns of Q are not fixed by orthonormality alone.

If

A=QR,

then changing the sign of a column q_j and the sign of the corresponding row of R gives another valid factorization.

To remove this ambiguity, one often requires

r_{jj}>0

for all j. Under full column rank, this convention gives uniqueness for the reduced QR factorization.

53.17 Complex QR Factorization

For a complex matrix

A\in\mathbb{C}^{m\times n},

QR factorization has the form

A=QR,

where Q has orthonormal columns in the complex inner product and R is upper triangular.

The orthonormality condition becomes

Q^*Q=I,

where Q^* is the conjugate transpose.

The coefficients are

r_{ij}=q_i^*a_j.

If Q is square, it is unitary:

Q^*Q=QQ^*=I.

Complex QR appears in spectral theory, signal processing, quantum mechanics, and numerical algorithms for complex matrices.
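NumPy's factorization handles complex matrices with the conjugate-transpose convention automatically; a sketch with random complex data:

```python
import numpy as np

rng = np.random.default_rng(9)
A = rng.standard_normal((4, 3)) + 1j * rng.standard_normal((4, 3))

Q, R = np.linalg.qr(A)
print(np.allclose(Q.conj().T @ Q, np.eye(3)))   # Q* Q = I
print(np.allclose(Q @ R, A))                    # A = QR
print(np.allclose(np.tril(R, -1), 0))           # R upper triangular
```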

53.18 QR and Orthogonal Changes of Coordinates

The equation

A=QR

can be rewritten as

Q^TA=R.

Thus Q^T applies an orthogonal change of coordinates to the columns of A, turning the matrix into upper triangular form.

Because Q is orthogonal, this change of coordinates preserves lengths and angles:

\|Q^Tx\|_2=\|x\|_2.

This explains the numerical appeal of QR. It transforms the problem while preserving Euclidean geometry.

53.19 QR Algorithm Preview

QR factorization is also the basis of the QR algorithm for eigenvalues.

Given a square matrix A_0, one forms

A_k=Q_kR_k,

then reverses the factors:

A_{k+1}=R_kQ_k.

Under suitable hypotheses and with shifts, this iteration reveals eigenvalue information. The QR algorithm is one of the central methods for computing eigenvalues of dense matrices. QR decomposition is widely used both for least squares and as the basis of the QR eigenvalue algorithm.
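A sketch of the unshifted iteration on a symmetric matrix constructed to have known, well-separated eigenvalues (random symmetric matrices with close eigenvalue magnitudes may converge slowly or not at all without shifts):

```python
import numpy as np

rng = np.random.default_rng(10)
Qr, _ = np.linalg.qr(rng.standard_normal((4, 4)))
A = Qr @ np.diag([4.0, 2.0, 1.0, 0.5]) @ Qr.T   # eigenvalues 4, 2, 1, 0.5

Ak = A.copy()
for _ in range(100):
    Qk, Rk = np.linalg.qr(Ak)
    Ak = Rk @ Qk                 # factor, then reverse the factors

# The iterates approach a diagonal matrix of eigenvalues.
print(np.sort(np.diag(Ak)))      # approaches [0.5, 1.0, 2.0, 4.0]
```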

53.20 Summary

QR factorization writes a matrix as

A=QR.

The matrix Q has orthonormal columns:

Q^TQ=I.

The matrix R is upper triangular.

The columns of Q form an orthonormal basis for the column space of A. The matrix R records the coordinates of the original columns of A in that basis.

QR factorization is the matrix form of Gram-Schmidt. It gives stable methods for least squares, projections, and linear systems. It can be computed by Gram-Schmidt, Householder reflections, or Givens rotations. Its central advantage is that it uses orthogonal transformations, which preserve Euclidean geometry and avoid the conditioning problems caused by forming A^TA.