
Chapter 38. Projection Operators

A projection operator is a linear operator that leaves its own output unchanged. If $V$ is a vector space, a projection on $V$ is a linear map

$$P : V \to V$$

such that

$$P^2 = P.$$

Equivalently,

$$P(P(v)) = P(v)$$

for every $v \in V$. Applying the projection once moves a vector into the target subspace. Applying it again has no further effect. This property is called idempotence. Projection operators are therefore exactly the idempotent linear operators.

38.1 First Example

Define

$$P : \mathbb{R}^3 \to \mathbb{R}^3$$

by

$$P \begin{bmatrix} x\\ y\\ z \end{bmatrix} = \begin{bmatrix} x\\ y\\ 0 \end{bmatrix}.$$

This map sends every vector to its shadow on the $xy$-plane.

Apply $P$ twice:

$$P^2 \begin{bmatrix} x\\ y\\ z \end{bmatrix} = P \begin{bmatrix} x\\ y\\ 0 \end{bmatrix} = \begin{bmatrix} x\\ y\\ 0 \end{bmatrix}.$$

Thus

$$P^2 = P.$$

So $P$ is a projection operator.

The image is the $xy$-plane:

$$\operatorname{im}(P) = \left\{ \begin{bmatrix} x\\ y\\ 0 \end{bmatrix} : x, y \in \mathbb{R} \right\}.$$

The kernel is the $z$-axis:

$$\ker(P) = \left\{ \begin{bmatrix} 0\\ 0\\ z \end{bmatrix} : z \in \mathbb{R} \right\}.$$

The projection keeps the image and removes the kernel.
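
As a quick numerical check, here is a minimal NumPy sketch of this example (the array `P` below is the standard matrix of the map):

```python
import numpy as np

# Standard matrix of the projection onto the xy-plane.
P = np.array([[1.0, 0.0, 0.0],
              [0.0, 1.0, 0.0],
              [0.0, 0.0, 0.0]])

v = np.array([3.0, -2.0, 7.0])

print(P @ v)                   # [ 3. -2.  0.]: the shadow on the xy-plane
print(np.allclose(P @ P, P))   # True: applying P twice changes nothing
```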

38.2 Idempotence

The defining equation

$$P^2 = P$$

means

$$P(P(v)) = P(v)$$

for every vector $v \in V$.

This has a direct interpretation. The first application of $P$ sends $v$ into $\operatorname{im}(P)$. Once a vector is in $\operatorname{im}(P)$, the projection leaves it fixed.

Indeed, if $w \in \operatorname{im}(P)$, then there is some $v \in V$ such that

$$w = P(v).$$

Then

$$P(w) = P(P(v)) = P^2(v) = P(v) = w.$$

Thus every vector in the image is fixed by $P$.

Conversely, if

$$P(w) = w,$$

then $w \in \operatorname{im}(P)$, since $w$ is the image of itself under $P$. Hence

$$\operatorname{im}(P) = \{ w \in V : P(w) = w \}.$$

The image of a projection is exactly its fixed subspace.

38.3 Kernel and Image

For every projection $P : V \to V$, the kernel and image describe the whole operator.

The kernel is

$$\ker(P) = \{ v \in V : P(v) = 0 \}.$$

The image is

$$\operatorname{im}(P) = \{ P(v) : v \in V \}.$$

Every vector $v \in V$ can be decomposed as

$$v = P(v) + (v - P(v)).$$

The first term satisfies

$$P(v) \in \operatorname{im}(P).$$

The second term lies in the kernel, because

$$P(v - P(v)) = P(v) - P^2(v) = P(v) - P(v) = 0.$$

Therefore

$$v = P(v) + (v - P(v))$$

is a decomposition of $v$ into a part in the image and a part in the kernel.

Moreover, this decomposition is unique. If

$$u \in \operatorname{im}(P) \cap \ker(P),$$

then $u \in \operatorname{im}(P)$ implies

$$P(u) = u.$$

But $u \in \ker(P)$ implies

$$P(u) = 0.$$

Thus

$$u = 0.$$

So

$$\operatorname{im}(P) \cap \ker(P) = \{0\}.$$

Hence

$$V = \operatorname{im}(P) \oplus \ker(P).$$

Every projection gives a direct sum decomposition of the vector space.
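
The decomposition is easy to verify numerically. A small sketch, reusing the projection from Section 38.1:

```python
import numpy as np

P = np.diag([1.0, 1.0, 0.0])    # projection onto the xy-plane
v = np.array([3.0, -2.0, 7.0])

image_part = P @ v              # lies in im(P)
kernel_part = v - P @ v         # lies in ker(P)

print(np.allclose(image_part + kernel_part, v))   # True: v = P(v) + (v - P(v))
print(np.allclose(P @ kernel_part, 0))            # True: P kills the second term
```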

38.4 Projection Onto a Subspace Along a Complement

Let $V$ be a vector space, and suppose

$$V = U \oplus W.$$

This means every vector $v \in V$ has a unique decomposition

$$v = u + w,$$

where

$$u \in U, \qquad w \in W.$$

Define

$$P : V \to V$$

by

$$P(v) = u.$$

That is, $P$ keeps the $U$-component and discards the $W$-component.

Then $P$ is linear. If

$$v_1 = u_1 + w_1, \qquad v_2 = u_2 + w_2,$$

then

$$v_1 + v_2 = (u_1 + u_2) + (w_1 + w_2),$$

so

$$P(v_1 + v_2) = u_1 + u_2 = P(v_1) + P(v_2).$$

For a scalar $c$,

$$cv = cu + cw,$$

so

$$P(cv) = cu = cP(v).$$

Also,

$$P^2(v) = P(u) = u = P(v).$$

Thus $P$ is a projection.

Its image is $U$, and its kernel is $W$:

$$\operatorname{im}(P) = U, \qquad \ker(P) = W.$$

So a projection is the same as a choice of direct sum decomposition.

38.5 The Complementary Projection

If $P$ is a projection on $V$, then

$$I - P$$

is also a projection.

Compute:

$$(I - P)^2 = I - 2P + P^2.$$

Since

$$P^2 = P,$$

we get

$$(I - P)^2 = I - 2P + P = I - P.$$

Thus $I - P$ is idempotent.

The projection $I - P$ keeps the part that $P$ removes. For any vector $v$,

$$v = P(v) + (I - P)(v).$$

The image of $I - P$ is the kernel of $P$:

$$\operatorname{im}(I - P) = \ker(P).$$

The kernel of $I - P$ is the image of $P$:

$$\ker(I - P) = \operatorname{im}(P).$$

Thus $P$ and $I - P$ are complementary projections.
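
A minimal sketch of the complementary pair, again with the projection onto the $xy$-plane:

```python
import numpy as np

P = np.diag([1.0, 1.0, 0.0])    # projection onto the xy-plane
Q = np.eye(3) - P               # the complementary projection

print(np.allclose(Q @ Q, Q))    # True: I - P is idempotent

v = np.array([3.0, -2.0, 7.0])
print(Q @ v)                    # [0. 0. 7.]: exactly the part P removes
```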

38.6 Matrix Projections

A square matrix $P$ is called a projection matrix if

$$P^2 = P.$$

Such a matrix defines a projection operator

$$x \mapsto Px.$$

For example,

$$P = \begin{bmatrix} 1 & 0\\ 0 & 0 \end{bmatrix}$$

satisfies

$$P^2 = \begin{bmatrix} 1 & 0\\ 0 & 0 \end{bmatrix} \begin{bmatrix} 1 & 0\\ 0 & 0 \end{bmatrix} = \begin{bmatrix} 1 & 0\\ 0 & 0 \end{bmatrix} = P.$$

It projects $\mathbb{R}^2$ onto the $x$-axis:

$$P \begin{bmatrix} x\\ y \end{bmatrix} = \begin{bmatrix} x\\ 0 \end{bmatrix}.$$

A projection matrix must be square because it represents a linear operator from a vector space to itself.

38.7 Orthogonal Projections

In an inner product space, an important special case is the orthogonal projection.

An orthogonal projection onto a subspace $U$ writes each vector $v$ as

$$v = u + w,$$

where

$$u \in U$$

and

$$w \in U^\perp.$$

Then

$$P(v) = u.$$

The vector $u$ is the closest vector in $U$ to $v$. The difference

$$v - P(v)$$

is perpendicular to $U$.

For example, the projection from $\mathbb{R}^3$ onto the $xy$-plane is orthogonal because the removed part lies on the $z$-axis, which is perpendicular to the plane.

In real coordinate spaces, an orthogonal projection matrix satisfies

$$P^2 = P$$

and

$$P^T = P.$$

Thus it is both idempotent and symmetric. For complex spaces, symmetry is replaced by self-adjointness:

$$P^* = P.$$

Orthogonal projection matrices are therefore characterized by

$$P^2 = P = P^T$$

in the real case, and

$$P^2 = P = P^*$$

in the complex case.

38.8 Projection Onto a Line

Let $u \in \mathbb{R}^n$ be a nonzero vector. The orthogonal projection of $x \in \mathbb{R}^n$ onto the line spanned by $u$ is

$$\operatorname{proj}_u(x) = \frac{x \cdot u}{u \cdot u}\, u.$$

The scalar

$$\frac{x \cdot u}{u \cdot u}$$

is the coordinate of the projection along $u$.

The corresponding matrix is

$$P = \frac{u u^T}{u^T u}.$$

Indeed,

$$Px = \frac{u u^T}{u^T u}\, x = u\, \frac{u^T x}{u^T u} = \frac{x \cdot u}{u \cdot u}\, u.$$

This matrix is symmetric and idempotent, so it is an orthogonal projection matrix.
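
In code, the formula is one line each way. A sketch with two hypothetical helper names, `proj_line` and `proj_matrix`:

```python
import numpy as np

def proj_line(x, u):
    """Orthogonal projection of x onto the line spanned by a nonzero vector u."""
    return (x @ u) / (u @ u) * u

def proj_matrix(u):
    """The rank-one projection matrix u u^T / (u^T u)."""
    return np.outer(u, u) / (u @ u)
```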

38.9 Example: Projection Onto a Line in $\mathbb{R}^2$

Let

$$u = \begin{bmatrix} 1\\ 2 \end{bmatrix}.$$

Then

$$u^T u = 1^2 + 2^2 = 5.$$

Also,

$$u u^T = \begin{bmatrix} 1\\ 2 \end{bmatrix} \begin{bmatrix} 1 & 2 \end{bmatrix} = \begin{bmatrix} 1 & 2\\ 2 & 4 \end{bmatrix}.$$

Therefore the projection matrix onto the line spanned by $u$ is

$$P = \frac{1}{5} \begin{bmatrix} 1 & 2\\ 2 & 4 \end{bmatrix}.$$

For

$$x = \begin{bmatrix} 3\\ 1 \end{bmatrix},$$

we get

$$Px = \frac{1}{5} \begin{bmatrix} 1 & 2\\ 2 & 4 \end{bmatrix} \begin{bmatrix} 3\\ 1 \end{bmatrix} = \frac{1}{5} \begin{bmatrix} 5\\ 10 \end{bmatrix} = \begin{bmatrix} 1\\ 2 \end{bmatrix}.$$

The vector $x$ projects exactly onto $u$. The error vector is

$$x - Px = \begin{bmatrix} 3\\ 1 \end{bmatrix} - \begin{bmatrix} 1\\ 2 \end{bmatrix} = \begin{bmatrix} 2\\ -1 \end{bmatrix}.$$

Check orthogonality:

$$\begin{bmatrix} 2\\ -1 \end{bmatrix} \cdot \begin{bmatrix} 1\\ 2 \end{bmatrix} = 2 - 2 = 0.$$

The error is perpendicular to the line.
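
The whole computation replays in NumPy; a quick check of the numbers above:

```python
import numpy as np

u = np.array([1.0, 2.0])
x = np.array([3.0, 1.0])

P = np.outer(u, u) / (u @ u)    # (1/5) * [[1, 2], [2, 4]]
print(P @ x)                    # [1. 2.]: x projects exactly onto u
print((x - P @ x) @ u)          # 0.0: the error is perpendicular to the line
```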

38.10 Projection Onto a Subspace with an Orthonormal Basis

Let $U$ be a subspace of $\mathbb{R}^n$, and let

$$q_1, \ldots, q_k$$

be an orthonormal basis of $U$. Then the orthogonal projection of $x$ onto $U$ is

$$P_U x = (x \cdot q_1) q_1 + \cdots + (x \cdot q_k) q_k.$$

If $Q$ is the matrix with columns

$$q_1, \ldots, q_k,$$

then

$$Q^T Q = I_k.$$

The projection matrix is

$$P = Q Q^T.$$

Indeed,

$$Q Q^T x = Q \begin{bmatrix} q_1^T x\\ \vdots\\ q_k^T x \end{bmatrix} = (q_1^T x) q_1 + \cdots + (q_k^T x) q_k.$$

The matrix $Q Q^T$ is symmetric and idempotent:

$$(Q Q^T)^T = Q Q^T,$$

and

$$(Q Q^T)^2 = Q (Q^T Q) Q^T = Q I Q^T = Q Q^T.$$
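
In practice an orthonormal basis can be obtained from a QR factorization. A sketch, assuming the columns of `A` span the subspace $U$:

```python
import numpy as np

# Two spanning vectors for a plane U in R^3.
A = np.array([[1.0, 1.0],
              [0.0, 1.0],
              [1.0, 0.0]])
Q, _ = np.linalg.qr(A)          # columns of Q: an orthonormal basis of U

P = Q @ Q.T                     # projection matrix onto U
print(np.allclose(P, P.T))      # True: symmetric
print(np.allclose(P @ P, P))    # True: idempotent
```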

38.11 Projection Onto a Column Space

Let $A$ be an $m \times k$ matrix with linearly independent columns. The column space of $A$ is a $k$-dimensional subspace of $\mathbb{R}^m$.

The orthogonal projection onto $\operatorname{col}(A)$ is

$$P = A (A^T A)^{-1} A^T.$$

This formula generalizes the line projection formula. When $A$ has one column $u$, it becomes

$$P = u (u^T u)^{-1} u^T = \frac{u u^T}{u^T u}.$$

The matrix $A^T A$ is invertible because the columns of $A$ are linearly independent.

The projection $Px$ is the vector in $\operatorname{col}(A)$ closest to $x$, and the residual

$$x - Px$$

is orthogonal to every column of $A$. Projection formulas of this kind are central in least squares and regression.
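
A direct translation of the formula into NumPy (for illustration only; forming the explicit inverse is fine at this scale, though larger problems would use a QR or least-squares routine):

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((5, 2))            # independent columns (almost surely)
x = rng.standard_normal(5)

P = A @ np.linalg.inv(A.T @ A) @ A.T       # projection onto col(A)
r = x - P @ x                              # residual

print(np.allclose(P @ P, P))               # True: idempotent
print(np.allclose(A.T @ r, 0))             # True: residual orthogonal to the columns
```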

38.12 Derivation of the Column Space Formula

We seek a vector in $\operatorname{col}(A)$ closest to $x$. Such a vector has the form

$$A\hat{c}$$

for some coefficient vector $\hat{c}$.

The residual is

$$r = x - A\hat{c}.$$

For $A\hat{c}$ to be the orthogonal projection, $r$ must be orthogonal to every column of $A$. This condition is

$$A^T (x - A\hat{c}) = 0.$$

Expanding gives

$$A^T x - A^T A \hat{c} = 0.$$

So

$$A^T A \hat{c} = A^T x.$$

Since $A^T A$ is invertible,

$$\hat{c} = (A^T A)^{-1} A^T x.$$

Therefore

$$Px = A\hat{c} = A (A^T A)^{-1} A^T x.$$

Thus

$$P = A (A^T A)^{-1} A^T.$$
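
The derivation can be replayed numerically: solving the normal equations reproduces what a library least-squares routine computes. A minimal sketch:

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.standard_normal((6, 3))
x = rng.standard_normal(6)

c_hat = np.linalg.solve(A.T @ A, A.T @ x)         # normal equations
c_lstsq, *_ = np.linalg.lstsq(A, x, rcond=None)   # library least squares

print(np.allclose(c_hat, c_lstsq))                # True
print(np.allclose(A.T @ (x - A @ c_hat), 0))      # True: orthogonality condition
```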

38.13 Oblique Projections

A projection need not be orthogonal.

Suppose

$$V = U \oplus W.$$

The projection onto $U$ along $W$ sends

$$u + w$$

to

$$u.$$

If $W = U^\perp$, the projection is orthogonal. If $W$ is another complement, the projection is oblique.

An oblique projection still satisfies

$$P^2 = P.$$

It still has

$$\operatorname{im}(P) = U, \qquad \ker(P) = W.$$

But the removed part $w$ may not be perpendicular to $U$. Oblique projections are therefore algebraically valid projections, but they do not describe nearest-point projection in the usual Euclidean metric.

38.14 Example of an Oblique Projection

In $\mathbb{R}^2$, let

$$U = \operatorname{span}\left\{ \begin{bmatrix} 1\\ 0 \end{bmatrix} \right\}$$

be the $x$-axis, and let

$$W = \operatorname{span}\left\{ \begin{bmatrix} 1\\ 1 \end{bmatrix} \right\}.$$

Every vector

$$\begin{bmatrix} x\\ y \end{bmatrix}$$

can be written uniquely as

$$\begin{bmatrix} x\\ y \end{bmatrix} = \begin{bmatrix} a\\ 0 \end{bmatrix} + t \begin{bmatrix} 1\\ 1 \end{bmatrix}.$$

From the second coordinate,

$$t = y.$$

From the first coordinate,

$$a + t = x,$$

so

$$a = x - y.$$

Thus the projection onto $U$ along $W$ is

$$P \begin{bmatrix} x\\ y \end{bmatrix} = \begin{bmatrix} x - y\\ 0 \end{bmatrix}.$$

Its matrix is

$$P = \begin{bmatrix} 1 & -1\\ 0 & 0 \end{bmatrix}.$$

Check idempotence:

$$P^2 = \begin{bmatrix} 1 & -1\\ 0 & 0 \end{bmatrix} \begin{bmatrix} 1 & -1\\ 0 & 0 \end{bmatrix} = \begin{bmatrix} 1 & -1\\ 0 & 0 \end{bmatrix} = P.$$

This projection is not orthogonal because

$$P^T \neq P.$$

It projects onto the $x$-axis along diagonal lines parallel to $(1, 1)$, not along vertical lines.
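
Numerically, the matrix passes the idempotence test but fails the symmetry test, which is exactly the oblique signature:

```python
import numpy as np

P = np.array([[1.0, -1.0],
              [0.0,  0.0]])     # projection onto the x-axis along (1, 1)

print(np.allclose(P @ P, P))    # True: it is a projection
print(np.allclose(P, P.T))      # False: it is not an orthogonal projection

x = np.array([3.0, 1.0])
print(P @ x)                    # [2. 0.]
print(x - P @ x)                # [1. 1.]: the removed part points along (1, 1)
```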

38.15 Eigenvalues of a Projection

Let $P$ be a projection. If $v$ is an eigenvector with eigenvalue $\lambda$, then

$$P(v) = \lambda v.$$

Apply $P$ again:

$$P^2(v) = P(\lambda v) = \lambda P(v) = \lambda^2 v.$$

But $P^2 = P$, so

$$P^2(v) = P(v) = \lambda v.$$

Hence

$$\lambda^2 v = \lambda v.$$

Since $v \neq 0$,

$$\lambda^2 = \lambda.$$

Thus

$$\lambda(\lambda - 1) = 0.$$

Therefore every eigenvalue of a projection is either $0$ or $1$.

Vectors in the kernel have eigenvalue $0$. Vectors in the image have eigenvalue $1$. Projection matrices therefore have only the eigenvalues $0$ and $1$.
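
A quick numerical confirmation with the two projection matrices seen so far:

```python
import numpy as np

orthogonal = np.diag([1.0, 1.0, 0.0])      # projection onto the xy-plane
oblique = np.array([[1.0, -1.0],
                    [0.0,  0.0]])          # oblique example from Section 38.14

print(np.linalg.eigvals(orthogonal))   # eigenvalues 1, 1, 0 (order may vary)
print(np.linalg.eigvals(oblique))      # eigenvalues 1 and 0
```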

38.16 Diagonal Form

Because the minimal polynomial of a projection divides

$$x^2 - x = x(x - 1),$$

and this polynomial has distinct roots, every projection on a finite-dimensional vector space is diagonalizable.

More concretely, choose a basis

$$(u_1, \ldots, u_r)$$

for $\operatorname{im}(P)$, and choose a basis

$$(w_1, \ldots, w_s)$$

for $\ker(P)$.

Since

$$V = \operatorname{im}(P) \oplus \ker(P),$$

the combined list

$$(u_1, \ldots, u_r, w_1, \ldots, w_s)$$

is a basis of $V$.

In this basis, $P$ has matrix

$$\begin{bmatrix} I_r & 0\\ 0 & 0 \end{bmatrix}.$$

Thus a projection is structurally simple: it is the identity on one subspace and zero on a complementary subspace.

38.17 Trace and Rank

For a projection $P$ on a finite-dimensional space, the trace equals the rank.

In the basis adapted to

$$V = \operatorname{im}(P) \oplus \ker(P),$$

the matrix of $P$ is

$$\begin{bmatrix} I_r & 0\\ 0 & 0 \end{bmatrix},$$

where

$$r = \dim(\operatorname{im}(P)).$$

The trace is the sum of diagonal entries:

$$\operatorname{tr}(P) = r.$$

The rank is also

$$\operatorname{rank}(P) = r.$$

Therefore

$$\operatorname{tr}(P) = \operatorname{rank}(P).$$

Both the trace and the rank are invariant under change of basis, so this computation in the adapted basis establishes the identity in any basis.

This fact is often useful in matrix analysis, statistics, and numerical linear algebra.
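
A one-line check on the rank-one projection from Section 38.9:

```python
import numpy as np

u = np.array([1.0, 2.0])
P = np.outer(u, u) / (u @ u)       # projection onto the line spanned by u

print(np.trace(P))                 # 1.0
print(np.linalg.matrix_rank(P))    # 1
```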

38.18 Products of Projections

The product of two projections is not necessarily a projection.

Let $P$ and $Q$ be projections. The product $PQ$ is a projection precisely when

$$(PQ)^2 = PQ.$$

Compute:

$$(PQ)^2 = PQPQ.$$

If $P$ and $Q$ commute, meaning

$$PQ = QP,$$

then

$$(PQ)^2 = PQPQ = PPQQ = P^2 Q^2 = PQ.$$

Thus, if two projections commute, their product is also a projection.

Without commutativity, the product may fail to be idempotent. Products of projections therefore require care.
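
Both cases are easy to exhibit numerically; a sketch with one commuting pair and one non-commuting pair:

```python
import numpy as np

# Commuting pair: coordinate projections in R^3.
P = np.diag([1.0, 1.0, 0.0])
Q = np.diag([0.0, 1.0, 1.0])
PQ = P @ Q
print(np.allclose(PQ @ PQ, PQ))     # True: the product is again a projection

# Non-commuting pair in R^2: x-axis versus the line spanned by (1, 1).
P2 = np.array([[1.0, 0.0], [0.0, 0.0]])
Q2 = 0.5 * np.array([[1.0, 1.0], [1.0, 1.0]])
PQ2 = P2 @ Q2
print(np.allclose(PQ2 @ PQ2, PQ2))  # False: here (PQ)^2 = PQ/2, not PQ
```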

38.19 Projections and Least Squares

Projection operators are the algebraic core of least squares.

Given an inconsistent system

$$Ax = b,$$

there may be no exact solution. Instead, least squares seeks a vector $\hat{x}$ such that

$$A\hat{x}$$

is as close as possible to $b$ inside the column space of $A$.

This means

$$A\hat{x} = P_{\operatorname{col}(A)}\, b,$$

where

$$P_{\operatorname{col}(A)}$$

is the orthogonal projection onto the column space of $A$.

The normal equations

$$A^T A \hat{x} = A^T b$$

come from the orthogonality condition

$$b - A\hat{x} \perp \operatorname{col}(A).$$

Thus least squares is not merely an approximation trick. It is an orthogonal projection problem.
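
As an illustration, a small line-fitting problem (the data points here are made up for the example):

```python
import numpy as np

# Fit y ≈ c0 + c1 * t: the columns of A span the model subspace.
t = np.array([0.0, 1.0, 2.0, 3.0])
b = np.array([1.0, 2.1, 2.9, 4.2])
A = np.column_stack([np.ones_like(t), t])

x_hat, *_ = np.linalg.lstsq(A, b, rcond=None)
P = A @ np.linalg.inv(A.T @ A) @ A.T       # projection onto col(A)

print(np.allclose(A @ x_hat, P @ b))       # True: the fitted values are P b
print(x_hat)                               # intercept and slope of the fitted line
```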

38.20 Summary

A projection operator is a linear operator

$$P : V \to V$$

satisfying

$$P^2 = P.$$

This means applying $P$ twice is the same as applying it once.

Every projection decomposes the vector space as

$$V = \operatorname{im}(P) \oplus \ker(P).$$

It acts as the identity on its image and as zero on its kernel.

If $V = U \oplus W$, then the projection onto $U$ along $W$ is the map

$$u + w \mapsto u.$$

Orthogonal projections occur in inner product spaces when the complement is perpendicular to the target subspace. In real coordinates, an orthogonal projection matrix satisfies

$$P^2 = P = P^T.$$

Projection matrices have only the eigenvalues $0$ and $1$, are diagonalizable, and satisfy

$$\operatorname{tr}(P) = \operatorname{rank}(P).$$

Projection operators are central in geometry, least squares, regression, numerical linear algebra, and the study of decompositions of vector spaces.