An orthogonal projection is the operation of replacing a vector by its closest vector in a chosen subspace. It is the precise linear algebra version of dropping a perpendicular from a point to a line, plane, or higher-dimensional subspace.
If W is a subspace of an inner product space V, then every vector v can be decomposed into two parts:
v=w+r,
where
w∈W,r∈W⊥.
The vector w is the orthogonal projection of v onto W. The vector r is the residual. In finite-dimensional inner product spaces, this decomposition exists and is unique for every subspace W. The projected vector is the closest vector in W to the original vector.
51.1 Projection onto a Line
Let u be a nonzero vector in an inner product space V. The line generated by u is
L=span{u}.
The projection of v onto L is the vector in the direction of u closest to v. It has the form
p=cu
for some scalar c.
The residual is
r=v−cu.
For cu to be the orthogonal projection, the residual must be orthogonal to the line. Since the line is spanned by u, it is enough to require
⟨v−cu,u⟩=0.
Using linearity,
⟨v,u⟩−c⟨u,u⟩=0.
Therefore
c = ⟨v,u⟩/⟨u,u⟩.
So the projection is
proju(v) = (⟨v,u⟩/⟨u,u⟩) u.
This formula is valid whenever u ≠ 0.
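As a concrete sketch (using NumPy, with sample vectors chosen only for illustration), the line-projection formula translates directly into code:

```python
import numpy as np

def proj_onto_line(v, u):
    """Project v onto span{u} via c = <v,u>/<u,u>, for nonzero u."""
    u = np.asarray(u, dtype=float)
    v = np.asarray(v, dtype=float)
    c = np.dot(v, u) / np.dot(u, u)
    return c * u

v = np.array([3.0, 4.0])
u = np.array([2.0, 1.0])
p = proj_onto_line(v, u)   # c = 10/5 = 2, so p = (4, 2)
r = v - p                  # residual (-1, 2) is orthogonal to u
print(p, np.dot(r, u))
```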
51.2 Projection onto a Unit Vector
If q is a unit vector, then
⟨q,q⟩=1.
The projection formula simplifies to
projq(v)=⟨v,q⟩q.
This is the simplest projection formula. The scalar
⟨v,q⟩
is the coordinate of v in the direction q. The vector
⟨v,q⟩q
is the component of v along q.
For example, let
v = [3; 4], q = [1; 0].
Then
⟨v,q⟩=3,
so
projq(v) = 3[1; 0] = [3; 0].
The projection keeps the horizontal component and removes the vertical component.
51.3 The Residual
The residual of v after projection onto a subspace W is
r=v−projW(v).
The defining property of orthogonal projection is
r∈W⊥.
Equivalently,
⟨r,w⟩=0
for every w∈W.
Thus projection separates a vector into an explained part and an unexplained part:
v=projW(v)+r,
where
projW(v)∈W,r∈W⊥.
This is the orthogonal decomposition of v with respect to W.
51.4 Projection onto an Orthonormal Basis
Let W be a subspace with orthonormal basis
q1,q2,…,qk.
The projection of v onto W is
projW(v) = ∑_{j=1}^{k} ⟨v,qj⟩ qj.
This formula follows from the coordinate formula for orthonormal bases. The projected vector lies in W, and the residual is orthogonal to every qj.
Indeed, let
p = ∑_{j=1}^{k} ⟨v,qj⟩ qj.
Then for each i,
⟨v−p, qi⟩ = ⟨v,qi⟩ − ⟨∑_{j=1}^{k} ⟨v,qj⟩ qj, qi⟩.
Using orthonormality,
⟨∑_{j=1}^{k} ⟨v,qj⟩ qj, qi⟩ = ⟨v,qi⟩.
Hence
⟨v−p,qi⟩=0.
Since the qi span W, the residual is orthogonal to all of W.
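The sum formula can be checked numerically. In this sketch, q1, q2, and v are made-up sample vectors in R3, with the standard dot product as the inner product:

```python
import numpy as np

def proj_onto_onb(v, qs):
    """Sum of <v, qj> qj over an orthonormal list qs."""
    p = np.zeros_like(np.asarray(v, dtype=float))
    for q in qs:
        p += np.dot(v, q) * q
    return p

# orthonormal basis of a plane in R^3
q1 = np.array([1.0, 0.0, 0.0])
q2 = np.array([0.0, 0.6, 0.8])
v = np.array([2.0, 1.0, 3.0])
p = proj_onto_onb(v, [q1, q2])
r = v - p
# residual is orthogonal to each basis vector
print(np.dot(r, q1), np.dot(r, q2))
```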
51.5 Projection Matrix for an Orthonormal Basis
Let Q be the matrix whose columns are the orthonormal vectors
q1,…,qk.
Then
QTQ=Ik.
The projection of v onto Col(Q) is
p=QQTv.
Thus the projection matrix is
P=QQT.
This formula is important because it expresses projection as matrix multiplication.
The matrix P satisfies
P2=P.
Indeed,
P2=(QQT)(QQT)=Q(QTQ)QT=QIQT=QQT=P.
It also satisfies
PT=P.
Therefore P is symmetric and idempotent. A real matrix with these two properties is an orthogonal projection matrix.
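Both properties are easy to verify numerically. In this sketch, Q comes from the QR factorization of a random matrix (the seed and sizes are arbitrary choices):

```python
import numpy as np

rng = np.random.default_rng(0)
Q, _ = np.linalg.qr(rng.standard_normal((5, 2)))  # orthonormal columns
P = Q @ Q.T

print(np.allclose(Q.T @ Q, np.eye(2)))  # QTQ = Ik
print(np.allclose(P @ P, P))            # idempotent
print(np.allclose(P.T, P))              # symmetric
```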
51.6 Idempotence
A projection is idempotent. This means that applying it twice gives the same result as applying it once:
P2=P.
The reason is geometric. Once a vector has been projected onto a subspace, projecting it onto the same subspace again changes nothing.
If
p=Pv
and p∈W, then
Pp=p.
Therefore
P(Pv)=Pv.
In matrix form,
P2v=Pv
for every vector v, so
P2=P.
Idempotence is the algebraic signature of projection. General projections are idempotent linear maps, while orthogonal projections also respect the inner product geometry.
51.7 Symmetry
A real projection matrix P is an orthogonal projection matrix precisely when it is both idempotent and symmetric:
P2=P,PT=P.
The idempotent condition says that P is a projection. The symmetry condition says that the projection is orthogonal rather than oblique.
For complex matrices, symmetry is replaced by self-adjointness:
P2=P,P∗=P.
Here P∗ denotes the conjugate transpose.
Orthogonal projection matrices preserve the perpendicular relationship between the range and the residual. If p=Pv, then
v−p∈Range(P)⊥.
This is the geometric content of the symmetry condition.
51.8 Projection onto a Column Space
Let A be an m×n real matrix with linearly independent columns. We want the projection of b∈Rm onto
Col(A).
The projected vector has the form
p=Ax^
for some x^∈Rn.
The residual is
r=b−Ax^.
For p to be the orthogonal projection, the residual must be orthogonal to every column of A. This condition is
AT(b−Ax^)=0.
Rearranging gives the normal equations:
ATAx^=ATb.
Since the columns of A are linearly independent, ATA is invertible. Thus
x^=(ATA)−1ATb.
Therefore
p=A(ATA)−1ATb.
The projection matrix onto Col(A) is
P=A(ATA)−1AT.
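A sketch of this computation in NumPy (the matrix A and vector b are made-up data). In practice one solves the normal equations rather than forming the inverse explicitly:

```python
import numpy as np

A = np.array([[1.0, 0.0],
              [1.0, 1.0],
              [1.0, 2.0]])      # linearly independent columns
b = np.array([6.0, 0.0, 0.0])

# solve (ATA) x = ATb instead of computing the inverse of ATA
x_hat = np.linalg.solve(A.T @ A, A.T @ b)
p = A @ x_hat                   # projection of b onto Col(A)
r = b - p
print(A.T @ r)                  # ~ [0, 0]: residual orthogonal to Col(A)
```

Using a linear solve (or `np.linalg.lstsq`) avoids the cost and numerical instability of an explicit matrix inverse.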
51.9 Why ATA Is Invertible
Assume the columns of A are linearly independent. Then ATA is invertible.
To see this, suppose
ATAx=0.
Multiply on the left by xT:
xTATAx=0.
But
xTATAx=(Ax)T(Ax)=∥Ax∥2.
Hence
∥Ax∥2=0.
Therefore
Ax=0.
Since the columns of A are linearly independent, the null space of A is trivial. Thus
x=0.
So the null space of ATA is trivial, and ATA is invertible.
51.10 Projection Matrix onto a Column Space
For a full-column-rank matrix A, the projection matrix
P=A(ATA)−1AT
has two key properties.
First,
P2=P.
Indeed,
P2=A(ATA)−1ATA(ATA)−1AT.
Since
(ATA)−1ATA(ATA)−1=(ATA)−1,
we get
P2=A(ATA)−1AT=P.
Second,
PT=P.
This follows because ATA is symmetric, so its inverse is symmetric:
PT=(A(ATA)−1AT)T=A(ATA)−1AT=P.
Thus P is an orthogonal projection matrix.
51.11 Closest Vector Property
Orthogonal projection gives the closest vector in a subspace.
Let W be a finite-dimensional subspace of an inner product space V. Let
p=projW(v),r=v−p.
Then
p∈W,r∈W⊥.
For any other vector w∈W,
v−w=(v−p)+(p−w).
Here
v−p∈W⊥,
and
p−w∈W.
Therefore the two vectors v−p and p−w are orthogonal. By the Pythagorean theorem,
∥v−w∥2=∥v−p∥2+∥p−w∥2.
Since
∥p−w∥2≥0,
we have
∥v−w∥2≥∥v−p∥2.
Thus
∥v−w∥≥∥v−p∥.
So p is the closest vector in W to v. Equality occurs only when w=p. This is the best approximation property of orthogonal projection.
51.12 Distance to a Subspace
The distance from v to a subspace W is
dist(v, W) = inf_{w∈W} ∥v−w∥.
When W is finite-dimensional, this infimum is attained by the orthogonal projection:
dist(v,W)=∥v−projW(v)∥.
If p=projW(v), then
dist(v,W)=∥v−p∥.
The residual is therefore the shortest error vector. It measures exactly how far v lies from the subspace.
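For instance (a small sketch with invented numbers), the distance from a point to the xy-plane in R3 is the norm of the residual, and no other point of the plane does better:

```python
import numpy as np

v = np.array([1.0, 2.0, 5.0])
p = np.array([1.0, 2.0, 0.0])        # projection of v onto the xy-plane
dist = np.linalg.norm(v - p)         # distance from v to the plane: 5.0

# any other point of the plane is at least this far away
w = np.array([4.0, -1.0, 0.0])
print(dist, np.linalg.norm(v - w))
```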
51.13 Least Squares
Orthogonal projection is the geometric core of least squares.
Consider an inconsistent system
Ax=b.
If b ∉ Col(A), there is no exact solution. Instead, we seek x^ such that
Ax^
is as close as possible to b. This means minimizing
∥b−Ax∥2.
The closest vector in Col(A) is the orthogonal projection of b onto Col(A). Therefore
Ax^=projCol(A)(b).
The residual
r=b−Ax^
must be orthogonal to Col(A). Hence
ATr=0.
Substituting r=b−Ax^ gives
AT(b−Ax^)=0,
or
ATAx^=ATb.
These are the normal equations.
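The normal-equations route can be compared against NumPy's built-in least-squares solver (the sample data here is invented); both give the same x^:

```python
import numpy as np

A = np.array([[1.0, 1.0],
              [1.0, 2.0],
              [1.0, 3.0]])
b = np.array([1.0, 2.0, 2.0])

x_normal = np.linalg.solve(A.T @ A, A.T @ b)   # normal equations
x_lstsq, *_ = np.linalg.lstsq(A, b, rcond=None)
print(np.allclose(x_normal, x_lstsq))          # True
```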
51.14 Example: Projection onto a Line in R2
Let
u = [1; 2], v = [3; 1].
The projection of v onto span{u} is
p = (vTu/uTu) u.
Compute
vTu=3⋅1+1⋅2=5,
and
uTu=12+22=5.
Thus
p = (5/5)[1; 2] = [1; 2].
The residual is
r = v − p = [3; 1] − [1; 2] = [2; −1].
Check orthogonality:
rTu=2⋅1+(−1)⋅2=0.
Thus the decomposition is
[3; 1] = [1; 2] + [2; −1],
with the first vector on the line and the second vector perpendicular to the line.
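The arithmetic above is easy to confirm in code:

```python
import numpy as np

u = np.array([1.0, 2.0])
v = np.array([3.0, 1.0])
p = (v @ u) / (u @ u) * u     # (5/5) * u = (1, 2)
r = v - p                     # (2, -1), orthogonal to u
print(p, r, r @ u)
```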
51.15 Example: Projection onto a Plane
Let W⊆R3 be the xy-plane:
W = { [x; y; 0] : x, y ∈ R }.
For
v = [a; b; c],
the projection onto W is
p = [a; b; 0].
The residual is
r = [0; 0; c].
The projection matrix is
P = [1 0 0; 0 1 0; 0 0 0].
Then
Pv = [a; b; 0].
This matrix satisfies
P2=P,PT=P.
So it is an orthogonal projection matrix.
51.16 Example: Projection Using a Matrix
Let
A = [1; 1; 0].
The column space of A is the line in R3 spanned by
u = [1; 1; 0].
The projection matrix is
P=A(ATA)−1AT.
Compute
ATA = [1 1 0][1; 1; 0] = 2.
Thus
P = (1/2)[1; 1; 0][1 1 0] = (1/2)[1 1 0; 1 1 0; 0 0 0].
For
b = [2; 4; 5],
the projection is
p = Pb = (1/2)[1 1 0; 1 1 0; 0 0 0][2; 4; 5] = [3; 3; 0].
The residual is
r = b − p = [−1; 1; 5].
Check orthogonality to the column of A:
ATr = [1 1 0][−1; 1; 5] = 0.
Thus p is the orthogonal projection of b onto Col(A).
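The same computation in NumPy:

```python
import numpy as np

A = np.array([[1.0], [1.0], [0.0]])
P = A @ np.linalg.inv(A.T @ A) @ A.T   # (1/2)[1 1 0; 1 1 0; 0 0 0]
b = np.array([2.0, 4.0, 5.0])
p = P @ b                              # (3, 3, 0)
r = b - p                              # (-1, 1, 5)
print(p, A.T @ r)                      # residual orthogonal to the column
```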
51.17 Orthogonal Projection and Coordinates
If W has an orthonormal basis q1,…,qk, then the projection coefficients are
cj=⟨v,qj⟩.
Thus
projW(v)=c1q1+⋯+ckqk.
These coefficients are the coordinates of the projected vector in the orthonormal basis of W.
The projection discards all components of v in W⊥. If V=W⊕W⊥, and
v=w+z,w∈W,z∈W⊥,
then
projW(v)=w.
Projection is therefore a coordinate-selection operation relative to an orthogonal decomposition.
51.18 Orthogonal Projection and Energy
Let p=projW(v) and r=v−p. Since
p⊥r,
the Pythagorean theorem gives
∥v∥2=∥p∥2+∥r∥2.
Thus projection splits the squared norm into two parts:
∥p∥2: energy captured by the subspace
∥r∥2: energy left outside the subspace
If W has orthonormal basis q1,…,qk, then
∥p∥2 = ∑_{j=1}^{k} |⟨v,qj⟩|2.
Therefore
∥r∥2 = ∥v∥2 − ∑_{j=1}^{k} |⟨v,qj⟩|2.
This form appears in approximation theory, signal processing, Fourier analysis, statistics, and numerical linear algebra.
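The energy split can be verified numerically (random data and seed are arbitrary choices for this sketch):

```python
import numpy as np

rng = np.random.default_rng(1)
Q, _ = np.linalg.qr(rng.standard_normal((6, 3)))  # orthonormal basis of W
v = rng.standard_normal(6)

p = Q @ (Q.T @ v)                  # projection onto W
r = v - p
captured = np.sum((Q.T @ v) ** 2)  # sum of |<v,qj>|^2
print(np.isclose(v @ v, p @ p + r @ r), np.isclose(p @ p, captured))
```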
51.19 Oblique Projections
A projection need not be orthogonal.
A linear map P:V→V is a projection if
P2=P.
This only means that applying P twice is the same as applying it once. It does not require the residual to be orthogonal to the range.
An oblique projection is a projection whose range and null space are complementary but not orthogonal.
For example,
P = [1 0; 1 0]
satisfies
P2=P,
so it is a projection. But
PT ≠ P,
so it is not an orthogonal projection.
Orthogonal projections are usually preferred when distance minimization matters. Oblique projections appear in other settings where the decomposition directions are prescribed by constraints rather than perpendicularity.
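The 2×2 example above can be checked directly:

```python
import numpy as np

P = np.array([[1.0, 0.0],
              [1.0, 0.0]])
print(np.allclose(P @ P, P))   # True: idempotent, hence a projection
print(np.allclose(P.T, P))     # False: not symmetric, hence oblique
```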
51.20 Projection Theorem
In finite-dimensional inner product spaces, every subspace W has an orthogonal projection. For each v∈V, there is a unique vector p∈W such that
v−p∈W⊥.
This vector p is the unique closest vector in W to v.
In Hilbert spaces, the analogous result requires W to be closed. If M is a closed subspace of a Hilbert space H, then every x∈H has a unique best approximation x^∈M, and the error x−x^ lies in M⊥.
This result is called the projection theorem. It is the abstract form of the closest vector property.
51.21 Summary
Orthogonal projection decomposes a vector into a component inside a subspace and a residual perpendicular to that subspace:
v=projW(v)+r,r∈W⊥.
For a line spanned by a nonzero vector u,
proju(v) = (⟨v,u⟩/⟨u,u⟩) u.
For a subspace with orthonormal basis q1,…,qk,
projW(v) = ∑_{j=1}^{k} ⟨v,qj⟩ qj.
For a full-column-rank matrix A, the projection matrix onto Col(A) is
P=A(ATA)−1AT.
Orthogonal projection gives the closest vector in a subspace, produces the residual used in least squares, and gives the geometric meaning of many matrix formulas.