
Chapter 47. Orthogonality

Orthogonality is the inner product version of perpendicularity. In Euclidean geometry, two nonzero vectors are perpendicular when they meet at a right angle. In an inner product space, the same idea is expressed algebraically: two vectors are orthogonal when their inner product is zero. This definition extends the geometry of right angles from the plane and three-dimensional space to arbitrary finite-dimensional and infinite-dimensional vector spaces.

47.1 Orthogonal Vectors

Let $V$ be an inner product space. Two vectors $u, v \in V$ are orthogonal if

$$\langle u,v\rangle = 0.$$

We write

$$u \perp v.$$

In $\mathbb{R}^n$ with the standard dot product, this means

$$u^T v = 0.$$

For example,

$$u = \begin{bmatrix} 1 \\ 2 \end{bmatrix}, \qquad v = \begin{bmatrix} 2 \\ -1 \end{bmatrix}.$$

Then

$$u^T v = 1\cdot 2 + 2\cdot (-1) = 0.$$

Hence $u$ and $v$ are orthogonal.
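
The same check can be done numerically. The following minimal NumPy sketch verifies that the two example vectors above have dot product zero.

```python
import numpy as np

u = np.array([1.0, 2.0])
v = np.array([2.0, -1.0])

# Orthogonality in R^n with the dot product: u^T v should be 0.
print(np.dot(u, v))                   # 0.0
print(np.isclose(np.dot(u, v), 0.0))  # True
```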

Orthogonality is symmetric in real inner product spaces. If $u \perp v$, then $v \perp u$. In complex inner product spaces, conjugate symmetry gives

$$\langle v,u\rangle = \overline{\langle u,v\rangle}.$$

Thus $\langle u,v\rangle = 0$ still implies $\langle v,u\rangle = 0$.

47.2 The Zero Vector

The zero vector is orthogonal to every vector.

Indeed, for any $v \in V$,

$$\langle 0,v\rangle = 0.$$

Thus

$$0 \perp v$$

for every $v$.

This fact is sometimes useful, but it must be interpreted carefully. In geometry, perpendicularity usually refers to nonzero directions. In inner product spaces, orthogonality is defined algebraically, so the zero vector is orthogonal to all vectors.

47.3 Orthogonal Sets

A set of vectors

$$\{v_1,v_2,\ldots,v_k\}$$

is orthogonal if every pair of distinct vectors is orthogonal:

$$\langle v_i,v_j\rangle = 0 \quad \text{whenever} \quad i \ne j.$$

The set is orthonormal if it is orthogonal and each vector has norm one:

$$\langle v_i,v_j\rangle = \begin{cases} 1, & i=j, \\ 0, & i\ne j. \end{cases}$$

This condition is often written using the Kronecker delta:

$$\langle v_i,v_j\rangle = \delta_{ij}.$$

Orthogonal sets separate directions. Orthonormal sets do more: they separate directions and normalize scale.
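
For a concrete check, here is a small NumPy sketch (with an assumed pair of vectors in $\mathbb{R}^2$) that tests the orthonormality condition by forming the matrix of inner products $\langle v_i,v_j\rangle$ and comparing it with the identity.

```python
import numpy as np

# Columns of Q form a candidate orthonormal set in R^2.
Q = np.array([[1.0,  1.0],
              [1.0, -1.0]]) / np.sqrt(2)

# The set is orthonormal exactly when the matrix of inner products
# <v_i, v_j> equals the identity (the Kronecker delta condition).
gram = Q.T @ Q
print(np.allclose(gram, np.eye(2)))  # True
```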

47.4 Orthogonal Sets Are Linearly Independent

A nonzero orthogonal set is linearly independent.

Let

$$\{v_1,v_2,\ldots,v_k\}$$

be an orthogonal set, and suppose each $v_i$ is nonzero. Assume

$$c_1v_1+c_2v_2+\cdots+c_kv_k=0.$$

Take the inner product with $v_j$. Then

$$\left\langle c_1v_1+c_2v_2+\cdots+c_kv_k, v_j \right\rangle = \langle 0,v_j\rangle.$$

By linearity,

$$c_1\langle v_1,v_j\rangle + c_2\langle v_2,v_j\rangle + \cdots + c_k\langle v_k,v_j\rangle = 0.$$

All terms vanish except the $j$-th term. Hence

$$c_j\langle v_j,v_j\rangle = 0.$$

Since $v_j \ne 0$,

$$\langle v_j,v_j\rangle > 0.$$

Therefore

$$c_j = 0.$$

This holds for every $j$, so all coefficients are zero. The set is linearly independent.

47.5 Orthogonal Bases

An orthogonal basis is a basis whose vectors are pairwise orthogonal. An orthonormal basis is a basis whose vectors are pairwise orthogonal and have norm one.

If

$$B=(v_1,\ldots,v_n)$$

is an orthogonal basis for $V$, then every vector $x \in V$ has a unique expansion

$$x = c_1v_1+\cdots+c_nv_n.$$

The coefficients are easy to compute. Taking the inner product with $v_j$ gives

$$\langle x,v_j\rangle = c_j\langle v_j,v_j\rangle.$$

Thus

$$c_j = \frac{\langle x,v_j\rangle}{\langle v_j,v_j\rangle}.$$

Therefore

$$x = \sum_{j=1}^n \frac{\langle x,v_j\rangle}{\langle v_j,v_j\rangle} v_j.$$

If the basis is orthonormal, then $\langle v_j,v_j\rangle=1$, so the formula becomes

$$x = \sum_{j=1}^n \langle x,v_j\rangle v_j.$$

This is one of the main advantages of orthonormal bases. Coordinates are obtained directly by inner products.
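
The following NumPy sketch illustrates the orthonormal coordinate formula, using an assumed orthonormal basis of $\mathbb{R}^2$: the coefficients are plain inner products, and the expansion reconstructs the original vector.

```python
import numpy as np

# An assumed orthonormal basis of R^2, and a vector x to expand.
v1 = np.array([1.0, 1.0]) / np.sqrt(2)
v2 = np.array([1.0, -1.0]) / np.sqrt(2)
x = np.array([3.0, 5.0])

# Coordinates are just inner products c_j = <x, v_j>.
c1, c2 = x @ v1, x @ v2

# The expansion c1*v1 + c2*v2 recovers x.
print(np.allclose(c1 * v1 + c2 * v2, x))  # True
```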

47.6 Orthogonal Subspaces

Let $U$ and $W$ be subspaces of an inner product space $V$. The subspaces $U$ and $W$ are orthogonal if every vector in $U$ is orthogonal to every vector in $W$:

$$\langle u,w\rangle = 0 \quad \text{for all } u\in U,\ w\in W.$$

We write

$$U \perp W.$$

For example, in $\mathbb{R}^3$, the $x$-axis and the $yz$-plane are orthogonal. Every vector on the $x$-axis has the form

$$(a,0,0),$$

and every vector in the $yz$-plane has the form

$$(0,b,c).$$

Their dot product is

$$(a,0,0)\cdot(0,b,c)=0.$$

Thus the two subspaces are orthogonal.

47.7 Orthogonal Complement

Let $S$ be a subset of an inner product space $V$. The orthogonal complement of $S$, denoted $S^\perp$, is the set of all vectors in $V$ that are orthogonal to every vector in $S$:

$$S^\perp = \{x\in V : \langle x,s\rangle=0 \text{ for all } s\in S\}.$$

If $S$ is a subspace, then $S^\perp$ consists of all vectors perpendicular to the entire subspace.

For example, in $\mathbb{R}^3$, let

$$S = \operatorname{span}\left\{ \begin{bmatrix} 1 \\ 0 \\ 0 \end{bmatrix} \right\}.$$

Then $S$ is the $x$-axis, and

$$S^\perp = \left\{ \begin{bmatrix} 0 \\ y \\ z \end{bmatrix} : y,z\in\mathbb{R} \right\}.$$

Thus $S^\perp$ is the $yz$-plane.

47.8 The Orthogonal Complement Is a Subspace

For any subset $S\subseteq V$, the orthogonal complement $S^\perp$ is a subspace of $V$.

First, $0\in S^\perp$, since

$$\langle 0,s\rangle = 0$$

for every $s\in S$.

Now let $x,y\in S^\perp$, and let $a,b$ be scalars. For every $s\in S$,

$$\langle ax+by,s\rangle = a\langle x,s\rangle + b\langle y,s\rangle = a\cdot 0+b\cdot 0 = 0.$$

Therefore

$$ax+by\in S^\perp.$$

So $S^\perp$ is closed under linear combinations, and hence it is a subspace.

This is important: even when $S$ is an arbitrary set with no linear structure of its own, its orthogonal complement is automatically a subspace.

47.9 Orthogonal Complement of a Span

The orthogonal complement of a set is the same as the orthogonal complement of its span:

$$S^\perp = \operatorname{span}(S)^\perp.$$

Indeed, if a vector is orthogonal to every vector in $S$, then it is orthogonal to every linear combination of vectors in $S$. Conversely, since $S\subseteq \operatorname{span}(S)$, any vector orthogonal to the span is orthogonal to $S$.

This fact allows one to compute orthogonal complements using a spanning set rather than every vector in a subspace.

For example, if

$$W=\operatorname{span}\{w_1,w_2,w_3\},$$

then

$$x\in W^\perp$$

if and only if

$$\langle x,w_1\rangle = \langle x,w_2\rangle = \langle x,w_3\rangle = 0.$$

Thus finding $W^\perp$ becomes a system of homogeneous linear equations.
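
As a sketch of one way this computation might be done numerically (the matrix below is an assumed example, and the null space is extracted from the SVD), put the spanning vectors into the rows of a matrix $A$; then $W^\perp$ is the solution set of $Ax=0$.

```python
import numpy as np

# Rows of A span W; W^perp is the null space of A (solutions of A x = 0).
A = np.array([[1.0, 0.0, 0.0, 2.0],
              [0.0, 1.0, 0.0, 3.0]])

# Null space via the SVD: right singular vectors for (numerically) zero singular values.
_, s, Vt = np.linalg.svd(A)
rank = np.sum(s > 1e-12)
W_perp_basis = Vt[rank:]          # rows form an orthonormal basis of W^perp

# Each basis vector of W^perp is orthogonal to every row of A.
print(np.allclose(A @ W_perp_basis.T, 0.0))  # True
```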

47.10 Dimension Formula

If $W$ is a subspace of a finite-dimensional inner product space $V$, then

$$\dim W + \dim W^\perp = \dim V.$$

Also,

$$W \cap W^\perp = \{0\}.$$

The intersection is trivial because if $x\in W\cap W^\perp$, then $x\in W$ and $x$ is orthogonal to every vector in $W$. In particular, $x$ is orthogonal to itself:

$$\langle x,x\rangle = 0.$$

Positive definiteness gives

$$x=0.$$

In finite dimensions, these facts imply that every vector in $V$ can be written uniquely as a sum of a vector in $W$ and a vector in $W^\perp$:

$$V = W \oplus W^\perp.$$

47.11 Orthogonal Decomposition

The equation

$$V = W \oplus W^\perp$$

means that each vector $x\in V$ has a unique decomposition

$$x = w + z,$$

where

$$w\in W, \qquad z\in W^\perp.$$

The vector $w$ is the component of $x$ inside $W$. The vector $z$ is the component of $x$ perpendicular to $W$.

This decomposition is central to projection and approximation. It separates a vector into an explained part and a residual part.

In finite-dimensional Euclidean space, this is the familiar operation of dropping a perpendicular from a point to a line, plane, or higher-dimensional subspace.

47.12 Projection onto a One-Dimensional Subspace

Let $v\ne 0$, and let

$$W=\operatorname{span}\{v\}.$$

The orthogonal projection of $x$ onto $W$ is

$$\operatorname{proj}_W(x) = \frac{\langle x,v\rangle}{\langle v,v\rangle}v.$$

The residual vector is

$$r = x-\operatorname{proj}_W(x).$$

This residual is orthogonal to $v$. Indeed,

$$\left\langle x-\frac{\langle x,v\rangle}{\langle v,v\rangle}v, v \right\rangle = \langle x,v\rangle - \frac{\langle x,v\rangle}{\langle v,v\rangle} \langle v,v\rangle = 0.$$

Thus

$$x = \operatorname{proj}_W(x) + r$$

is an orthogonal decomposition.
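
A minimal NumPy sketch of this formula, with an assumed vector and line in $\mathbb{R}^2$, confirms that the residual is orthogonal to $v$ and that the two pieces add back to $x$.

```python
import numpy as np

def proj_onto_line(x, v):
    """Orthogonal projection of x onto span{v}: (<x,v>/<v,v>) v."""
    return (np.dot(x, v) / np.dot(v, v)) * v

x = np.array([3.0, 4.0])
v = np.array([1.0, 1.0])

p = proj_onto_line(x, v)
r = x - p

# The residual is orthogonal to v, and x = p + r.
print(np.isclose(np.dot(r, v), 0.0))  # True
print(np.allclose(p + r, x))          # True
```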

47.13 Projection onto an Orthogonal Basis

Suppose

$$W=\operatorname{span}\{v_1,\ldots,v_k\},$$

where $v_1,\ldots,v_k$ are nonzero orthogonal vectors. Then the projection of $x$ onto $W$ is

$$\operatorname{proj}_W(x) = \sum_{j=1}^k \frac{\langle x,v_j\rangle}{\langle v_j,v_j\rangle}v_j.$$

If the vectors are orthonormal, this simplifies to

$$\operatorname{proj}_W(x) = \sum_{j=1}^k \langle x,v_j\rangle v_j.$$

The residual

$$x-\operatorname{proj}_W(x)$$

lies in $W^\perp$.

This formula is computationally simple because each basis direction can be handled independently.

47.14 Projection Matrices

A projection matrix is a square matrix $P$ satisfying

$$P^2=P.$$

This equation says that applying the projection twice has the same effect as applying it once. A real projection matrix is an orthogonal projection matrix when, in addition,

$$P^T=P.$$

Equivalently, an orthogonal projection matrix satisfies

$$P^2=P \quad \text{and} \quad P^T=P.$$

For complex matrices, the transpose is replaced by the conjugate transpose:

$$P^2=P \quad \text{and} \quad P^*=P.$$

Projection matrices formalize the geometric idea of projecting vectors onto subspaces. Orthogonal projection matrices are self-adjoint idempotent operators.
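
As a quick sketch, the rank-one matrix $P = vv^T/(v^Tv)$, which projects onto the line spanned by $v$ (an assumed vector below), satisfies both defining properties.

```python
import numpy as np

# Orthogonal projection matrix onto the line spanned by v: P = v v^T / (v^T v).
v = np.array([[1.0], [2.0], [2.0]])   # column vector
P = (v @ v.T) / (v.T @ v)

# Both defining properties hold: idempotent and symmetric.
print(np.allclose(P @ P, P))   # True
print(np.allclose(P.T, P))     # True
```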

47.15 Projection onto a Column Space

Let $A$ be an $m\times n$ matrix with linearly independent columns. The column space of $A$ is

$$\operatorname{Col}(A) = \{Ax : x\in\mathbb{R}^n\}.$$

The orthogonal projection of $b\in\mathbb{R}^m$ onto $\operatorname{Col}(A)$ has the form

$$p = A\hat{x}.$$

The residual

$$r=b-A\hat{x}$$

must be orthogonal to every column of $A$. This condition is

$$A^T(b-A\hat{x})=0.$$

Therefore

$$A^T A\hat{x}=A^T b.$$

These are the normal equations.

If $A^TA$ is invertible, then

$$\hat{x}=(A^TA)^{-1}A^T b.$$

Thus the projection is

$$p=A(A^TA)^{-1}A^T b.$$

The projection matrix onto $\operatorname{Col}(A)$ is

$$P=A(A^TA)^{-1}A^T.$$

This formula is fundamental in least squares.
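
The sketch below works through these formulas on an assumed small example: it solves the normal equations, forms the projection matrix, and checks that the residual is orthogonal to the column space.

```python
import numpy as np

# A has linearly independent columns; project b onto Col(A).
A = np.array([[1.0, 0.0],
              [1.0, 1.0],
              [1.0, 2.0]])
b = np.array([1.0, 2.0, 2.0])

# Normal equations: A^T A x_hat = A^T b (solve rather than form the inverse explicitly).
x_hat = np.linalg.solve(A.T @ A, A.T @ b)
p = A @ x_hat

# Projection matrix P = A (A^T A)^{-1} A^T, and the residual is orthogonal to Col(A).
P = A @ np.linalg.inv(A.T @ A) @ A.T
print(np.allclose(P @ b, p))            # True
print(np.allclose(A.T @ (b - p), 0.0))  # True
```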

47.16 Orthogonality and Least Squares

A system

$$Ax=b$$

may have no exact solution when $b$ is not in the column space of $A$. In that case, one seeks an approximate solution $\hat{x}$ such that

$$A\hat{x}$$

is as close as possible to $b$.

The error is

$$r=b-A\hat{x}.$$

The least squares principle chooses $\hat{x}$ so that

$$\|b-A\hat{x}\|_2$$

is minimized.

The geometric condition for the minimum is

$$r \perp \operatorname{Col}(A).$$

That is,

$$A^T r = 0.$$

This again gives

$$A^T A\hat{x}=A^T b.$$

Thus least squares is an orthogonality problem. The best approximation is obtained when the residual is perpendicular to the approximation space.
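
A short NumPy sketch, on an assumed overdetermined system, shows the same orthogonality condition at work: the least squares residual satisfies $A^T r = 0$.

```python
import numpy as np

# An overdetermined system Ax = b with no exact solution.
A = np.array([[1.0, 1.0],
              [1.0, 2.0],
              [1.0, 3.0]])
b = np.array([1.0, 2.0, 2.0])

# Least squares solution minimizes ||b - A x||_2.
x_hat, *_ = np.linalg.lstsq(A, b, rcond=None)

# Optimality condition: the residual is orthogonal to Col(A), i.e. A^T r = 0.
r = b - A @ x_hat
print(np.allclose(A.T @ r, 0.0))  # True
```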

47.17 Orthogonal Decomposition of Row, Column, and Null Spaces

For a real $m\times n$ matrix $A$, the four fundamental subspaces are related by orthogonality.

The null space of $A$ is orthogonal to the row space of $A$:

$$\operatorname{Null}(A) = \operatorname{Row}(A)^\perp.$$

Indeed, $Ax=0$ means every row of $A$ has dot product zero with $x$.

Similarly, the null space of $A^T$, also called the left null space of $A$, is orthogonal to the column space of $A$:

$$\operatorname{Null}(A^T) = \operatorname{Col}(A)^\perp.$$

These relationships are a central part of the fundamental theorem of linear algebra.

They explain how solutions, constraints, residuals, and images fit together geometrically.
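
The sketch below checks both orthogonality relations on an assumed rank-deficient matrix, computing each null space from the SVD as in the earlier complement example.

```python
import numpy as np

A = np.array([[1.0, 2.0, 3.0],
              [2.0, 4.0, 6.0],
              [1.0, 0.0, 1.0]])

# Null(A) from the SVD: right singular vectors for (numerically) zero singular values.
_, s, Vt = np.linalg.svd(A)
rank = np.sum(s > 1e-12)
null_A = Vt[rank:].T          # columns span Null(A)

# Every null space vector has dot product zero with every row: Null(A) = Row(A)^perp.
print(np.allclose(A @ null_A, 0.0))  # True

# Same statement for the left null space and the column space.
_, s2, Vt2 = np.linalg.svd(A.T)
rank2 = np.sum(s2 > 1e-12)
null_AT = Vt2[rank2:].T
print(np.allclose(A.T @ null_AT, 0.0))  # True
```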

47.18 Pythagorean Theorem

If $u\perp v$, then

$$\|u+v\|^2 = \|u\|^2+\|v\|^2.$$

Proof:

$$\|u+v\|^2 = \langle u+v,u+v\rangle.$$

By linearity and symmetry,

$$\langle u+v,u+v\rangle = \langle u,u\rangle + 2\langle u,v\rangle + \langle v,v\rangle.$$

Since $u\perp v$,

$$\langle u,v\rangle=0.$$

Therefore

$$\|u+v\|^2 = \|u\|^2+\|v\|^2.$$

More generally, if $v_1,\ldots,v_k$ are pairwise orthogonal, then

$$\left\| \sum_{j=1}^k v_j \right\|^2 = \sum_{j=1}^k \|v_j\|^2.$$
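
A tiny numerical check of the identity, with an assumed orthogonal pair (the classic $3$-$4$-$5$ triangle):

```python
import numpy as np

u = np.array([3.0, 0.0])
v = np.array([0.0, 4.0])

# u and v are orthogonal, so ||u + v||^2 = ||u||^2 + ||v||^2 (here 25 = 9 + 16).
lhs = np.linalg.norm(u + v) ** 2
rhs = np.linalg.norm(u) ** 2 + np.linalg.norm(v) ** 2
print(np.isclose(lhs, rhs))  # True
```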

47.19 Orthogonality in Function Spaces

Orthogonality is not limited to coordinate vectors.

Let $V$ be a space of real-valued functions on $[a,b]$ with inner product

$$\langle f,g\rangle = \int_a^b f(x)g(x)\,dx.$$

Then $f$ and $g$ are orthogonal if

$$\int_a^b f(x)g(x)\,dx=0.$$

For example, on $[-\pi,\pi]$,

$$\sin x \perp \cos x$$

because

$$\int_{-\pi}^{\pi} \sin x\cos x\,dx = 0.$$

Orthogonal functions are central in Fourier series, approximation theory, differential equations, and signal processing.

In this setting, projection becomes approximation by functions. A function can be projected onto a subspace spanned by simpler functions, such as polynomials or trigonometric functions.
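
The function-space inner product can also be checked numerically. The sketch below approximates the integral with a simple Riemann sum on a fine grid (an assumed discretization, not an exact evaluation) and confirms that $\langle \sin, \cos\rangle$ is essentially zero on $[-\pi,\pi]$.

```python
import numpy as np

# Approximate the L^2 inner product on [-pi, pi] with a Riemann sum on a fine grid.
x = np.linspace(-np.pi, np.pi, 200001)
dx = x[1] - x[0]

def inner(f, g):
    return np.sum(f(x) * g(x)) * dx

# <sin, cos> is (numerically) zero, so sin and cos are orthogonal in this space.
print(abs(inner(np.sin, np.cos)) < 1e-8)  # True
```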

47.20 Summary

Orthogonality generalizes perpendicularity to inner product spaces. Two vectors are orthogonal when their inner product is zero. Orthogonal sets of nonzero vectors are linearly independent. Orthogonal and orthonormal bases give simple coordinate formulas.

The orthogonal complement $S^\perp$ is the set of all vectors orthogonal to a set $S$. It is always a subspace, and in finite-dimensional inner product spaces a subspace $W$ satisfies

$$V = W \oplus W^\perp.$$

This decomposition leads directly to projection. Projection gives the closest vector in a subspace, and the residual is orthogonal to that subspace. This principle underlies least squares, approximation, Fourier methods, and much of numerical linear algebra.