# Chapter 47. Orthogonality

Orthogonality is the inner product version of perpendicularity. In Euclidean geometry, two nonzero vectors are perpendicular when they meet at a right angle. In an inner product space, the same idea is expressed algebraically: two vectors are orthogonal when their inner product is zero. This definition extends the geometry of right angles from the plane and three-dimensional space to arbitrary finite-dimensional and infinite-dimensional vector spaces.

## 47.1 Orthogonal Vectors

Let \(V\) be an inner product space. Two vectors \(u,v \in V\) are orthogonal if

$$
\langle u,v\rangle = 0.
$$

We write

$$
u \perp v.
$$

In \(\mathbb{R}^n\) with the standard dot product, this means

$$
u^T v = 0.
$$

For example,

$$
u =
\begin{bmatrix}
1 \\
2
\end{bmatrix},
\qquad
v =
\begin{bmatrix}
2 \\
-1
\end{bmatrix}.
$$

Then

$$
u^T v = 1\cdot 2 + 2\cdot (-1) = 0.
$$

Hence \(u\) and \(v\) are orthogonal.
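
As a quick numerical check, here is a minimal sketch (assuming NumPy is available; the library is not part of the text's development):

```python
import numpy as np

u = np.array([1.0, 2.0])
v = np.array([2.0, -1.0])

# Standard dot product u^T v; a value of zero means u and v are orthogonal.
print(np.dot(u, v))  # 0.0
```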

Orthogonality is symmetric in real inner product spaces. If \(u \perp v\), then \(v \perp u\). In complex inner product spaces, conjugate symmetry gives

$$
\langle v,u\rangle = \overline{\langle u,v\rangle}.
$$

Thus \(\langle u,v\rangle = 0\) still implies \(\langle v,u\rangle = 0\).

## 47.2 The Zero Vector

The zero vector is orthogonal to every vector.

Indeed, for any \(v \in V\),

$$
\langle 0,v\rangle = 0.
$$

Thus

$$
0 \perp v
$$

for every \(v\).

This fact is sometimes useful, but it must be interpreted carefully. In geometry, perpendicularity usually refers to nonzero directions. In inner product spaces, orthogonality is defined algebraically, so the zero vector is orthogonal to all vectors.

## 47.3 Orthogonal Sets

A set of vectors

$$
\{v_1,v_2,\ldots,v_k\}
$$

is orthogonal if every pair of distinct vectors is orthogonal:

$$
\langle v_i,v_j\rangle = 0
\quad
\text{whenever}
\quad
i \ne j.
$$

The set is orthonormal if it is orthogonal and each vector has norm one:

$$
\langle v_i,v_j\rangle =
\begin{cases}
1, & i=j, \\
0, & i\ne j.
\end{cases}
$$

This condition is often written using the Kronecker delta:

$$
\langle v_i,v_j\rangle = \delta_{ij}.
$$

Orthogonal sets separate directions. Orthonormal sets do more: they separate directions and normalize scale.
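
When orthonormal vectors are stored as the columns of a matrix \(Q\), the condition \(\langle v_i,v_j\rangle=\delta_{ij}\) is exactly \(Q^TQ=I\). A minimal sketch of this check, assuming NumPy:

```python
import numpy as np

# The columns of Q form an orthonormal set in R^3.
Q = np.array([[1.0, 0.0],
              [0.0, 1.0],
              [0.0, 0.0]])

# Entry (i, j) of Q^T Q is <v_i, v_j>, so the product should be the identity.
print(Q.T @ Q)
```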

## 47.4 Orthogonal Sets Are Linearly Independent

An orthogonal set of nonzero vectors is linearly independent.

Let

$$
\{v_1,v_2,\ldots,v_k\}
$$

be an orthogonal set, and suppose each \(v_i\) is nonzero. Assume

$$
c_1v_1+c_2v_2+\cdots+c_kv_k=0.
$$

Take the inner product with \(v_j\). Then

$$
\left\langle
c_1v_1+c_2v_2+\cdots+c_kv_k,
v_j
\right\rangle =
\langle 0,v_j\rangle.
$$

By linearity,

$$
c_1\langle v_1,v_j\rangle
+
c_2\langle v_2,v_j\rangle
+
\cdots
+
c_k\langle v_k,v_j\rangle =
0.
$$

All terms vanish except the \(j\)-th term. Hence

$$
c_j\langle v_j,v_j\rangle = 0.
$$

Since \(v_j \ne 0\),

$$
\langle v_j,v_j\rangle > 0.
$$

Therefore

$$
c_j = 0.
$$

This holds for every \(j\), so all coefficients are zero. The set is linearly independent.

## 47.5 Orthogonal Bases

An orthogonal basis is a basis whose vectors are pairwise orthogonal. An orthonormal basis is a basis whose vectors are pairwise orthogonal and have norm one.

If

$$
B=(v_1,\ldots,v_n)
$$

is an orthogonal basis for \(V\), then every vector \(x \in V\) has a unique expansion

$$
x = c_1v_1+\cdots+c_nv_n.
$$

The coefficients are easy to compute. Taking the inner product with \(v_j\) gives

$$
\langle x,v_j\rangle =
c_j\langle v_j,v_j\rangle.
$$

Thus

$$
c_j =
\frac{\langle x,v_j\rangle}{\langle v_j,v_j\rangle}.
$$

Therefore

$$
x =
\sum_{j=1}^n
\frac{\langle x,v_j\rangle}{\langle v_j,v_j\rangle}
v_j.
$$

If the basis is orthonormal, then \(\langle v_j,v_j\rangle=1\), so the formula becomes

$$
x =
\sum_{j=1}^n
\langle x,v_j\rangle v_j.
$$

This is one of the main advantages of orthonormal bases. Coordinates are obtained directly by inner products.
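
The following sketch (assuming NumPy) recovers coordinates in an orthogonal, non-normalized basis of \(\mathbb{R}^2\) using the expansion formula above:

```python
import numpy as np

# An orthogonal (but not orthonormal) basis of R^2.
v1 = np.array([1.0, 2.0])
v2 = np.array([2.0, -1.0])
x = np.array([3.0, 4.0])

# c_j = <x, v_j> / <v_j, v_j>
c1 = np.dot(x, v1) / np.dot(v1, v1)
c2 = np.dot(x, v2) / np.dot(v2, v2)

# Reassembling the expansion reproduces x.
print(c1 * v1 + c2 * v2)  # [3. 4.]
```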

## 47.6 Orthogonal Subspaces

Let \(U\) and \(W\) be subspaces of an inner product space \(V\). The subspaces \(U\) and \(W\) are orthogonal if every vector in \(U\) is orthogonal to every vector in \(W\):

$$
\langle u,w\rangle = 0
\quad
\text{for all } u\in U,\ w\in W.
$$

We write

$$
U \perp W.
$$

For example, in \(\mathbb{R}^3\), the \(x\)-axis and the \(yz\)-plane are orthogonal. Every vector on the \(x\)-axis has the form

$$
(a,0,0),
$$

and every vector in the \(yz\)-plane has the form

$$
(0,b,c).
$$

Their dot product is

$$
(a,0,0)\cdot(0,b,c)=0.
$$

Thus the two subspaces are orthogonal.

## 47.7 Orthogonal Complement

Let \(S\) be a subset of an inner product space \(V\). The orthogonal complement of \(S\), denoted \(S^\perp\), is the set of all vectors in \(V\) that are orthogonal to every vector in \(S\):

$$
S^\perp =
\{x\in V : \langle x,s\rangle=0 \text{ for all } s\in S\}.
$$

If \(S\) is a subspace, then \(S^\perp\) consists of all vectors perpendicular to the entire subspace.

For example, in \(\mathbb{R}^3\), let

$$
S = \operatorname{span}\left\{
\begin{bmatrix}
1 \\
0 \\
0
\end{bmatrix}
\right\}.
$$

Then \(S\) is the \(x\)-axis, and

$$
S^\perp =
\left\{
\begin{bmatrix}
0 \\
y \\
z
\end{bmatrix}
: y,z\in\mathbb{R}
\right\}.
$$

Thus \(S^\perp\) is the \(yz\)-plane.

## 47.8 The Orthogonal Complement Is a Subspace

For any subset \(S\subseteq V\), the orthogonal complement \(S^\perp\) is a subspace of \(V\).

First, \(0\in S^\perp\), since

$$
\langle 0,s\rangle = 0
$$

for every \(s\in S\).

Now let \(x,y\in S^\perp\), and let \(a,b\) be scalars. For every \(s\in S\),

$$
\langle ax+by,s\rangle =
a\langle x,s\rangle
+
b\langle y,s\rangle =
a\cdot 0+b\cdot 0 =
0.
$$

Therefore

$$
ax+by\in S^\perp.
$$

So \(S^\perp\) is closed under linear combinations, and hence it is a subspace.

This is important because the orthogonal complement of even an arbitrary set, not just of a subspace, is automatically a subspace.

## 47.9 Orthogonal Complement of a Span

The orthogonal complement of a set is the same as the orthogonal complement of its span:

$$
S^\perp = \operatorname{span}(S)^\perp.
$$

Indeed, if a vector is orthogonal to every vector in \(S\), then it is orthogonal to every linear combination of vectors in \(S\). Conversely, since \(S\subseteq \operatorname{span}(S)\), any vector orthogonal to the span is orthogonal to \(S\).

This fact allows one to compute orthogonal complements using a spanning set rather than every vector in a subspace.

For example, if

$$
W=\operatorname{span}\{w_1,w_2,w_3\},
$$

then

$$
x\in W^\perp
$$

if and only if

$$
\langle x,w_1\rangle =
\langle x,w_2\rangle =
\langle x,w_3\rangle =
0.
$$

Thus finding \(W^\perp\) becomes a system of homogeneous linear equations.
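
Concretely, stacking a spanning set of \(W\) as the rows of a matrix turns this system into a null space computation. A minimal sketch, assuming NumPy and SciPy are available:

```python
import numpy as np
from scipy.linalg import null_space

# The rows span W, a plane in R^3; W^perp is the null space of this matrix.
W_rows = np.array([[1.0, 0.0, 0.0],
                   [0.0, 1.0, 0.0]])

# The columns of the result form an orthonormal basis of W^perp.
print(null_space(W_rows))  # approximately [[0.], [0.], [1.]]
```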

## 47.10 Dimension Formula

If \(W\) is a subspace of a finite-dimensional inner product space \(V\), then

$$
\dim W + \dim W^\perp = \dim V.
$$

Also,

$$
W \cap W^\perp = \{0\}.
$$

The intersection is trivial because if \(x\in W\cap W^\perp\), then \(x\in W\) and \(x\) is orthogonal to every vector in \(W\). In particular, \(x\) is orthogonal to itself:

$$
\langle x,x\rangle = 0.
$$

Positive definiteness gives

$$
x=0.
$$

In finite dimensions, these facts imply that every vector in \(V\) can be written uniquely as a sum of a vector in \(W\) and a vector in \(W^\perp\):

$$
V = W \oplus W^\perp.
$$
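
A numerical sanity check of the dimension formula, again as a sketch assuming NumPy and SciPy:

```python
import numpy as np
from scipy.linalg import null_space

# W is spanned by the rows of B, a subspace of R^4.
B = np.array([[1.0, 2.0, 0.0, 1.0],
              [0.0, 1.0, 1.0, 0.0]])

dim_W = np.linalg.matrix_rank(B)
dim_W_perp = null_space(B).shape[1]

# dim W + dim W^perp should equal dim V = 4.
print(dim_W, dim_W_perp, dim_W + dim_W_perp)  # 2 2 4
```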

## 47.11 Orthogonal Decomposition

The equation

$$
V = W \oplus W^\perp
$$

means that each vector \(x\in V\) has a unique decomposition

$$
x = w + z,
$$

where

$$
w\in W,
\qquad
z\in W^\perp.
$$

The vector \(w\) is the component of \(x\) inside \(W\). The vector \(z\) is the component of \(x\) perpendicular to \(W\).

This decomposition is central to projection and approximation. It separates a vector into an explained part and a residual part.

In finite-dimensional Euclidean space, this is the familiar operation of dropping a perpendicular from a point to a line, plane, or higher-dimensional subspace.

## 47.12 Projection onto a One-Dimensional Subspace

Let \(v\ne 0\), and let

$$
W=\operatorname{span}\{v\}.
$$

The orthogonal projection of \(x\) onto \(W\) is

$$
\operatorname{proj}_W(x) =
\frac{\langle x,v\rangle}{\langle v,v\rangle}v.
$$

The residual vector is

$$
r =
x-\operatorname{proj}_W(x).
$$

This residual is orthogonal to \(v\). Indeed,

$$
\left\langle
x-\frac{\langle x,v\rangle}{\langle v,v\rangle}v,
v
\right\rangle =
\langle x,v\rangle -
\frac{\langle x,v\rangle}{\langle v,v\rangle}
\langle v,v\rangle =
0.
$$

Thus

$$
x =
\operatorname{proj}_W(x) + r
$$

is an orthogonal decomposition.
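
A short sketch (assuming NumPy) that computes the projection onto a line and verifies the residual is orthogonal to \(v\):

```python
import numpy as np

v = np.array([1.0, 2.0])
x = np.array([3.0, 1.0])

# proj_W(x) = (<x, v> / <v, v>) v
proj = (np.dot(x, v) / np.dot(v, v)) * v
r = x - proj

# The residual should have zero inner product with v (up to rounding).
print(proj, np.dot(r, v))  # [1. 2.] 0.0
```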

## 47.13 Projection onto an Orthogonal Basis

Suppose

$$
W=\operatorname{span}\{v_1,\ldots,v_k\},
$$

where \(v_1,\ldots,v_k\) are nonzero orthogonal vectors. Then the projection of \(x\) onto \(W\) is

$$
\operatorname{proj}_W(x) =
\sum_{j=1}^k
\frac{\langle x,v_j\rangle}{\langle v_j,v_j\rangle}v_j.
$$

If the vectors are orthonormal, this simplifies to

$$
\operatorname{proj}_W(x) =
\sum_{j=1}^k
\langle x,v_j\rangle v_j.
$$

The residual

$$
x-\operatorname{proj}_W(x)
$$

lies in \(W^\perp\).

This formula is computationally simple because each basis direction can be handled independently.
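
Because the directions are handled independently, the projection is simply a sum of one-dimensional projections. A sketch assuming NumPy:

```python
import numpy as np

# Nonzero orthogonal vectors spanning a plane W in R^3.
v1 = np.array([1.0, 1.0, 0.0])
v2 = np.array([1.0, -1.0, 0.0])
x = np.array([2.0, 3.0, 5.0])

# Sum the one-dimensional projections onto each basis direction.
proj = sum((np.dot(x, v) / np.dot(v, v)) * v for v in (v1, v2))
r = x - proj

# The residual lies in W^perp: it is orthogonal to both v1 and v2.
print(proj, np.dot(r, v1), np.dot(r, v2))  # [2. 3. 0.] 0.0 0.0
```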

## 47.14 Projection Matrices

A projection matrix is a square matrix \(P\) satisfying

$$
P^2=P.
$$

This equation says that applying the projection twice has the same effect as applying it once. A real projection matrix is an orthogonal projection matrix when, in addition,

$$
P^T=P.
$$

Putting the two conditions together, an orthogonal projection matrix satisfies

$$
P^2=P
\quad
\text{and}
\quad
P^T=P.
$$

For complex matrices, the transpose is replaced by the conjugate transpose:

$$
P^2=P
\quad
\text{and}
\quad
P^*=P.
$$

Projection matrices formalize the geometric idea of projecting vectors onto subspaces. Orthogonal projection matrices are self-adjoint idempotent operators.
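
As an illustration (a sketch assuming NumPy), the matrix that projects \(\mathbb{R}^3\) onto the \(xy\)-plane satisfies both conditions:

```python
import numpy as np

# Orthogonal projection of R^3 onto the xy-plane.
P = np.diag([1.0, 1.0, 0.0])

# Idempotent (P^2 = P) and symmetric (P^T = P).
print(np.allclose(P @ P, P), np.allclose(P.T, P))  # True True
```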

## 47.15 Projection onto a Column Space

Let \(A\) be an \(m\times n\) matrix with linearly independent columns. The column space of \(A\) is

$$
\operatorname{Col}(A) =
\{Ax : x\in\mathbb{R}^n\}.
$$

The orthogonal projection of \(b\in\mathbb{R}^m\) onto \(\operatorname{Col}(A)\) has the form

$$
p = A\hat{x}.
$$

The residual

$$
r=b-A\hat{x}
$$

must be orthogonal to every column of \(A\). This condition is

$$
A^T(b-A\hat{x})=0.
$$

Therefore

$$
A^T A\hat{x}=A^T b.
$$

These are the normal equations.

If \(A^TA\) is invertible, then

$$
\hat{x}=(A^TA)^{-1}A^T b.
$$

Thus the projection is

$$
p=A(A^TA)^{-1}A^T b.
$$

The projection matrix onto \(\operatorname{Col}(A)\) is

$$
P=A(A^TA)^{-1}A^T.
$$

This formula is fundamental in least squares.
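
A minimal sketch (assuming NumPy) that solves the normal equations, forms the projection matrix, and checks the orthogonality of the residual:

```python
import numpy as np

# A has linearly independent columns; Col(A) is a plane in R^3.
A = np.array([[1.0, 0.0],
              [1.0, 1.0],
              [1.0, 2.0]])
b = np.array([6.0, 0.0, 0.0])

# Normal equations: A^T A x_hat = A^T b.
x_hat = np.linalg.solve(A.T @ A, A.T @ b)
p = A @ x_hat                          # projection of b onto Col(A)
P = A @ np.linalg.inv(A.T @ A) @ A.T   # projection matrix onto Col(A)

# The residual b - p is orthogonal to every column of A, and P b equals p.
print(p, A.T @ (b - p), np.allclose(P @ b, p))
```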

## 47.16 Orthogonality and Least Squares

A system

$$
Ax=b
$$

may have no exact solution when \(b\) is not in the column space of \(A\). In that case, one seeks an approximate solution \(\hat{x}\) such that

$$
A\hat{x}
$$

is as close as possible to \(b\).

The error is

$$
r=b-A\hat{x}.
$$

The least squares principle chooses \(\hat{x}\) so that

$$
\|b-A\hat{x}\|_2
$$

is minimized.

The geometric condition for the minimum is

$$
r \perp \operatorname{Col}(A).
$$

That is,

$$
A^T r = 0.
$$

This again gives

$$
A^T A\hat{x}=A^T b.
$$

Thus least squares is an orthogonality problem. The best approximation is obtained when the residual is perpendicular to the approximation space.
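
In practice the minimization is usually delegated to a library least squares routine rather than forming \(A^TA\) explicitly. A sketch assuming NumPy:

```python
import numpy as np

A = np.array([[1.0, 0.0],
              [1.0, 1.0],
              [1.0, 2.0]])
b = np.array([6.0, 0.0, 0.0])

# Least squares solution: minimizes ||b - A x||_2.
x_hat, *_ = np.linalg.lstsq(A, b, rcond=None)
r = b - A @ x_hat

# Orthogonality condition at the minimum: A^T r = 0 (up to rounding).
print(x_hat, A.T @ r)
```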

## 47.17 Orthogonal Decomposition of Row, Column, and Null Spaces

For a real \(m\times n\) matrix \(A\), the four fundamental subspaces are related by orthogonality.

The null space of \(A\) is the orthogonal complement of the row space of \(A\):

$$
\operatorname{Null}(A) =
\operatorname{Row}(A)^\perp.
$$

Indeed, \(Ax=0\) means every row of \(A\) has dot product zero with \(x\).

Similarly, the null space of \(A^T\), also called the left null space of \(A\), is the orthogonal complement of the column space of \(A\):

$$
\operatorname{Null}(A^T) =
\operatorname{Col}(A)^\perp.
$$

These relationships are a central part of the fundamental theorem of linear algebra.

They explain how solutions, constraints, residuals, and images fit together geometrically.
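
A quick numerical illustration of the first relationship (a sketch assuming NumPy and SciPy): every null space vector has zero dot product with every row of \(A\).

```python
import numpy as np
from scipy.linalg import null_space

A = np.array([[1.0, 2.0, 3.0],
              [0.0, 1.0, 1.0]])

# Columns of N lie in Null(A), so A @ N is (numerically) zero:
# each row of A is orthogonal to each null space vector.
N = null_space(A)
print(np.allclose(A @ N, 0))  # True
```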

## 47.18 Pythagorean Theorem

If \(u\perp v\), then

$$
\|u+v\|^2 =
\|u\|^2+\|v\|^2.
$$

Proof:

$$
\|u+v\|^2 =
\langle u+v,u+v\rangle.
$$

Expanding by linearity and symmetry (in a real inner product space; in the complex case the cross term is \(2\operatorname{Re}\langle u,v\rangle\), which also vanishes when \(u\perp v\)),

$$
\langle u+v,u+v\rangle =
\langle u,u\rangle
+
2\langle u,v\rangle
+
\langle v,v\rangle.
$$

Since \(u\perp v\),

$$
\langle u,v\rangle=0.
$$

Therefore

$$
\|u+v\|^2 =
\|u\|^2+\|v\|^2.
$$

More generally, if \(v_1,\ldots,v_k\) are pairwise orthogonal, then

$$
\left\|
\sum_{j=1}^k v_j
\right\|^2 =
\sum_{j=1}^k \|v_j\|^2.
$$
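
A one-line numerical check of the identity, reusing the orthogonal pair from Section 47.1 (a sketch assuming NumPy):

```python
import numpy as np

u = np.array([1.0, 2.0])
v = np.array([2.0, -1.0])  # orthogonal to u

# ||u + v||^2 should equal ||u||^2 + ||v||^2 when u is orthogonal to v.
lhs = np.linalg.norm(u + v) ** 2
rhs = np.linalg.norm(u) ** 2 + np.linalg.norm(v) ** 2
print(np.isclose(lhs, rhs))  # True
```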

## 47.19 Orthogonality in Function Spaces

Orthogonality is not limited to coordinate vectors.

Let \(V\) be a space of real-valued functions on \([a,b]\) with inner product

$$
\langle f,g\rangle =
\int_a^b f(x)g(x)\,dx.
$$

Then \(f\) and \(g\) are orthogonal if

$$
\int_a^b f(x)g(x)\,dx=0.
$$

For example, on \([-\pi,\pi]\),

$$
\sin x \perp \cos x
$$

because

$$
\int_{-\pi}^{\pi} \sin x\cos x\,dx = 0.
$$

Orthogonal functions are central in Fourier series, approximation theory, differential equations, and signal processing.

In this setting, projection becomes approximation by functions. A function can be projected onto a subspace spanned by simpler functions, such as polynomials or trigonometric functions.
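
The same inner product can be evaluated numerically. A sketch assuming NumPy and SciPy's quadrature routine:

```python
import numpy as np
from scipy.integrate import quad

# <f, g> = integral over [-pi, pi] of f(x) g(x) dx.
inner, _ = quad(lambda x: np.sin(x) * np.cos(x), -np.pi, np.pi)

# sin and cos are orthogonal on [-pi, pi]: the integral is zero up to rounding.
print(abs(inner) < 1e-10)  # True
```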

## 47.20 Summary

Orthogonality generalizes perpendicularity to inner product spaces. Two vectors are orthogonal when their inner product is zero. Orthogonal sets of nonzero vectors are linearly independent. Orthogonal and orthonormal bases give simple coordinate formulas.

The orthogonal complement \(S^\perp\) is the set of all vectors orthogonal to a set \(S\). It is always a subspace, and in finite-dimensional inner product spaces a subspace \(W\) satisfies

$$
V = W \oplus W^\perp.
$$

This decomposition leads directly to projection. Projection gives the closest vector in a subspace, and the residual is orthogonal to that subspace. This principle underlies least squares, approximation, Fourier methods, and much of numerical linear algebra.
