# Chapter 49. Orthonormal Bases

An orthonormal basis is a basis made from unit vectors that are mutually orthogonal. It is one of the most useful structures in linear algebra because it combines two properties at once: every vector can be expressed uniquely in the basis, and the coefficients are found directly by inner products.

In an ordinary basis, finding coordinates may require solving a linear system. In an orthonormal basis, coordinates are obtained by taking inner products. This makes orthonormal bases central to projection, least squares, Fourier analysis, numerical linear algebra, signal processing, and spectral theory. Standard references define an orthonormal set as an orthogonal set of unit vectors; such a set is automatically linearly independent (Section 49.5).

## 49.1 Unit Vectors

Let \(V\) be an inner product space. A vector \(v \in V\) is a unit vector if

$$
\|v\| = 1.
$$

Since the norm is induced by the inner product,

$$
\|v\| = \sqrt{\langle v,v\rangle}.
$$

Thus \(v\) is a unit vector exactly when

$$
\langle v,v\rangle = 1.
$$

For example, in \(\mathbb{R}^2\),

$$
e_1 =
\begin{bmatrix}
1\\
0
\end{bmatrix},
\qquad
e_2 =
\begin{bmatrix}
0\\
1
\end{bmatrix}
$$

are unit vectors.

The vector

$$
v =
\begin{bmatrix}
3\\
4
\end{bmatrix}
$$

is not a unit vector because

$$
\|v\| = \sqrt{3^2+4^2}=5.
$$

To make it a unit vector, divide by its length:

$$
\frac{v}{\|v\|} =
\frac{1}{5}
\begin{bmatrix}
3\\
4
\end{bmatrix} =
\begin{bmatrix}
3/5\\
4/5
\end{bmatrix}.
$$

This process is called normalization.

## 49.2 Normalization

If \(v\ne 0\), its normalization is

$$
\widehat{v} = \frac{v}{\|v\|}.
$$

Then

$$
\|\widehat{v}\| =
\left\|
\frac{v}{\|v\|}
\right\| =
\frac{1}{\|v\|}\|v\| =
1.
$$

Thus \(\widehat{v}\) is a unit vector in the same direction as \(v\).

Normalization changes the length of a vector but not its direction. It is the operation that converts an orthogonal basis into an orthonormal basis. If \(v_1,\ldots,v_k\) are nonzero orthogonal vectors, then

$$
\frac{v_1}{\|v_1\|},
\frac{v_2}{\|v_2\|},
\ldots,
\frac{v_k}{\|v_k\|}
$$

form an orthonormal set. This follows because scaling nonzero orthogonal vectors preserves orthogonality and gives each vector length one.
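
As a quick numerical check, here is a minimal NumPy sketch (the vector values are illustrative) that normalizes two orthogonal vectors and confirms the result is orthonormal:

```python
import numpy as np

# Two orthogonal (but not unit) vectors in R^2 -- illustrative values.
v1 = np.array([3.0, 4.0])
v2 = np.array([-4.0, 3.0])

# Normalize each vector: divide by its Euclidean length.
q1 = v1 / np.linalg.norm(v1)
q2 = v2 / np.linalg.norm(v2)

print(np.linalg.norm(q1), np.linalg.norm(q2))  # both 1.0
print(q1 @ q2)                                 # 0.0: orthogonality is preserved
```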

## 49.3 Orthonormal Sets

A set of vectors

$$
\{q_1,q_2,\ldots,q_k\}
$$

is orthonormal if

$$
\langle q_i,q_j\rangle =
\begin{cases}
1, & i=j,\\
0, & i\ne j.
\end{cases}
$$

Equivalently,

$$
\langle q_i,q_j\rangle = \delta_{ij},
$$

where \(\delta_{ij}\) is the Kronecker delta.

The condition contains two requirements. When \(i=j\),

$$
\langle q_i,q_i\rangle = 1,
$$

so every vector has length one. When \(i\ne j\),

$$
\langle q_i,q_j\rangle = 0,
$$

so distinct vectors are orthogonal.

For example,

$$
q_1 =
\frac{1}{\sqrt{2}}
\begin{bmatrix}
1\\
1
\end{bmatrix},
\qquad
q_2 =
\frac{1}{\sqrt{2}}
\begin{bmatrix}
1\\
-1
\end{bmatrix}
$$

form an orthonormal set in \(\mathbb{R}^2\). Indeed,

$$
\|q_1\|=1,
\qquad
\|q_2\|=1,
$$

and

$$
q_1^Tq_2 =
\frac{1}{2}(1-1) =
0.
$$

## 49.4 Orthonormal Bases

An orthonormal basis is an orthonormal set that is also a basis.

Thus \(q_1,\ldots,q_n\) form an orthonormal basis for \(V\) if:

| Property | Meaning |
|---|---|
| Spanning | Every vector in \(V\) is a linear combination of the \(q_i\) |
| Linear independence | The representation is unique |
| Orthogonality | Distinct basis vectors have inner product zero |
| Normalization | Each basis vector has length one |

In a finite-dimensional inner product space, any orthonormal set with \(\dim V\) vectors is automatically an orthonormal basis. This follows because every orthonormal set is linearly independent, and a linearly independent set with the dimension of the space is a basis.

## 49.5 Orthonormal Sets Are Linearly Independent

Every orthonormal set is linearly independent.

Let

$$
\{q_1,\ldots,q_k\}
$$

be an orthonormal set, and suppose

$$
c_1q_1+\cdots+c_kq_k=0.
$$

Take the inner product with \(q_j\). Then

$$
\left\langle
c_1q_1+\cdots+c_kq_k,
q_j
\right\rangle =
\langle 0,q_j\rangle.
$$

By linearity,

$$
c_1\langle q_1,q_j\rangle
+
\cdots
+
c_k\langle q_k,q_j\rangle =
0.
$$

All terms vanish except the \(j\)-th term. Since \(\langle q_j,q_j\rangle=1\), we get

$$
c_j=0.
$$

This holds for every \(j\). Therefore all coefficients are zero, so the set is linearly independent.

This proof is one reason orthonormal systems are algebraically clean. The inner product isolates one coordinate at a time.

## 49.6 Coordinates in an Orthonormal Basis

Let

$$
B=(q_1,\ldots,q_n)
$$

be an orthonormal basis for \(V\). Every vector \(v\in V\) has a unique representation

$$
v = c_1q_1+\cdots+c_nq_n.
$$

To find \(c_j\), take the inner product with \(q_j\):

$$
\langle v,q_j\rangle =
\left\langle
c_1q_1+\cdots+c_nq_n,
q_j
\right\rangle.
$$

Using orthonormality,

$$
\langle v,q_j\rangle = c_j.
$$

Therefore

$$
c_j = \langle v,q_j\rangle.
$$

Hence

$$
v =
\sum_{j=1}^n \langle v,q_j\rangle q_j.
$$

This is the coordinate formula for an orthonormal basis. It replaces solving a system by taking inner products. Coordinate formulas of this form are a primary advantage of orthogonal and orthonormal bases.

## 49.7 Coordinate Vector

If \(B=(q_1,\ldots,q_n)\) is an orthonormal basis, then the coordinate vector of \(v\) relative to \(B\) is

$$
[v]_B =
\begin{bmatrix}
\langle v,q_1\rangle\\
\langle v,q_2\rangle\\
\vdots\\
\langle v,q_n\rangle
\end{bmatrix}.
$$

For example, let

$$
q_1 =
\frac{1}{\sqrt{2}}
\begin{bmatrix}
1\\
1
\end{bmatrix},
\qquad
q_2 =
\frac{1}{\sqrt{2}}
\begin{bmatrix}
1\\
-1
\end{bmatrix},
$$

and let

$$
v =
\begin{bmatrix}
4\\
2
\end{bmatrix}.
$$

Then

$$
\langle v,q_1\rangle =
\frac{1}{\sqrt{2}}(4+2) =
3\sqrt{2},
$$

and

$$
\langle v,q_2\rangle =
\frac{1}{\sqrt{2}}(4-2) =
\sqrt{2}.
$$

Thus

$$
[v]_B =
\begin{bmatrix}
3\sqrt{2}\\
\sqrt{2}
\end{bmatrix}.
$$

Therefore

$$
v =
3\sqrt{2}q_1+\sqrt{2}q_2.
$$
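
The same computation in NumPy, as a sketch that mirrors the worked example above:

```python
import numpy as np

q1 = np.array([1.0, 1.0]) / np.sqrt(2)
q2 = np.array([1.0, -1.0]) / np.sqrt(2)
v  = np.array([4.0, 2.0])

# Coordinates are inner products: no linear system to solve.
c1 = v @ q1   # 3*sqrt(2) ~ 4.2426
c2 = v @ q2   # sqrt(2)   ~ 1.4142

# Reconstruct v from its orthonormal coordinates.
print(c1 * q1 + c2 * q2)  # [4. 2.]
```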

## 49.8 Matrix Form

Let \(Q\) be the matrix whose columns are the orthonormal vectors

$$
q_1,\ldots,q_n.
$$

Then

$$
Q =
\begin{bmatrix}
| & | & & |\\
q_1 & q_2 & \cdots & q_n\\
| & | & & |
\end{bmatrix}.
$$

The condition that the columns are orthonormal is

$$
Q^TQ=I
$$

in the real case.

In the complex case, the condition is

$$
Q^*Q=I,
$$

where \(Q^*\) is the conjugate transpose.

If \(Q\) is square, then

$$
Q^{-1}=Q^T
$$

in the real case, and

$$
Q^{-1}=Q^*
$$

in the complex case.

A real square matrix with orthonormal columns is called an orthogonal matrix. A complex square matrix with orthonormal columns is called a unitary matrix.
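
A short sketch, reusing the pair \(q_1,q_2\) from Section 49.3, that checks the matrix conditions in the real square case:

```python
import numpy as np

q1 = np.array([1.0, 1.0]) / np.sqrt(2)
q2 = np.array([1.0, -1.0]) / np.sqrt(2)
Q = np.column_stack([q1, q2])              # columns are orthonormal

print(np.allclose(Q.T @ Q, np.eye(2)))     # True: Q^T Q = I
print(np.allclose(np.linalg.inv(Q), Q.T))  # True: square case, Q^{-1} = Q^T
```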

## 49.9 Orthogonal Matrices

A real square matrix \(Q\) is orthogonal if

$$
Q^TQ=I.
$$

Since \(Q\) is square, this also implies

$$
QQ^T=I.
$$

Thus

$$
Q^{-1}=Q^T.
$$

Orthogonal matrices preserve inner products:

$$
\langle Qx,Qy\rangle =
(Qx)^T(Qy) =
x^TQ^TQy =
x^Ty =
\langle x,y\rangle.
$$

They also preserve norms:

$$
\|Qx\|_2=\|x\|_2.
$$

Therefore orthogonal matrices represent rigid linear transformations: rotations, reflections, and combinations of them.

They do not stretch or shrink Euclidean length.
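
For instance, a 2-by-2 rotation matrix is orthogonal. This sketch checks that it preserves inner products and lengths (the angle and test vectors are arbitrary choices):

```python
import numpy as np

theta = 0.7  # arbitrary rotation angle
Q = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])

x = np.array([1.0, 2.0])
y = np.array([-3.0, 0.5])

print(np.isclose((Q @ x) @ (Q @ y), x @ y))                  # inner product preserved
print(np.isclose(np.linalg.norm(Q @ x), np.linalg.norm(x)))  # norm preserved
```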

## 49.10 Unitary Matrices

In complex vector spaces, the analogue of an orthogonal matrix is a unitary matrix.

A square complex matrix \(U\) is unitary if

$$
U^*U=I.
$$

Then

$$
U^{-1}=U^*.
$$

Unitary matrices preserve complex inner products:

$$
\langle Ux,Uy\rangle = \langle x,y\rangle.
$$

They also preserve Euclidean norm:

$$
\|Ux\|_2=\|x\|_2.
$$

Unitary matrices appear throughout spectral theory, quantum mechanics, Fourier analysis, and numerical linear algebra. The discrete Fourier transform matrix, after proper normalization, is unitary.
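
As a check of the last claim, here is a sketch that builds the \(n\times n\) DFT matrix with entries \(\omega^{jk}/\sqrt{n}\), where \(\omega=e^{-2\pi i/n}\), and verifies \(U^*U=I\):

```python
import numpy as np

n = 8
J, K = np.meshgrid(np.arange(n), np.arange(n), indexing="ij")
U = np.exp(-2j * np.pi * J * K / n) / np.sqrt(n)  # normalized DFT matrix

print(np.allclose(U.conj().T @ U, np.eye(n)))     # True: the scaled DFT is unitary
```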

## 49.11 Projection onto an Orthonormal Basis

Let \(W\) be a subspace with orthonormal basis

$$
q_1,\ldots,q_k.
$$

The orthogonal projection of \(v\) onto \(W\) is

$$
\operatorname{proj}_W(v) =
\sum_{j=1}^k \langle v,q_j\rangle q_j.
$$

This is the coordinate formula of Section 49.6, restricted to \(W\): the same inner products appear, but the sum runs only over a basis of the subspace.

The residual

$$
r = v-\operatorname{proj}_W(v)
$$

is orthogonal to every basis vector \(q_i\):

$$
\langle r,q_i\rangle =
\langle v,q_i\rangle -
\sum_{j=1}^k
\langle v,q_j\rangle
\langle q_j,q_i\rangle =
\langle v,q_i\rangle-\langle v,q_i\rangle =
0.
$$

Therefore

$$
r\in W^\perp.
$$

Projection with an orthonormal basis avoids the matrix inverse appearing in the general projection formula.
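
A minimal sketch of the projection formula, with an illustrative plane in \(\mathbb{R}^3\) spanned by two orthonormal vectors:

```python
import numpy as np

# Orthonormal basis of a plane W in R^3 -- an illustrative choice.
q1 = np.array([1.0, 1.0, 0.0]) / np.sqrt(2)
q2 = np.array([0.0, 0.0, 1.0])
v  = np.array([2.0, 4.0, 5.0])

# Projection: sum of <v, q_j> q_j over the basis of W.
p = (v @ q1) * q1 + (v @ q2) * q2
r = v - p

print(p)               # [3. 3. 5.], the projection onto W
print(r @ q1, r @ q2)  # both 0.0: the residual lies in W-perp
```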

## 49.12 Projection Matrix

Let \(Q\) be an \(m\times k\) matrix with orthonormal columns. Thus

$$
Q^TQ=I_k.
$$

The projection of \(b\in\mathbb{R}^m\) onto \(\operatorname{Col}(Q)\) is

$$
p = QQ^Tb.
$$

Hence the projection matrix is

$$
P = QQ^T.
$$

This is simpler than the general formula

$$
P=A(A^TA)^{-1}A^T.
$$

The simplification occurs because \(Q^TQ=I\). Orthonormal columns remove the need to invert the Gram matrix.

The matrix \(P=QQ^T\) satisfies

$$
P^2=P
$$

and

$$
P^T=P.
$$

Thus it is an orthogonal projection matrix.
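
The same plane as in Section 49.11, written with the projection matrix \(P=QQ^T\); a sketch checking idempotence and symmetry:

```python
import numpy as np

q1 = np.array([1.0, 1.0, 0.0]) / np.sqrt(2)
q2 = np.array([0.0, 0.0, 1.0])
Q = np.column_stack([q1, q2])         # 3x2, orthonormal columns

P = Q @ Q.T
print(np.allclose(P @ P, P))          # True: P^2 = P
print(np.allclose(P.T, P))            # True: P^T = P
print(P @ np.array([2.0, 4.0, 5.0])) # [3. 3. 5.], same projection as before
```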

## 49.13 Parseval Identity

Let \(q_1,\ldots,q_n\) be an orthonormal basis for \(V\). If

$$
v =
\sum_{j=1}^n c_jq_j,
$$

then

$$
\|v\|^2 =
|c_1|^2+\cdots+|c_n|^2.
$$

Since

$$
c_j=\langle v,q_j\rangle,
$$

we have

$$
\|v\|^2 =
\sum_{j=1}^n |\langle v,q_j\rangle|^2.
$$

This is Parseval's identity in finite-dimensional form.

It says that the squared length of a vector equals the sum of the squared magnitudes of its orthonormal coordinates.

In \(\mathbb{R}^n\), this generalizes the usual formula

$$
\|x\|_2^2=x_1^2+\cdots+x_n^2.
$$
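
A quick numerical check of Parseval's identity, using the basis and vector from the worked example in Section 49.7:

```python
import numpy as np

q1 = np.array([1.0, 1.0]) / np.sqrt(2)
q2 = np.array([1.0, -1.0]) / np.sqrt(2)
v  = np.array([4.0, 2.0])

coords = np.array([v @ q1, v @ q2])
print(np.isclose(np.sum(coords**2), v @ v))  # True: 18 + 2 = 16 + 4 = 20
```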

## 49.14 Bessel Inequality

If \(q_1,\ldots,q_k\) is an orthonormal set, but not necessarily a basis for all of \(V\), then

$$
\sum_{j=1}^k |\langle v,q_j\rangle|^2
\le
\|v\|^2.
$$

This is Bessel's inequality.

It says that the energy captured by projection onto the span of \(q_1,\ldots,q_k\) cannot exceed the total energy of \(v\).

The inequality becomes an equality exactly when \(v\) lies in the span of the orthonormal set. In that case, the set captures all of \(v\).
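
A sketch of Bessel's inequality: projecting a vector in \(\mathbb{R}^3\) onto a single unit vector (an orthonormal set of size one, chosen for illustration) captures at most its total energy:

```python
import numpy as np

q1 = np.array([1.0, 1.0, 0.0]) / np.sqrt(2)  # orthonormal set of size one
v  = np.array([2.0, 4.0, 5.0])

captured = (v @ q1) ** 2   # 18.0
total    = v @ v           # 45.0
print(captured <= total)   # True; strict, since v is not in span{q1}
```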

## 49.15 Distance to a Subspace

Let \(W\) have orthonormal basis \(q_1,\ldots,q_k\). The closest vector in \(W\) to \(v\) is

$$
p =
\sum_{j=1}^k \langle v,q_j\rangle q_j.
$$

The distance from \(v\) to \(W\) is

$$
\operatorname{dist}(v,W)=\|v-p\|.
$$

Using orthogonal decomposition,

$$
v=p+r,
\qquad
p\in W,
\qquad
r\in W^\perp.
$$

Then

$$
\|v\|^2=\|p\|^2+\|r\|^2.
$$

Since

$$
\|p\|^2=
\sum_{j=1}^k |\langle v,q_j\rangle|^2,
$$

we obtain

$$
\operatorname{dist}(v,W)^2 =
\|v\|^2 -
\sum_{j=1}^k |\langle v,q_j\rangle|^2.
$$

This formula is useful in least squares and approximation.
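
A sketch that checks the distance formula against a direct computation, reusing the plane from Section 49.11:

```python
import numpy as np

q1 = np.array([1.0, 1.0, 0.0]) / np.sqrt(2)
q2 = np.array([0.0, 0.0, 1.0])
v  = np.array([2.0, 4.0, 5.0])

p = (v @ q1) * q1 + (v @ q2) * q2
direct  = np.linalg.norm(v - p)
formula = np.sqrt(v @ v - (v @ q1)**2 - (v @ q2)**2)
print(np.isclose(direct, formula))  # True: both give sqrt(2)
```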

## 49.16 Change of Orthonormal Basis

Let \(B=(q_1,\ldots,q_n)\) and \(C=(r_1,\ldots,r_n)\) be two orthonormal bases of a real inner product space. The change-of-basis matrix from \(B\)-coordinates to \(C\)-coordinates is orthogonal.

Indeed, coordinates with respect to an orthonormal basis preserve inner products and lengths, so the map from one such coordinate system to another must preserve them as well.

In matrix form, if \(Q\) and \(R\) are the matrices with columns \(q_i\) and \(r_i\), then

$$
Q^TQ=I,
\qquad
R^TR=I.
$$

The change-of-basis matrix is

$$
R^TQ.
$$

It is orthogonal because

$$
(R^TQ)^T(R^TQ) =
Q^TRR^TQ =
Q^TQ =
I.
$$

Thus moving between orthonormal coordinate systems is numerically stable and geometrically rigid.
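
A sketch, with two illustrative orthonormal bases of \(\mathbb{R}^2\) built from rotations, confirming that the change-of-basis matrix \(R^TQ\) is orthogonal:

```python
import numpy as np

def rotation(theta):
    """Columns form an orthonormal basis of R^2."""
    return np.array([[np.cos(theta), -np.sin(theta)],
                     [np.sin(theta),  np.cos(theta)]])

Q, R = rotation(np.pi / 4), rotation(np.pi / 6)  # illustrative bases B and C
M = R.T @ Q                                      # change of basis from B to C
print(np.allclose(M.T @ M, np.eye(2)))           # True: M is orthogonal
```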

## 49.17 Orthonormal Bases and Least Squares

In least squares, one often wants to approximate \(b\) by a vector in the column space of a matrix.

If the columns of \(A\) are not orthonormal, the projection formula is

$$
p=A(A^TA)^{-1}A^Tb.
$$

If the columns are orthonormal, write the matrix as \(Q\). Then

$$
p=QQ^Tb.
$$

The least squares coefficients are

$$
\hat{x}=Q^Tb.
$$

This is simpler and more stable than solving the normal equations.

This observation motivates the QR factorization. Instead of working directly with \(A\), we factor it as

$$
A=QR,
$$

where \(Q\) has orthonormal columns and \(R\) is upper triangular. The orthonormal factor \(Q\) carries the geometry of the column space.
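
A minimal least-squares sketch using NumPy's QR factorization (the data matrix and right-hand side are illustrative random values):

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((6, 3))       # illustrative tall matrix
b = rng.standard_normal(6)

Q, R = np.linalg.qr(A)                # reduced QR: Q has orthonormal columns
x_hat = np.linalg.solve(R, Q.T @ b)   # solve the triangular system R x = Q^T b

# Agrees with the least-squares solution, without forming A^T A.
print(np.allclose(x_hat, np.linalg.lstsq(A, b, rcond=None)[0]))  # True
```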

## 49.18 Orthonormal Bases in Function Spaces

Orthonormal bases also occur in spaces of functions.

For functions on an interval \([a,b]\), an inner product may be defined by

$$
\langle f,g\rangle =
\int_a^b f(x)g(x)\,dx.
$$

A sequence of functions \(\phi_1,\phi_2,\ldots\) is orthonormal if

$$
\langle \phi_i,\phi_j\rangle=\delta_{ij}.
$$

For example, trigonometric functions form orthogonal and, after scaling, orthonormal systems on intervals such as \([-\pi,\pi]\). In Fourier analysis, a function is represented by coefficients obtained from inner products:

$$
c_j=\langle f,\phi_j\rangle.
$$

This is the infinite-dimensional analogue of coordinates in an orthonormal basis.
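
As a sketch of the function-space version, here is a numerical check (a simple quadrature on a fine grid, accurate only approximately) that the scaled sines \(\phi_j(x)=\sin(jx)/\sqrt{\pi}\) are orthonormal on \([-\pi,\pi]\) and that \(c_j=\langle f,\phi_j\rangle\) recovers a known coefficient of a test function:

```python
import numpy as np

x = np.linspace(-np.pi, np.pi, 200001)
dx = x[1] - x[0]

phi = lambda j: np.sin(j * x) / np.sqrt(np.pi)   # orthonormal on [-pi, pi]
inner = lambda f, g: np.sum(f * g) * dx          # quadrature inner product

print(round(inner(phi(1), phi(1)), 6))   # ~1.0: unit norm
print(round(inner(phi(1), phi(2)), 6))   # ~0.0: orthogonality

f = 2.0 * phi(3) + 0.5 * phi(5)          # test function with known coefficients
print(round(inner(f, phi(3)), 6))        # ~2.0: coefficient recovered
```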

## 49.19 Numerical Importance

Orthonormal bases are important in numerical computation because they control error.

When a matrix has orthonormal columns,

$$
Q^TQ=I,
$$

so multiplying by \(Q\) does not amplify Euclidean length. This gives stable algorithms for projections, least squares, eigenvalue computations, and matrix factorizations.

In contrast, a poorly conditioned basis may distort coordinates. Small errors in a vector can become large errors in its coordinate representation.

For this reason, numerical linear algebra often replaces arbitrary bases by orthonormal bases. The Gram-Schmidt process, Householder reflections, and Givens rotations are standard methods for doing this.
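
As an illustration of the first of these, here is a sketch of the modified Gram-Schmidt process, one common variant (production codes typically prefer Householder QR for its stronger stability):

```python
import numpy as np

def modified_gram_schmidt(A):
    """Return Q with orthonormal columns spanning Col(A).

    Assumes the columns of A are linearly independent.
    """
    A = np.array(A, dtype=float)
    m, n = A.shape
    Q = np.zeros((m, n))
    for j in range(n):
        v = A[:, j].copy()
        for i in range(j):
            v -= (Q[:, i] @ v) * Q[:, i]   # remove the q_i component from v
        Q[:, j] = v / np.linalg.norm(v)    # normalize what remains
    return Q

A = np.array([[1.0, 1.0], [1.0, 0.0], [0.0, 1.0]])
Q = modified_gram_schmidt(A)
print(np.allclose(Q.T @ Q, np.eye(2)))     # True: orthonormal columns
```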

## 49.20 Summary

An orthonormal basis is a basis \(q_1,\ldots,q_n\) satisfying

$$
\langle q_i,q_j\rangle=\delta_{ij}.
$$

It gives the expansion

$$
v=\sum_{j=1}^n \langle v,q_j\rangle q_j.
$$

Thus coordinates are obtained by inner products.

If \(Q\) is the matrix whose columns are an orthonormal basis, then

$$
Q^TQ=I.
$$

When \(Q\) is square,

$$
Q^{-1}=Q^T.
$$

Orthonormal bases simplify projection, least squares, coordinate changes, and norm computations. They preserve geometry and improve numerical stability.
