
Chapter 49. Orthonormal Bases

An orthonormal basis is a basis made from unit vectors that are mutually orthogonal. It is one of the most useful structures in linear algebra because it combines two properties at once: every vector can be expressed uniquely in the basis, and the coefficients are found directly by inner products.

In an ordinary basis, coordinates may require solving a linear system. In an orthonormal basis, coordinates are obtained by taking dot products. This makes orthonormal bases central in projection, least squares, Fourier analysis, numerical linear algebra, signal processing, and spectral theory. Standard references define an orthonormal set as an orthogonal set of unit vectors, and such a set is linearly independent.

49.1 Unit Vectors

Let $V$ be an inner product space. A vector $v \in V$ is a unit vector if

\[ \|v\| = 1. \]

Since the norm is induced by the inner product,

\[ \|v\| = \sqrt{\langle v,v\rangle}. \]

Thus $v$ is a unit vector exactly when

\[ \langle v,v\rangle = 1. \]

For example, in $\mathbb{R}^2$,

\[ e_1 = \begin{bmatrix} 1\\ 0 \end{bmatrix}, \qquad e_2 = \begin{bmatrix} 0\\ 1 \end{bmatrix} \]

are unit vectors.

The vector

\[ v = \begin{bmatrix} 3\\ 4 \end{bmatrix} \]

is not a unit vector because

\[ \|v\| = \sqrt{3^2+4^2} = 5. \]

To make it a unit vector, divide by its length:

\[ \frac{v}{\|v\|} = \frac{1}{5} \begin{bmatrix} 3\\ 4 \end{bmatrix} = \begin{bmatrix} 3/5\\ 4/5 \end{bmatrix}. \]

This process is called normalization.
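As a quick numerical check, here is a minimal NumPy sketch of normalizing the vector from the example above:

```python
import numpy as np

v = np.array([3.0, 4.0])
v_hat = v / np.linalg.norm(v)   # divide by the length ||v|| = 5

print(v_hat)                    # the unit vector (3/5, 4/5)
print(np.linalg.norm(v_hat))    # length 1, up to rounding
```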

49.2 Normalization

If $v \ne 0$, its normalization is

\[ \widehat{v} = \frac{v}{\|v\|}. \]

Then

\[ \|\widehat{v}\| = \left\| \frac{v}{\|v\|} \right\| = \frac{1}{\|v\|}\|v\| = 1. \]

Thus $\widehat{v}$ is a unit vector in the same direction as $v$.

Normalization changes the length of a vector but not its direction. It is the operation that converts an orthogonal basis into an orthonormal basis. If $v_1,\ldots,v_k$ are nonzero orthogonal vectors, then

\[ \frac{v_1}{\|v_1\|}, \frac{v_2}{\|v_2\|}, \ldots, \frac{v_k}{\|v_k\|} \]

form an orthonormal set. This follows because scaling nonzero orthogonal vectors preserves orthogonality and gives each vector length one.

49.3 Orthonormal Sets

A set of vectors

\[ \{q_1,q_2,\ldots,q_k\} \]

is orthonormal if

\[ \langle q_i,q_j\rangle = \begin{cases} 1, & i=j,\\ 0, & i\ne j. \end{cases} \]

Equivalently,

\[ \langle q_i,q_j\rangle = \delta_{ij}, \]

where $\delta_{ij}$ is the Kronecker delta.

The condition contains two requirements. When $i=j$,

\[ \langle q_i,q_i\rangle = 1, \]

so every vector has length one. When $i\ne j$,

\[ \langle q_i,q_j\rangle = 0, \]

so distinct vectors are orthogonal.

For example,

\[ q_1 = \frac{1}{\sqrt{2}} \begin{bmatrix} 1\\ 1 \end{bmatrix}, \qquad q_2 = \frac{1}{\sqrt{2}} \begin{bmatrix} 1\\ -1 \end{bmatrix} \]

form an orthonormal set in $\mathbb{R}^2$. Indeed,

\[ \|q_1\|=1, \qquad \|q_2\|=1, \]

and

\[ q_1^Tq_2 = \frac{1}{2}(1-1) = 0. \]
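This verification can be reproduced numerically; a minimal NumPy sketch of the same pair:

```python
import numpy as np

q1 = np.array([1.0, 1.0]) / np.sqrt(2)
q2 = np.array([1.0, -1.0]) / np.sqrt(2)

# unit length: <q_i, q_i> = 1, up to rounding
print(q1 @ q1, q2 @ q2)
# orthogonality: <q1, q2> = (1 - 1)/2 = 0
print(q1 @ q2)
```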

49.4 Orthonormal Bases

An orthonormal basis is an orthonormal set that is also a basis.

Thus $q_1,\ldots,q_n$ form an orthonormal basis for $V$ if they satisfy four properties:

Spanning: every vector in $V$ is a linear combination of the $q_i$.
Linear independence: the representation is unique.
Orthogonality: distinct basis vectors have inner product zero.
Normalization: each basis vector has length one.

In a finite-dimensional inner product space, any orthonormal set with $\dim V$ vectors is automatically an orthonormal basis. This follows because every orthonormal set is linearly independent, and a linearly independent set with the dimension of the space is a basis.

49.5 Orthonormal Sets Are Linearly Independent

Every orthonormal set is linearly independent.

Let

\[ \{q_1,\ldots,q_k\} \]

be an orthonormal set, and suppose

\[ c_1q_1+\cdots+c_kq_k=0. \]

Take the inner product with $q_j$. Then

\[ \left\langle c_1q_1+\cdots+c_kq_k, q_j \right\rangle = \langle 0,q_j\rangle. \]

By linearity,

\[ c_1\langle q_1,q_j\rangle + \cdots + c_k\langle q_k,q_j\rangle = 0. \]

All terms vanish except the $j$-th term. Since $\langle q_j,q_j\rangle=1$, we get

\[ c_j=0. \]

This holds for every $j$. Therefore all coefficients are zero, so the set is linearly independent.

This proof is one reason orthonormal systems are algebraically clean. The inner product isolates one coordinate at a time.

49.6 Coordinates in an Orthonormal Basis

Let

\[ B=(q_1,\ldots,q_n) \]

be an orthonormal basis for $V$. Every vector $v\in V$ has a unique representation

\[ v = c_1q_1+\cdots+c_nq_n. \]

To find $c_j$, take the inner product with $q_j$:

\[ \langle v,q_j\rangle = \left\langle c_1q_1+\cdots+c_nq_n, q_j \right\rangle. \]

Using orthonormality,

\[ \langle v,q_j\rangle = c_j. \]

Therefore

\[ c_j = \langle v,q_j\rangle. \]

Hence

\[ v = \sum_{j=1}^n \langle v,q_j\rangle q_j. \]

This is the coordinate formula for an orthonormal basis. It replaces solving a system by taking inner products. Coordinate formulas of this form are a primary advantage of orthogonal and orthonormal bases.

49.7 Coordinate Vector

If $B=(q_1,\ldots,q_n)$ is an orthonormal basis, then the coordinate vector of $v$ relative to $B$ is

\[ [v]_B = \begin{bmatrix} \langle v,q_1\rangle\\ \langle v,q_2\rangle\\ \vdots\\ \langle v,q_n\rangle \end{bmatrix}. \]

For example, let

\[ q_1 = \frac{1}{\sqrt{2}} \begin{bmatrix} 1\\ 1 \end{bmatrix}, \qquad q_2 = \frac{1}{\sqrt{2}} \begin{bmatrix} 1\\ -1 \end{bmatrix}, \]

and let

\[ v = \begin{bmatrix} 4\\ 2 \end{bmatrix}. \]

Then

\[ \langle v,q_1\rangle = \frac{1}{\sqrt{2}}(4+2) = 3\sqrt{2}, \]

and

\[ \langle v,q_2\rangle = \frac{1}{\sqrt{2}}(4-2) = \sqrt{2}. \]

Thus

\[ [v]_B = \begin{bmatrix} 3\sqrt{2}\\ \sqrt{2} \end{bmatrix}. \]

Therefore

\[ v = 3\sqrt{2}\,q_1+\sqrt{2}\,q_2. \]
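The same computation can be reproduced numerically; a NumPy sketch of this example:

```python
import numpy as np

q1 = np.array([1.0, 1.0]) / np.sqrt(2)
q2 = np.array([1.0, -1.0]) / np.sqrt(2)
v = np.array([4.0, 2.0])

# coordinates are inner products with the basis vectors
c1, c2 = v @ q1, v @ q2
print(c1, c2)                 # 3*sqrt(2) and sqrt(2), up to rounding

# the expansion recovers v
print(c1 * q1 + c2 * q2)
```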

49.8 Matrix Form

Let $Q$ be the matrix whose columns are the orthonormal vectors

\[ q_1,\ldots,q_n. \]

Then

\[ Q = \begin{bmatrix} | & | & & |\\ q_1 & q_2 & \cdots & q_n\\ | & | & & | \end{bmatrix}. \]

The condition that the columns are orthonormal is

\[ Q^TQ=I \]

in the real case.

In the complex case, the condition is

\[ Q^*Q=I, \]

where $Q^*$ is the conjugate transpose.

If $Q$ is square, then

\[ Q^{-1}=Q^T \]

in the real case, and

\[ Q^{-1}=Q^* \]

in the complex case.

A real square matrix with orthonormal columns is called an orthogonal matrix. A complex square matrix with orthonormal columns is called a unitary matrix.
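These identities can be checked directly; a small NumPy sketch using the 2×2 basis from the running example:

```python
import numpy as np

# columns are the orthonormal vectors q1, q2 from the running example
Q = np.column_stack([
    np.array([1.0, 1.0]) / np.sqrt(2),
    np.array([1.0, -1.0]) / np.sqrt(2),
])

print(np.allclose(Q.T @ Q, np.eye(2)))     # orthonormal columns
print(np.allclose(np.linalg.inv(Q), Q.T))  # Q is square, so Q^{-1} = Q^T
```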

49.9 Orthogonal Matrices

A real square matrix $Q$ is orthogonal if

\[ Q^TQ=I. \]

Since $Q$ is square, this also implies

\[ QQ^T=I. \]

Thus

\[ Q^{-1}=Q^T. \]

Orthogonal matrices preserve inner products:

\[ \langle Qx,Qy\rangle = (Qx)^T(Qy) = x^TQ^TQy = x^Ty = \langle x,y\rangle. \]

They also preserve norms:

\[ \|Qx\|_2=\|x\|_2. \]

Therefore orthogonal matrices represent rigid linear transformations: rotations, reflections, and combinations of them.

They do not stretch or shrink Euclidean length.
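For instance, a plane rotation matrix is orthogonal, and multiplying by it leaves Euclidean lengths unchanged; a NumPy sketch with an arbitrary angle:

```python
import numpy as np

theta = 0.7  # an arbitrary rotation angle
Q = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])

x = np.array([3.0, 4.0])
print(np.allclose(Q.T @ Q, np.eye(2)))                       # orthogonal
print(np.isclose(np.linalg.norm(Q @ x), np.linalg.norm(x)))  # ||Qx|| = ||x||
```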

49.10 Unitary Matrices

In complex vector spaces, the analogue of an orthogonal matrix is a unitary matrix.

A square complex matrix $U$ is unitary if

\[ U^*U=I. \]

Then

\[ U^{-1}=U^*. \]

Unitary matrices preserve complex inner products:

\[ \langle Ux,Uy\rangle = \langle x,y\rangle. \]

They also preserve Euclidean norm:

\[ \|Ux\|_2=\|x\|_2. \]

Unitary matrices appear throughout spectral theory, quantum mechanics, Fourier analysis, and numerical linear algebra. The discrete Fourier transform matrix, after proper normalization, is unitary.

49.11 Projection onto an Orthonormal Basis

Let $W$ be a subspace with orthonormal basis

\[ q_1,\ldots,q_k. \]

The orthogonal projection of $v$ onto $W$ is

\[ \operatorname{proj}_W(v) = \sum_{j=1}^k \langle v,q_j\rangle q_j. \]

This formula follows from the coordinate formula, applied only to the subspace $W$.

The residual

\[ r = v-\operatorname{proj}_W(v) \]

is orthogonal to every basis vector $q_i$:

\[ \langle r,q_i\rangle = \langle v,q_i\rangle - \sum_{j=1}^k \langle v,q_j\rangle \langle q_j,q_i\rangle = \langle v,q_i\rangle-\langle v,q_i\rangle = 0. \]

Therefore

\[ r\in W^\perp. \]

Projection with an orthonormal basis avoids the matrix inverse appearing in the general projection formula.
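A small NumPy sketch, using an assumed orthonormal basis of a plane $W$ in $\mathbb{R}^3$:

```python
import numpy as np

# an assumed orthonormal basis for a plane W in R^3
q1 = np.array([1.0, 0.0, 0.0])
q2 = np.array([0.0, 1.0, 1.0]) / np.sqrt(2)
v = np.array([1.0, 2.0, 3.0])

# proj_W(v) = <v,q1> q1 + <v,q2> q2
p = (v @ q1) * q1 + (v @ q2) * q2
r = v - p
print(p)                 # approximately [1, 2.5, 2.5]
print(r @ q1, r @ q2)    # both 0: the residual lies in W-perp
```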

49.12 Projection Matrix

Let $Q$ be an $m\times k$ matrix with orthonormal columns. Thus

\[ Q^TQ=I_k. \]

The projection of $b\in\mathbb{R}^m$ onto $\operatorname{Col}(Q)$ is

\[ p = QQ^Tb. \]

Hence the projection matrix is

\[ P = QQ^T. \]

This is simpler than the general formula

\[ P=A(A^TA)^{-1}A^T. \]

The simplification occurs because $Q^TQ=I$. Orthonormal columns remove the need to invert the Gram matrix.

The matrix $P=QQ^T$ satisfies

\[ P^2=P \]

and

\[ P^T=P. \]

Thus it is an orthogonal projection matrix.
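Both properties can be verified numerically; a sketch with an assumed pair of orthonormal columns in $\mathbb{R}^3$:

```python
import numpy as np

# Q with orthonormal columns (assumed example in R^3)
Q = np.column_stack([
    np.array([1.0, 0.0, 0.0]),
    np.array([0.0, 1.0, 1.0]) / np.sqrt(2),
])
P = Q @ Q.T

print(np.allclose(P @ P, P))   # idempotent: P^2 = P
print(np.allclose(P.T, P))     # symmetric: P^T = P
```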

49.13 Parseval Identity

Let $q_1,\ldots,q_n$ be an orthonormal basis for $V$. If

\[ v = \sum_{j=1}^n c_jq_j, \]

then

\[ \|v\|^2 = |c_1|^2+\cdots+|c_n|^2. \]

Since

\[ c_j=\langle v,q_j\rangle, \]

we have

\[ \|v\|^2 = \sum_{j=1}^n |\langle v,q_j\rangle|^2. \]

This is Parseval’s identity in finite-dimensional form.

It says that the squared length of a vector equals the sum of the squared magnitudes of its orthonormal coordinates.

In $\mathbb{R}^n$, this generalizes the usual formula

\[ \|x\|_2^2=x_1^2+\cdots+x_n^2. \]
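A NumPy sketch checking Parseval's identity on the running 2×2 example:

```python
import numpy as np

# orthonormal basis of R^2 (running example) stacked as columns of Q
Q = np.column_stack([
    np.array([1.0, 1.0]) / np.sqrt(2),
    np.array([1.0, -1.0]) / np.sqrt(2),
])
v = np.array([4.0, 2.0])

c = Q.T @ v                 # orthonormal coordinates <v, q_j>
print(np.sum(c**2), v @ v)  # both equal ||v||^2 = 20, up to rounding
```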

49.14 Bessel Inequality

If $q_1,\ldots,q_k$ is an orthonormal set, but not necessarily a basis for all of $V$, then

\[ \sum_{j=1}^k |\langle v,q_j\rangle|^2 \le \|v\|^2. \]

This is Bessel’s inequality.

It says that the energy captured by projection onto the span of $q_1,\ldots,q_k$ cannot exceed the total energy of $v$.

The inequality becomes an equality exactly when vv lies in the span of the orthonormal set. In that case, the set captures all of vv.

49.15 Distance to a Subspace

Let $W$ have orthonormal basis $q_1,\ldots,q_k$. The closest vector in $W$ to $v$ is

\[ p = \sum_{j=1}^k \langle v,q_j\rangle q_j. \]

The distance from $v$ to $W$ is

\[ \operatorname{dist}(v,W)=\|v-p\|. \]

Using orthogonal decomposition,

\[ v=p+r, \qquad p\in W, \qquad r\in W^\perp. \]

Then

\[ \|v\|^2=\|p\|^2+\|r\|^2. \]

Since

\[ \|p\|^2= \sum_{j=1}^k |\langle v,q_j\rangle|^2, \]

we obtain

\[ \operatorname{dist}(v,W)^2 = \|v\|^2 - \sum_{j=1}^k |\langle v,q_j\rangle|^2. \]

This formula is useful in least squares and approximation.
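A NumPy sketch comparing this distance formula against the direct computation $\|v-p\|$, with an assumed orthonormal basis of a plane in $\mathbb{R}^3$:

```python
import numpy as np

# orthonormal basis of a plane W in R^3 (assumed example)
q1 = np.array([1.0, 0.0, 0.0])
q2 = np.array([0.0, 1.0, 1.0]) / np.sqrt(2)
v = np.array([1.0, 2.0, 3.0])

coeffs = np.array([v @ q1, v @ q2])
dist_sq = v @ v - np.sum(coeffs**2)      # dist(v, W)^2 via the formula

p = coeffs[0] * q1 + coeffs[1] * q2      # direct projection for comparison
print(np.sqrt(dist_sq), np.linalg.norm(v - p))   # the two agree
```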

49.16 Change of Orthonormal Basis

Let $B=(q_1,\ldots,q_n)$ and $C=(r_1,\ldots,r_n)$ be two orthonormal bases of a real inner product space. The change-of-basis matrix from $B$-coordinates to $C$-coordinates is orthogonal.

Indeed, both coordinate systems preserve inner products and lengths. Therefore the transformation between them also preserves inner products and lengths.

In matrix form, if $Q$ and $R$ are the matrices with columns $q_i$ and $r_i$, then

\[ Q^TQ=I, \qquad R^TR=I. \]

The change-of-basis matrix is

\[ R^TQ. \]

It is orthogonal because

\[ (R^TQ)^T(R^TQ) = Q^TRR^TQ = Q^TQ = I. \]

Thus moving between orthonormal coordinate systems is numerically stable and geometrically rigid.

49.17 Orthonormal Bases and Least Squares

In least squares, one often wants to approximate $b$ by a vector in the column space of a matrix.

If the columns of $A$ are not orthonormal, the projection formula is

\[ p=A(A^TA)^{-1}A^Tb. \]

If the columns are orthonormal, write the matrix as $Q$. Then

\[ p=QQ^Tb. \]

The least squares coefficients are

\[ \hat{x}=Q^Tb. \]

This is simpler and more stable than solving the normal equations.

This observation motivates the QR factorization. Instead of working directly with $A$, we factor it as

\[ A=QR, \]

where $Q$ has orthonormal columns and $R$ is upper triangular. The orthonormal factor $Q$ carries the geometry of the column space.
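As a sketch of how this is used in practice (random data for illustration; `numpy.linalg.qr` returns the reduced factorization by default):

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((6, 3))   # tall matrix, generically full column rank
b = rng.standard_normal(6)

Q, R = np.linalg.qr(A)               # A = QR, Q has orthonormal columns
x_hat = np.linalg.solve(R, Q.T @ b)  # back-substitute R x = Q^T b

# agrees with the library least-squares solver
print(np.allclose(x_hat, np.linalg.lstsq(A, b, rcond=None)[0]))
```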

49.18 Orthonormal Bases in Function Spaces

Orthonormal bases also occur in spaces of functions.

For functions on an interval $[a,b]$, an inner product may be defined by

\[ \langle f,g\rangle = \int_a^b f(x)g(x)\,dx. \]

A sequence of functions $\phi_1,\phi_2,\ldots$ is orthonormal if

\[ \langle \phi_i,\phi_j\rangle=\delta_{ij}. \]

For example, trigonometric functions form orthogonal and, after scaling, orthonormal systems on intervals such as $[-\pi,\pi]$. In Fourier analysis, a function is represented by coefficients obtained from inner products:

\[ c_j=\langle f,\phi_j\rangle. \]

This is the infinite-dimensional analogue of coordinates in an orthonormal basis.
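A numerical sketch, approximating the integral inner product on $[-\pi,\pi]$ by a Riemann sum for the scaled functions $\sin(x)/\sqrt{\pi}$ and $\cos(x)/\sqrt{\pi}$:

```python
import numpy as np

n = 200000
x = np.linspace(-np.pi, np.pi, n, endpoint=False)
dx = 2 * np.pi / n

phi1 = np.sin(x) / np.sqrt(np.pi)
phi2 = np.cos(x) / np.sqrt(np.pi)

def inner(f, g):
    """Riemann-sum approximation of the integral inner product."""
    return np.sum(f * g) * dx

print(inner(phi1, phi1))   # approximately 1
print(inner(phi1, phi2))   # approximately 0
```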

49.19 Numerical Importance

Orthonormal bases are important in numerical computation because they control error.

When a matrix has orthonormal columns,

\[ Q^TQ=I, \]

so multiplying by $Q$ does not amplify Euclidean length. This gives stable algorithms for projections, least squares, eigenvalue computations, and matrix factorizations.

In contrast, a poorly conditioned basis may distort coordinates. Small errors in a vector can become large errors in its coordinate representation.

For this reason, numerical linear algebra often replaces arbitrary bases by orthonormal bases. The Gram-Schmidt process, Householder reflections, and Givens rotations are standard methods for doing this.
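A minimal sketch of classical Gram-Schmidt, for illustration only (production code favors Householder-based QR for stability):

```python
import numpy as np

def gram_schmidt(A):
    """Classical Gram-Schmidt: orthonormalize the columns of A.

    Assumes the columns of A are linearly independent.
    """
    m, n = A.shape
    Q = np.zeros((m, n))
    for j in range(n):
        q = A[:, j].astype(float).copy()
        for i in range(j):
            q -= (Q[:, i] @ A[:, j]) * Q[:, i]   # subtract earlier components
        Q[:, j] = q / np.linalg.norm(q)          # normalize
    return Q

A = np.array([[1.0, 1.0], [0.0, 1.0], [1.0, 0.0]])
Q = gram_schmidt(A)
print(np.allclose(Q.T @ Q, np.eye(2)))   # columns are orthonormal
```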

49.20 Summary

An orthonormal basis is a basis $q_1,\ldots,q_n$ satisfying

\[ \langle q_i,q_j\rangle=\delta_{ij}. \]

It gives the expansion

\[ v=\sum_{j=1}^n \langle v,q_j\rangle q_j. \]

Thus coordinates are obtained by inner products.

If $Q$ is the matrix whose columns are an orthonormal basis, then

\[ Q^TQ=I. \]

When $Q$ is square,

\[ Q^{-1}=Q^T. \]

Orthonormal bases simplify projection, least squares, coordinate changes, and norm computations. They preserve geometry and improve numerical stability.