Chapter 32. Linear Transformations

A linear transformation is a function between vector spaces that preserves the two basic operations of linear algebra: vector addition and scalar multiplication. If V and W are vector spaces over the same field F, a function

T : V \to W

is called linear when, for all u, v \in V and all c \in F,

T(u+v)=T(u)+T(v)

and

T(cv)=cT(v).

These two identities are the defining conditions. They say that applying T after forming a sum gives the same result as forming the sum after applying T, and applying T after scaling gives the same result as scaling after applying T. This is the standard definition used in linear algebra texts: linear maps preserve sums and scalar multiplication.

32.1 Functions Between Vector Spaces

A transformation is a function. It assigns to each vector in one space exactly one vector in another space.

If

T : V \to W,

then V is called the domain of T, and W is called the codomain of T. For each v \in V, the vector T(v) lies in W.

For example, define

T : \mathbb{R}^2 \to \mathbb{R}^2

by

T \begin{bmatrix} x \\ y \end{bmatrix} = \begin{bmatrix} 2x \\ 3y \end{bmatrix}.

This transformation doubles the first coordinate and triples the second coordinate.

It is linear because

T(u+v)=T(u)+T(v)

and

T(cu)=cT(u)

for all u, v \in \mathbb{R}^2 and all scalars c.
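This check can be carried out numerically. The sketch below (NumPy; the sample vectors and the scalar are our own choices) tests both identities for this particular T:

```python
import numpy as np

# The map T([x, y]) = [2x, 3y] from the example above.
def T(v):
    x, y = v
    return np.array([2.0 * x, 3.0 * y])

u = np.array([1.0, 2.0])
v = np.array([-3.0, 5.0])
c = 4.0

# Additivity: T(u + v) == T(u) + T(v)
add_ok = np.allclose(T(u + v), T(u) + T(v))
# Homogeneity: T(c u) == c T(u)
scale_ok = np.allclose(T(c * u), c * T(u))
```

A few sample vectors cannot prove linearity, but a single failing pair would disprove it.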

32.2 The Definition of Linearity

The two defining conditions may be combined into one condition.

A function T : V \to W is linear if and only if

T(au+bv)=aT(u)+bT(v)

for all u, v \in V and all scalars a, b \in F.

This condition says that T preserves every linear combination of two vectors. By repeated use, it preserves every finite linear combination:

T(c_1v_1+c_2v_2+\cdots+c_kv_k) = c_1T(v_1)+c_2T(v_2)+\cdots+c_kT(v_k).

This is the central practical meaning of linearity. Once a vector is expressed as a linear combination, its image under T is found by applying T to the pieces and using the same coefficients.

32.3 First Examples

Scaling

Define

T : \mathbb{R}^2 \to \mathbb{R}^2

by

T(v)=3v.

Then

T(u+v)=3(u+v)=3u+3v=T(u)+T(v),

and

T(cu)=3(cu)=c(3u)=cT(u).

Thus T is linear.

Projection Onto an Axis

Define

P : \mathbb{R}^2 \to \mathbb{R}^2

by

P \begin{bmatrix} x \\ y \end{bmatrix} = \begin{bmatrix} x \\ 0 \end{bmatrix}.

This transformation sends every vector to its horizontal component. It is linear.

If

u = \begin{bmatrix} x_1 \\ y_1 \end{bmatrix}, \qquad v = \begin{bmatrix} x_2 \\ y_2 \end{bmatrix},

then

P(u+v) = P \begin{bmatrix} x_1+x_2 \\ y_1+y_2 \end{bmatrix} = \begin{bmatrix} x_1+x_2 \\ 0 \end{bmatrix} = \begin{bmatrix} x_1 \\ 0 \end{bmatrix} + \begin{bmatrix} x_2 \\ 0 \end{bmatrix} = P(u)+P(v).

Also,

P(cu) = P \begin{bmatrix} cx_1 \\ cy_1 \end{bmatrix} = \begin{bmatrix} cx_1 \\ 0 \end{bmatrix} = c \begin{bmatrix} x_1 \\ 0 \end{bmatrix} = cP(u).

Differentiation

Let P_n be the vector space of polynomials of degree at most n. Define

D : P_n \to P_{n-1}

by

D(p)=p'.

The derivative operator is linear because

D(p+q)=(p+q)'=p'+q'=D(p)+D(q),

and

D(cp)=(cp)'=cp'=cD(p).

This example shows that vectors do not have to be lists of numbers. They may be polynomials, functions, or other mathematical objects.
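The same check works for the derivative operator if polynomials are represented by their coefficient arrays. A sketch using NumPy's polynomial helpers (the sample polynomials are our own):

```python
import numpy as np
from numpy.polynomial import polynomial as P

# Polynomials as ascending coefficient arrays [a0, a1, a2]; D(p) = p'.
p = np.array([1.0, 2.0, 3.0])   # 1 + 2x + 3x^2
q = np.array([0.0, -1.0, 4.0])  # -x + 4x^2
c = 5.0

# D(p + q) == D(p) + D(q)
add_ok = np.allclose(P.polyder(p + q), P.polyder(p) + P.polyder(q))
# D(c p) == c D(p)
scale_ok = np.allclose(P.polyder(c * p), c * P.polyder(p))
```

Differentiation here is just a map between coefficient vectors, which is exactly the point: these "vectors" are polynomials.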

32.4 Nonlinear Transformations

A transformation may fail to be linear in several ways.

Define

T : \mathbb{R} \to \mathbb{R}

by

T(x)=x^2.

Then

T(x+y)=(x+y)^2=x^2+2xy+y^2,

while

T(x)+T(y)=x^2+y^2.

These are not equal in general. Therefore T is not linear.

Define another transformation

S : \mathbb{R} \to \mathbb{R}

by

S(x)=x+1.

Then

S(0)=1.

A linear transformation must send the zero vector to the zero vector. Since S(0) \neq 0, the transformation S is not linear.

This gives a quick test: if T(0) \neq 0, then T cannot be linear.
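Both failures are easy to exhibit numerically (the sample inputs are our own):

```python
# T(x) = x^2 breaks additivity; S(x) = x + 1 fails the zero test T(0) = 0.
T = lambda x: x ** 2
S = lambda x: x + 1

x, y = 1.0, 2.0
squares_additive = (T(x + y) == T(x) + T(y))   # 9.0 vs 5.0
shift_fixes_zero = (S(0.0) == 0.0)             # 1.0 vs 0.0
```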

32.5 The Zero Vector Is Preserved

Let T : V \to W be linear. Then

T(0_V)=0_W.

To prove this, use scalar multiplication:

0_V = 0 \cdot v

for any v \in V. Therefore,

T(0_V)=T(0 \cdot v)=0 \cdot T(v)=0_W.

Thus every linear transformation sends the zero vector of its domain to the zero vector of its codomain.

This property is necessary, but not sufficient. A function may send zero to zero and still fail to be linear.

For example,

T(x)=x^2

satisfies T(0)=0, but it is not linear.

32.6 Negatives Are Preserved

If T : V \to W is linear, then

T(-v)=-T(v).

Indeed,

T(-v)=T((-1)v)=(-1)T(v)=-T(v).

Therefore a linear transformation preserves additive inverses.

This also implies

T(u-v)=T(u)-T(v).

The proof is direct:

T(u-v)=T(u+(-v))=T(u)+T(-v)=T(u)-T(v).

32.7 Linear Transformations and Bases

A linear transformation is determined by its values on a basis.

Let V have basis

B=(v_1,v_2,\ldots,v_n).

Every vector v \in V has a unique expression

v=c_1v_1+c_2v_2+\cdots+c_nv_n.

If T : V \to W is linear, then

T(v)=c_1T(v_1)+c_2T(v_2)+\cdots+c_nT(v_n).

Thus, once the images

T(v_1),T(v_2),\ldots,T(v_n)

are known, the value of T on every vector is known.

This is one of the main reasons linear transformations are manageable. A function on an infinite set may appear complicated, but a linear function on a finite-dimensional vector space is completely described by finitely many vectors.
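A short numerical illustration (the basis images T(e1) and T(e2) below are invented for the example):

```python
import numpy as np

# Suppose all we know is T on the standard basis of R^2:
# T(e1) = [1, 1], T(e2) = [0, 2]  (sample values).
T_e1 = np.array([1.0, 1.0])
T_e2 = np.array([0.0, 2.0])

# Any v = c1 e1 + c2 e2 then has image c1 T(e1) + c2 T(e2).
c1, c2 = 3.0, -1.0
Tv = c1 * T_e1 + c2 * T_e2
```

Two vectors of data determine T on all of R^2.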

32.8 Matrix Representation

Every matrix defines a linear transformation.

Let A be an m \times n matrix. Define

T_A : \mathbb{R}^n \to \mathbb{R}^m

by

T_A(x)=Ax.

Then T_A is linear because matrix multiplication satisfies

A(u+v)=Au+Av

and

A(cu)=cAu.

Conversely, every linear transformation

T : \mathbb{R}^n \to \mathbb{R}^m

can be represented by an m \times n matrix.

Let e_1,e_2,\ldots,e_n be the standard basis of \mathbb{R}^n. The matrix of T has columns

T(e_1), T(e_2), \ldots, T(e_n).

Thus

A = \begin{bmatrix} | & | & & | \\ T(e_1) & T(e_2) & \cdots & T(e_n) \\ | & | & & | \end{bmatrix}.

Then

T(x)=Ax

for every x \in \mathbb{R}^n.
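This recipe is easy to run in code. A sketch (NumPy; the map T below is a sample of our own) that assembles the matrix column by column and confirms T(x) = Ax:

```python
import numpy as np

# A sample linear map T : R^3 -> R^2.
def T(x):
    x1, x2, x3 = x
    return np.array([x1 + 2.0 * x3, -x2])

n = 3
basis = np.eye(n)
# Columns of A are T(e1), ..., T(en).
A = np.column_stack([T(basis[:, j]) for j in range(n)])

x = np.array([1.0, 2.0, 3.0])
same = np.allclose(A @ x, T(x))
```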

32.9 Example of a Matrix Transformation

Let

A = \begin{bmatrix} 2 & 1 \\ 0 & 3 \end{bmatrix}.

The associated linear transformation is

T(x)=Ax.

For

x = \begin{bmatrix} x_1 \\ x_2 \end{bmatrix},

we have

T(x) = \begin{bmatrix} 2 & 1 \\ 0 & 3 \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \end{bmatrix} = \begin{bmatrix} 2x_1+x_2 \\ 3x_2 \end{bmatrix}.

The first coordinate of the output is a linear combination of the input coordinates. The second coordinate is another linear combination. This pattern always occurs for matrix transformations.
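The same product, computed numerically (the input vector is our choice):

```python
import numpy as np

A = np.array([[2.0, 1.0],
              [0.0, 3.0]])
x = np.array([4.0, 5.0])
Tx = A @ x   # [2*4 + 1*5, 3*5]
```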

32.10 Kernel

The kernel of a linear transformation is the set of all vectors sent to zero.

If T : V \to W is linear, then

\ker(T)=\{v \in V : T(v)=0_W\}.

The kernel measures the directions that the transformation collapses.

For a matrix transformation T_A(x)=Ax,

\ker(T_A)=\{x : Ax=0\}.

This is the null space of A.

The kernel is a subspace of V. To prove this, let u, v \in \ker(T) and let c be a scalar. Then

T(u+v)=T(u)+T(v)=0+0=0,

so u+v \in \ker(T). Also,

T(cu)=cT(u)=c \cdot 0=0,

so cu \in \ker(T).

Therefore \ker(T) is closed under addition and scalar multiplication.
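Numerically, a basis for the null space can be read off the singular value decomposition: right-singular vectors paired with zero singular values are exactly the kernel directions. A sketch with a sample rank-1 matrix of our own:

```python
import numpy as np

A = np.array([[1.0, 2.0],
              [2.0, 4.0]])   # rank 1, so the kernel is a line

U, s, Vt = np.linalg.svd(A)
kernel_basis = Vt[s < 1e-10]     # rows of Vt whose singular value vanishes

v = kernel_basis[0]
in_kernel = np.allclose(A @ v, 0.0)
```

Here the kernel is the line spanned by a multiple of (2, -1), since x + 2y = 0 characterizes solutions of Ax = 0.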

32.11 Image

The image of a linear transformation is the set of all vectors that occur as outputs.

If T : V \to W is linear, then

\operatorname{im}(T)=\{T(v) : v \in V\}.

The image is also called the range of T.

For a matrix transformation T_A(x)=Ax, the image is the column space of A. It consists of all linear combinations of the columns of A.

The image is a subspace of W. If y_1, y_2 \in \operatorname{im}(T), then there are vectors v_1, v_2 \in V such that

y_1=T(v_1), \qquad y_2=T(v_2).

Then

y_1+y_2=T(v_1)+T(v_2)=T(v_1+v_2),

so y_1+y_2 \in \operatorname{im}(T). Also,

cy_1=cT(v_1)=T(cv_1),

so cy_1 \in \operatorname{im}(T).
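In code, the dimension of the image is the rank of the matrix, and any product Ax lands in the column space by construction (the sample matrix and vector are ours):

```python
import numpy as np

# The image of T_A(x) = Ax is the column space; its dimension is the rank.
A = np.array([[1.0, 2.0, 3.0],
              [0.0, 1.0, 1.0]])
rank = np.linalg.matrix_rank(A)   # here 2: the columns span all of R^2

# b = A @ x is a combination of the columns, so it lies in the image.
x = np.array([1.0, -1.0, 2.0])
b = A @ x
```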

32.12 Injective Transformations

A transformation T : V \to W is injective if different input vectors always have different outputs.

That is,

T(u)=T(v)

implies

u=v.

For linear transformations, injectivity is controlled by the kernel.

A linear transformation T : V \to W is injective if and only if

\ker(T)=\{0\}.

To prove this, suppose T is injective. Since T(0)=0, the only vector that can map to 0 is 0. Hence \ker(T)=\{0\}.

Conversely, suppose \ker(T)=\{0\}. If T(u)=T(v), then

T(u)-T(v)=0.

By linearity,

T(u-v)=0.

Thus

u-v \in \ker(T).

Since the kernel contains only 0,

u-v=0,

so

u=v.

Therefore T is injective.

32.13 Surjective Transformations

A transformation T : V \to W is surjective if every vector in W is hit by T.

That is, for every w \in W, there exists v \in V such that

T(v)=w.

In terms of the image, this says

\operatorname{im}(T)=W.

For a matrix transformation T_A : \mathbb{R}^n \to \mathbb{R}^m, surjectivity means that the columns of A span \mathbb{R}^m.

Thus T_A is surjective exactly when the column space of A is all of the codomain.
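For matrix maps both conditions reduce to rank checks: injectivity means rank(A) = n (trivial kernel), surjectivity means rank(A) = m (columns span the codomain). A sketch with two sample matrices of our own:

```python
import numpy as np

tall = np.array([[1.0, 0.0],
                 [0.0, 1.0],
                 [1.0, 1.0]])           # 3x2: R^2 -> R^3
wide = np.array([[1.0, 0.0, 2.0],
                 [0.0, 1.0, 3.0]])      # 2x3: R^3 -> R^2

injective_tall = np.linalg.matrix_rank(tall) == tall.shape[1]
surjective_tall = np.linalg.matrix_rank(tall) == tall.shape[0]
injective_wide = np.linalg.matrix_rank(wide) == wide.shape[1]
surjective_wide = np.linalg.matrix_rank(wide) == wide.shape[0]
```

The tall map is injective but not surjective; the wide map is surjective but not injective, as dimension counting forces.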

32.14 Isomorphisms

A linear transformation T : V \to W is an isomorphism if it is both injective and surjective.

In this case, every vector in W is the image of exactly one vector in V. The inverse function

T^{-1} : W \to V

exists and is also linear.

When an isomorphism exists between V and W, the two spaces have the same linear structure. They may have different descriptions, but they are equivalent as vector spaces.

For example, the vector space of polynomials of degree at most 2 is isomorphic to \mathbb{R}^3. The map

a+bx+cx^2 \mapsto \begin{bmatrix} a \\ b \\ c \end{bmatrix}

is linear, injective, and surjective.
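A sketch of this coordinate map, with polynomials stored as coefficient tuples (a representation chosen for the example):

```python
import numpy as np

# Coordinate isomorphism P_2 -> R^3 sending a + b x + c x^2 to [a, b, c].
phi = lambda p: np.array(p, dtype=float)

p = (1.0, -2.0, 3.0)          # 1 - 2x + 3x^2
q = (0.0, 4.0, -1.0)          # 4x - x^2
p_plus_q = tuple(a + b for a, b in zip(p, q))

# phi preserves addition, as a linear map must.
additive = np.allclose(phi(p_plus_q), phi(p) + phi(q))
```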

32.15 Composition

If

T : U \to V

and

S : V \to W

are linear transformations, then their composition

S \circ T : U \to W

is also linear.

Indeed,

(S \circ T)(u+v)=S(T(u+v)).

Since T is linear,

T(u+v)=T(u)+T(v).

Therefore,

S(T(u+v))=S(T(u)+T(v)).

Since S is linear,

S(T(u)+T(v))=S(T(u))+S(T(v)).

Hence

(S \circ T)(u+v)=(S \circ T)(u)+(S \circ T)(v).

The scalar condition is similar:

(S \circ T)(cu)=S(T(cu))=S(cT(u))=cS(T(u))=c(S \circ T)(u).

Thus composition preserves linearity.

In matrix form, composition corresponds to matrix multiplication. If T(x)=Ax and S(y)=By, then

(S \circ T)(x)=B(Ax)=(BA)x.
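This identity can be confirmed directly (the matrices A, B and the input are samples of our own):

```python
import numpy as np

A = np.array([[2.0, 1.0], [0.0, 3.0]])   # T(x) = A x
B = np.array([[1.0, -1.0], [4.0, 0.0]])  # S(y) = B y
x = np.array([1.0, 2.0])

# (S o T)(x) = B(Ax) should equal (BA)x.
composed = B @ (A @ x)
product = (B @ A) @ x
same = np.allclose(composed, product)
```

The order matters: S after T corresponds to the product BA, not AB.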

32.16 Inverses

A linear transformation T : V \to W has an inverse exactly when it is an isomorphism.

If T has inverse T^{-1}, then

T^{-1}(T(v))=v

for all v \in V, and

T(T^{-1}(w))=w

for all w \in W.

For matrix transformations, this corresponds to inverse matrices. If

T(x)=Ax

and A is invertible, then

T^{-1}(x)=A^{-1}x.

If A is singular, then the transformation cannot be reversed on all of its codomain.
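A numerical illustration with a sample invertible matrix of our own:

```python
import numpy as np

A = np.array([[2.0, 1.0],
              [1.0, 1.0]])   # invertible: det(A) = 1
A_inv = np.linalg.inv(A)

x = np.array([3.0, -2.0])
# T^{-1}(T(x)) should recover x.
recovered = A_inv @ (A @ x)
ok = np.allclose(recovered, x)
```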

32.17 Rank and Nullity

For a linear transformation T : V \to W, the rank of T is the dimension of its image:

\operatorname{rank}(T)=\dim(\operatorname{im}(T)).

The nullity of T is the dimension of its kernel:

\operatorname{nullity}(T)=\dim(\ker(T)).

If V is finite-dimensional, then the rank-nullity theorem states

\dim(V)=\operatorname{rank}(T)+\operatorname{nullity}(T).

This theorem says that the dimension of the domain splits into two parts: the part visible in the output and the part collapsed to zero.
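The theorem can be checked numerically by counting kernel directions independently of the rank (the sample matrix is ours):

```python
import numpy as np

A = np.array([[1.0, 2.0, 3.0],
              [2.0, 4.0, 6.0]])   # rank 1: the second row is twice the first

rank = np.linalg.matrix_rank(A)

# Count kernel directions directly: right-singular vectors that A sends to 0.
_, _, Vt = np.linalg.svd(A)
nullity = sum(np.allclose(A @ v, 0.0) for v in Vt)

# Rank-nullity: the dimension of the domain splits into rank + nullity.
check = (rank + nullity == A.shape[1])
```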

32.18 Geometric Interpretation

Linear transformations preserve the linear structure of space.

They send lines through the origin to lines through the origin or to the zero vector. They send planes through the origin to planes, lines, or the zero vector. More generally, they send subspaces to subspaces.

They may stretch, shrink, rotate, reflect, shear, project, or collapse dimensions. They cannot translate the origin away from itself. A translation such as

T(x)=x+b

with b \neq 0 is affine, not linear.

For example,

T \begin{bmatrix} x \\ y \end{bmatrix} = \begin{bmatrix} x+1 \\ y \end{bmatrix}

moves every point one unit to the right. It does not preserve the zero vector, so it is not linear.

32.19 Standard Transformations in the Plane

Several important transformations of \mathbb{R}^2 are linear.

A scaling transformation has matrix

\begin{bmatrix} a & 0 \\ 0 & b \end{bmatrix}.

It sends

\begin{bmatrix} x \\ y \end{bmatrix}

to

\begin{bmatrix} ax \\ by \end{bmatrix}.

A reflection across the x-axis has matrix

\begin{bmatrix} 1 & 0 \\ 0 & -1 \end{bmatrix}.

A projection onto the x-axis has matrix

\begin{bmatrix} 1 & 0 \\ 0 & 0 \end{bmatrix}.

A shear has matrix

\begin{bmatrix} 1 & k \\ 0 & 1 \end{bmatrix}.

It sends

\begin{bmatrix} x \\ y \end{bmatrix}

to

\begin{bmatrix} x+ky \\ y \end{bmatrix}.

Each example is linear because the output coordinates are linear expressions in the input coordinates.
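The four matrices above, applied to a sample vector (the parameter values a = 2, b = 5, k = 4 are our own):

```python
import numpy as np

v = np.array([2.0, 3.0])

scale = np.array([[2.0, 0.0], [0.0, 5.0]])       # a = 2, b = 5
reflect_x = np.array([[1.0, 0.0], [0.0, -1.0]])  # reflection across the x-axis
project_x = np.array([[1.0, 0.0], [0.0, 0.0]])   # projection onto the x-axis
shear = np.array([[1.0, 4.0], [0.0, 1.0]])       # shear with k = 4

scaled = scale @ v        # [2x, 5y]
reflected = reflect_x @ v # [x, -y]
projected = project_x @ v # [x, 0]
sheared = shear @ v       # [x + 4y, y]
```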

32.20 Testing Linearity

To test whether a transformation is linear, check the defining identities.

For a transformation T : V \to W, verify that

T(u+v)=T(u)+T(v)

and

T(cu)=cT(u)

for arbitrary vectors u, v and an arbitrary scalar c.

A faster test is to check the combined condition

T(au+bv)=aT(u)+bT(v).

Common signs of nonlinearity include:

Constant shift: T(x)=x+1
Powers of variables: T(x)=x^2
Products of variables: T(x,y)=xy
Absolute values: T(x)=|x|
Trigonometric functions: T(x)=\sin x

These transformations may be important in other parts of mathematics, but they are not linear transformations.
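The combined condition also suggests a quick randomized probe: sample random inputs and test T(au+bv) = aT(u)+bT(v). Passing such a probe is evidence, not a proof, of linearity. A sketch (the function name looks_linear and all parameters are our own):

```python
import numpy as np

def looks_linear(T, dim, trials=100, rng=np.random.default_rng(0)):
    """Heuristically test T : R^dim -> R^dim on random linear combinations."""
    for _ in range(trials):
        u, v = rng.normal(size=dim), rng.normal(size=dim)
        a, b = rng.normal(), rng.normal()
        if not np.allclose(T(a * u + b * v), a * T(u) + b * T(v)):
            return False
    return True

linear_ok = looks_linear(lambda x: 3 * x, 2)      # scaling: passes
square_ok = looks_linear(lambda x: x ** 2, 1)     # squaring: fails
shift_ok = looks_linear(lambda x: x + 1, 1)       # constant shift: fails
```

A single failing sample disproves linearity outright; repeated passes only make linearity plausible.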

32.21 Coordinate Form

Let

T : \mathbb{R}^n \to \mathbb{R}^m

be a transformation. If each component of T(x) is a linear expression in the coordinates of x, then T is linear.

For example,

T(x_1,x_2,x_3)=(2x_1-x_3,\; x_1+4x_2,\; -x_2+5x_3)

is linear.

Its matrix is

A = \begin{bmatrix} 2 & 0 & -1 \\ 1 & 4 & 0 \\ 0 & -1 & 5 \end{bmatrix}.

Then

T(x)=Ax.

The coefficients of the coordinate formulas become the entries of the matrix.
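This correspondence can be verified by applying T to the standard basis vectors and collecting the images as columns, which reproduces the matrix above:

```python
import numpy as np

# The map from the example, written coordinate by coordinate.
def T(x):
    x1, x2, x3 = x
    return np.array([2*x1 - x3, x1 + 4*x2, -x2 + 5*x3], dtype=float)

# Columns of A are the images of e1, e2, e3.
A = np.column_stack([T(e) for e in np.eye(3)])
```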

32.22 Linear Transformations as Structure-Preserving Maps

The word linear does not mean merely that a formula contains straight lines. It means that the transformation preserves the algebraic structure of a vector space.

It preserves sums:

T(u+v)=T(u)+T(v).

It preserves scalar multiples:

T(cv)=cT(v).

It preserves linear combinations:

T\left(\sum_{i=1}^{k} c_i v_i\right) = \sum_{i=1}^{k} c_i T(v_i).

It preserves the zero vector:

T(0)=0.

It preserves subspaces by sending them to subspaces of the codomain.

These properties make linear transformations the natural maps in linear algebra, just as continuous functions are natural maps in topology and homomorphisms are natural maps in algebra.

32.23 Summary

A linear transformation is a function between vector spaces that preserves addition and scalar multiplication. The two defining identities are

T(u+v)=T(u)+T(v)

and

T(cv)=cT(v).

Every matrix gives a linear transformation by T(x)=Ax. Conversely, every linear transformation between finite-dimensional coordinate spaces has a matrix representation. The columns of the matrix are the images of the standard basis vectors.

The kernel consists of all vectors sent to zero. The image consists of all vectors reached by the transformation. Injectivity is equivalent to having trivial kernel, and surjectivity is equivalent to having image equal to the codomain.

Linear transformations are central because they connect algebra, geometry, and computation. They describe the maps that preserve the structure of vector spaces.