Chapter 32. Linear Transformations

A linear transformation is a function between vector spaces that preserves the two basic operations of linear algebra: vector addition and scalar multiplication. If V and W are vector spaces over the same field F, a function

T : V \to W

is called linear when, for all u, v \in V and all c \in F,

T(u+v)=T(u)+T(v)

and

T(cv)=cT(v).

These two identities are the defining conditions. They say that applying T after forming a sum gives the same result as forming the sum after applying T, and applying T after scaling gives the same result as scaling after applying T. This is the standard definition used in linear algebra texts: linear maps preserve sums and scalar multiplication.

32.1 Functions Between Vector Spaces

A transformation is a function. It assigns to each vector in one space exactly one vector in another space.

If

T : V \to W,

then V is called the domain of T, and W is called the codomain of T. For each v \in V, the vector T(v) lies in W.

For example, define

T : \mathbb{R}^2 \to \mathbb{R}^2

by

T \begin{bmatrix} x \\ y \end{bmatrix} = \begin{bmatrix} 2x \\ 3y \end{bmatrix}.

This transformation doubles the first coordinate and triples the second coordinate.

It is linear because

T(u+v)=T(u)+T(v)

and

T(cu)=cT(u)

for all u, v \in \mathbb{R}^2 and all scalars c.
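This check can be carried out numerically. The sketch below (NumPy; the sample vectors and the scalar are our own choices) tests both identities for this particular T:

```python
import numpy as np

# The map T([x, y]) = [2x, 3y] from the example above.
def T(v):
    x, y = v
    return np.array([2.0 * x, 3.0 * y])

u = np.array([1.0, 2.0])
v = np.array([-3.0, 5.0])
c = 4.0

# Additivity: T(u + v) == T(u) + T(v)
add_ok = np.allclose(T(u + v), T(u) + T(v))
# Homogeneity: T(c u) == c T(u)
scale_ok = np.allclose(T(c * u), c * T(u))
```

A few sample vectors cannot prove linearity, but a single failing pair would disprove it.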

32.2 The Definition of Linearity

The two defining conditions may be combined into one condition.

A function T : V \to W is linear if and only if

T(au+bv)=aT(u)+bT(v)

for all u, v \in V and all scalars a, b \in F.

This condition says that T preserves every linear combination of two vectors. By repeated use, it preserves every finite linear combination:

T(c_1v_1+c_2v_2+\cdots+c_kv_k) = c_1T(v_1)+c_2T(v_2)+\cdots+c_kT(v_k).

This is the central practical meaning of linearity. Once a vector is expressed as a linear combination, its image under T is found by applying T to the pieces and using the same coefficients.

32.3 First Examples

Scaling

Define

T : \mathbb{R}^2 \to \mathbb{R}^2

by

T(v)=3v.

Then

T(u+v)=3(u+v)=3u+3v=T(u)+T(v),

and

T(cu)=3(cu)=c(3u)=cT(u).

Thus T is linear.

Projection Onto an Axis

Define

P : \mathbb{R}^2 \to \mathbb{R}^2

by

P \begin{bmatrix} x \\ y \end{bmatrix} = \begin{bmatrix} x \\ 0 \end{bmatrix}.

This transformation sends every vector to its horizontal component. It is linear.

If

u = \begin{bmatrix} x_1 \\ y_1 \end{bmatrix}, \qquad v = \begin{bmatrix} x_2 \\ y_2 \end{bmatrix},

then

P(u+v) = P \begin{bmatrix} x_1+x_2 \\ y_1+y_2 \end{bmatrix} = \begin{bmatrix} x_1+x_2 \\ 0 \end{bmatrix} = \begin{bmatrix} x_1 \\ 0 \end{bmatrix} + \begin{bmatrix} x_2 \\ 0 \end{bmatrix} = P(u)+P(v).

Also,

P(cu) = P \begin{bmatrix} cx_1 \\ cy_1 \end{bmatrix} = \begin{bmatrix} cx_1 \\ 0 \end{bmatrix} = c \begin{bmatrix} x_1 \\ 0 \end{bmatrix} = cP(u).

Differentiation

Let P_n be the vector space of polynomials of degree at most n. Define

D : P_n \to P_{n-1}

by

D(p)=p'.

The derivative operator is linear because

D(p+q)=(p+q)'=p'+q'=D(p)+D(q),

and

D(cp)=(cp)'=cp'=cD(p).

This example shows that vectors do not have to be lists of numbers. They may be polynomials, functions, or other mathematical objects.
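The same check works for the derivative operator if polynomials are represented by their coefficient arrays. A sketch using NumPy's polynomial helpers (the sample polynomials are our own):

```python
import numpy as np
from numpy.polynomial import polynomial as P

# Polynomials as ascending coefficient arrays [a0, a1, a2]; D(p) = p'.
p = np.array([1.0, 2.0, 3.0])   # 1 + 2x + 3x^2
q = np.array([0.0, -1.0, 4.0])  # -x + 4x^2
c = 5.0

# D(p + q) == D(p) + D(q)
add_ok = np.allclose(P.polyder(p + q), P.polyder(p) + P.polyder(q))
# D(c p) == c D(p)
scale_ok = np.allclose(P.polyder(c * p), c * P.polyder(p))
```

Differentiation here is just a map between coefficient vectors, which is exactly the point: these "vectors" are polynomials.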

32.4 Nonlinear Transformations

A transformation may fail to be linear in several ways.

Define

T : \mathbb{R} \to \mathbb{R}

by

T(x)=x^2.

Then

T(x+y)=(x+y)^2=x^2+2xy+y^2,

while

T(x)+T(y)=x^2+y^2.

These are not equal in general. Therefore T is not linear.

Define another transformation

S : \mathbb{R} \to \mathbb{R}

by

S(x)=x+1.

Then

S(0)=1.

A linear transformation must send the zero vector to the zero vector. Since S(0) \neq 0, the transformation S is not linear.

This gives a quick test: if T(0) \neq 0, then T cannot be linear.
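Both failures are easy to exhibit numerically (the sample inputs are our own):

```python
# T(x) = x^2 breaks additivity; S(x) = x + 1 fails the zero test T(0) = 0.
T = lambda x: x ** 2
S = lambda x: x + 1

x, y = 1.0, 2.0
squares_additive = (T(x + y) == T(x) + T(y))   # 9.0 vs 5.0
shift_fixes_zero = (S(0.0) == 0.0)             # 1.0 vs 0.0
```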

32.5 The Zero Vector Is Preserved

Let T : V \to W be linear. Then

T(0_V)=0_W.

To prove this, use scalar multiplication:

0_V = 0 \cdot v

for any v \in V. Therefore,

T(0_V)=T(0 \cdot v)=0 \cdot T(v)=0_W.

Thus every linear transformation sends the zero vector of its domain to the zero vector of its codomain.

This property is necessary, but not sufficient. A function may send zero to zero and still fail to be linear.

For example,

T(x)=x^2

satisfies T(0)=0, but it is not linear.

32.6 Negatives Are Preserved

If T : V \to W is linear, then

T(-v)=-T(v).

Indeed,

T(-v)=T((-1)v)=(-1)T(v)=-T(v).

Therefore a linear transformation preserves additive inverses.

This also implies

T(u-v)=T(u)-T(v).

The proof is direct:

T(u-v)=T(u+(-v))=T(u)+T(-v)=T(u)-T(v).

32.7 Linear Transformations and Bases

A linear transformation is determined by its values on a basis.

Let V have basis

B=(v_1,v_2,\ldots,v_n).

Every vector v \in V has a unique expression

v=c_1v_1+c_2v_2+\cdots+c_nv_n.

If T : V \to W is linear, then

T(v)=c_1T(v_1)+c_2T(v_2)+\cdots+c_nT(v_n).

Thus, once the images

T(v_1),T(v_2),\ldots,T(v_n)

are known, the value of T on every vector is known.

This is one of the main reasons linear transformations are manageable. A function on an infinite set may appear complicated, but a linear function on a finite-dimensional vector space is completely described by finitely many vectors.
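A short numerical illustration (the basis images T(e1) and T(e2) below are invented for the example):

```python
import numpy as np

# Suppose all we know is T on the standard basis of R^2:
# T(e1) = [1, 1], T(e2) = [0, 2]  (sample values).
T_e1 = np.array([1.0, 1.0])
T_e2 = np.array([0.0, 2.0])

# Any v = c1 e1 + c2 e2 then has image c1 T(e1) + c2 T(e2).
c1, c2 = 3.0, -1.0
Tv = c1 * T_e1 + c2 * T_e2
```

Two vectors of data determine T on all of R^2.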

32.8 Matrix Representation

Every matrix defines a linear transformation.

Let A be an m \times n matrix. Define

T_A : \mathbb{R}^n \to \mathbb{R}^m

by

T_A(x)=Ax.

Then T_A is linear because matrix multiplication satisfies

A(u+v)=Au+Av

and

A(cu)=cAu.

Conversely, every linear transformation

T : \mathbb{R}^n \to \mathbb{R}^m

can be represented by an m \times n matrix.

Let e_1,e_2,\ldots,e_n be the standard basis of \mathbb{R}^n. The matrix of T has columns

T(e_1), T(e_2), \ldots, T(e_n).

Thus

A = \begin{bmatrix} | & | & & | \\ T(e_1) & T(e_2) & \cdots & T(e_n) \\ | & | & & | \end{bmatrix}.

Then

T(x)=Ax

for every x \in \mathbb{R}^n.
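This recipe is easy to run in code. A sketch (NumPy; the map T below is a sample of our own) that assembles the matrix column by column and confirms T(x) = Ax:

```python
import numpy as np

# A sample linear map T : R^3 -> R^2.
def T(x):
    x1, x2, x3 = x
    return np.array([x1 + 2.0 * x3, -x2])

n = 3
basis = np.eye(n)
# Columns of A are T(e1), ..., T(en).
A = np.column_stack([T(basis[:, j]) for j in range(n)])

x = np.array([1.0, 2.0, 3.0])
same = np.allclose(A @ x, T(x))
```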

32.9 Example of a Matrix Transformation

Let

A = \begin{bmatrix} 2 & 1 \\ 0 & 3 \end{bmatrix}.

The associated linear transformation is

T(x)=Ax.

For

x = \begin{bmatrix} x_1 \\ x_2 \end{bmatrix},

we have

T(x) = \begin{bmatrix} 2 & 1 \\ 0 & 3 \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \end{bmatrix} = \begin{bmatrix} 2x_1+x_2 \\ 3x_2 \end{bmatrix}.

The first coordinate of the output is a linear combination of the input coordinates. The second coordinate is another linear combination. This pattern always occurs for matrix transformations.
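The same product, computed numerically (the input vector is our choice):

```python
import numpy as np

A = np.array([[2.0, 1.0],
              [0.0, 3.0]])
x = np.array([4.0, 5.0])
Tx = A @ x   # [2*4 + 1*5, 3*5]
```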

32.10 Kernel

The kernel of a linear transformation is the set of all vectors sent to zero.

If T : V \to W is linear, then

\ker(T)=\{v \in V : T(v)=0_W\}.

The kernel measures the directions that the transformation collapses.

For a matrix transformation T_A(x)=Ax,

\ker(T_A)=\{x : Ax=0\}.

This is the null space of A.

The kernel is a subspace of V. To prove this, let u, v \in \ker(T) and let c be a scalar. Then

T(u+v)=T(u)+T(v)=0+0=0,

so u+v \in \ker(T). Also,

T(cu)=cT(u)=c \cdot 0=0,

so cu \in \ker(T).

Therefore \ker(T) is closed under addition and scalar multiplication.
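Numerically, a basis for the null space can be read off the singular value decomposition: right-singular vectors paired with zero singular values are exactly the kernel directions. A sketch with a sample rank-1 matrix of our own:

```python
import numpy as np

A = np.array([[1.0, 2.0],
              [2.0, 4.0]])   # rank 1, so the kernel is a line

U, s, Vt = np.linalg.svd(A)
kernel_basis = Vt[s < 1e-10]     # rows of Vt whose singular value vanishes

v = kernel_basis[0]
in_kernel = np.allclose(A @ v, 0.0)
```

Here the kernel is the line spanned by a multiple of (2, -1), since x + 2y = 0 characterizes solutions of Ax = 0.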

32.11 Image

The image of a linear transformation is the set of all vectors that occur as outputs.

If T : V \to W is linear, then

\operatorname{im}(T)=\{T(v) : v \in V\}.

The image is also called the range of T.

For a matrix transformation T_A(x)=Ax, the image is the column space of A. It consists of all linear combinations of the columns of A.

The image is a subspace of W. If y_1, y_2 \in \operatorname{im}(T), then there are vectors v_1, v_2 \in V such that

y_1=T(v_1), \qquad y_2=T(v_2).

Then

y_1+y_2=T(v_1)+T(v_2)=T(v_1+v_2),

so y_1+y_2 \in \operatorname{im}(T). Also,

cy_1=cT(v_1)=T(cv_1),

so cy_1 \in \operatorname{im}(T).
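In code, the dimension of the image is the rank of the matrix, and any product Ax lands in the column space by construction (the sample matrix and vector are ours):

```python
import numpy as np

# The image of T_A(x) = Ax is the column space; its dimension is the rank.
A = np.array([[1.0, 2.0, 3.0],
              [0.0, 1.0, 1.0]])
rank = np.linalg.matrix_rank(A)   # here 2: the columns span all of R^2

# b = A @ x is a combination of the columns, so it lies in the image.
x = np.array([1.0, -1.0, 2.0])
b = A @ x
```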

32.12 Injective Transformations

A transformation T : V \to W is injective if different input vectors always have different outputs.

That is,

T(u)=T(v)

implies

u=v.

For linear transformations, injectivity is controlled by the kernel.

A linear transformation T : V \to W is injective if and only if

\ker(T)=\{0\}.

To prove this, suppose T is injective. Since T(0)=0, the only vector that can map to 0 is 0. Hence \ker(T)=\{0\}.

Conversely, suppose \ker(T)=\{0\}. If T(u)=T(v), then

T(u)-T(v)=0.

By linearity,

T(u-v)=0.

Thus

u-v \in \ker(T).

Since the kernel contains only 0,

u-v=0,

so

u=v.

Therefore T is injective.

32.13 Surjective Transformations

A transformation T : V \to W is surjective if every vector in W is hit by T.

That is, for every w \in W, there exists v \in V such that

T(v)=w.

In terms of the image, this says

\operatorname{im}(T)=W.

For a matrix transformation T_A : \mathbb{R}^n \to \mathbb{R}^m, surjectivity means that the columns of A span \mathbb{R}^m.

Thus T_A is surjective exactly when the column space of A is all of the codomain.
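For matrix maps both conditions reduce to rank checks: injectivity means rank(A) = n (trivial kernel), surjectivity means rank(A) = m (columns span the codomain). A sketch with two sample matrices of our own:

```python
import numpy as np

tall = np.array([[1.0, 0.0],
                 [0.0, 1.0],
                 [1.0, 1.0]])           # 3x2: R^2 -> R^3
wide = np.array([[1.0, 0.0, 2.0],
                 [0.0, 1.0, 3.0]])      # 2x3: R^3 -> R^2

injective_tall = np.linalg.matrix_rank(tall) == tall.shape[1]
surjective_tall = np.linalg.matrix_rank(tall) == tall.shape[0]
injective_wide = np.linalg.matrix_rank(wide) == wide.shape[1]
surjective_wide = np.linalg.matrix_rank(wide) == wide.shape[0]
```

The tall map is injective but not surjective; the wide map is surjective but not injective, as dimension counting forces.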

32.14 Isomorphisms

A linear transformation T : V \to W is an isomorphism if it is both injective and surjective.

In this case, every vector in W is the image of exactly one vector in V. The inverse function

T^{-1} : W \to V

exists and is also linear.

When an isomorphism exists between V and W, the two spaces have the same linear structure. They may have different descriptions, but they are equivalent as vector spaces.

For example, the vector space of polynomials of degree at most 2 is isomorphic to \mathbb{R}^3. The map

a+bx+cx^2 \mapsto \begin{bmatrix} a \\ b \\ c \end{bmatrix}

is linear, injective, and surjective.
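A sketch of this coordinate map, with polynomials stored as coefficient tuples (a representation chosen for the example):

```python
import numpy as np

# Coordinate isomorphism P_2 -> R^3 sending a + b x + c x^2 to [a, b, c].
phi = lambda p: np.array(p, dtype=float)

p = (1.0, -2.0, 3.0)          # 1 - 2x + 3x^2
q = (0.0, 4.0, -1.0)          # 4x - x^2
p_plus_q = tuple(a + b for a, b in zip(p, q))

# phi preserves addition, as a linear map must.
additive = np.allclose(phi(p_plus_q), phi(p) + phi(q))
```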

32.15 Composition

If

T : U \to V

and

S : V \to W

are linear transformations, then their composition

S \circ T : U \to W

is also linear.

Indeed,

(S \circ T)(u+v)=S(T(u+v)).

Since T is linear,

T(u+v)=T(u)+T(v).

Therefore,

S(T(u+v))=S(T(u)+T(v)).

Since S is linear,

S(T(u)+T(v))=S(T(u))+S(T(v)).

Hence

(S \circ T)(u+v)=(S \circ T)(u)+(S \circ T)(v).

The scalar condition is similar:

(S \circ T)(cu)=S(T(cu))=S(cT(u))=cS(T(u))=c(S \circ T)(u).

Thus composition preserves linearity.

In matrix form, composition corresponds to matrix multiplication. If T(x)=Ax and S(y)=By, then

(S \circ T)(x)=B(Ax)=(BA)x.
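This identity can be confirmed directly (the matrices A, B and the input are samples of our own):

```python
import numpy as np

A = np.array([[2.0, 1.0], [0.0, 3.0]])   # T(x) = A x
B = np.array([[1.0, -1.0], [4.0, 0.0]])  # S(y) = B y
x = np.array([1.0, 2.0])

# (S o T)(x) = B(Ax) should equal (BA)x.
composed = B @ (A @ x)
product = (B @ A) @ x
same = np.allclose(composed, product)
```

The order matters: S after T corresponds to the product BA, not AB.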

32.16 Inverses

A linear transformation T : V \to W has an inverse exactly when it is an isomorphism.

If T has inverse T^{-1}, then

T^{-1}(T(v))=v

for all v \in V, and

T(T^{-1}(w))=w

for all w \in W.

For matrix transformations, this corresponds to inverse matrices. If

T(x)=Ax

and A is invertible, then

T^{-1}(x)=A^{-1}x.

If A is singular, then the transformation cannot be reversed on all of its codomain.
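A numerical illustration with a sample invertible matrix of our own:

```python
import numpy as np

A = np.array([[2.0, 1.0],
              [1.0, 1.0]])   # invertible: det(A) = 1
A_inv = np.linalg.inv(A)

x = np.array([3.0, -2.0])
# T^{-1}(T(x)) should recover x.
recovered = A_inv @ (A @ x)
ok = np.allclose(recovered, x)
```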

32.17 Rank and Nullity

For a linear transformation T : V \to W, the rank of T is the dimension of its image:

\operatorname{rank}(T)=\dim(\operatorname{im}(T)).

The nullity of T is the dimension of its kernel:

\operatorname{nullity}(T)=\dim(\ker(T)).

If V is finite-dimensional, then the rank-nullity theorem states

\dim(V)=\operatorname{rank}(T)+\operatorname{nullity}(T).

This theorem says that the dimension of the domain splits into two parts: the part visible in the output and the part collapsed to zero.
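The theorem can be checked numerically by counting kernel directions independently of the rank (the sample matrix is ours):

```python
import numpy as np

A = np.array([[1.0, 2.0, 3.0],
              [2.0, 4.0, 6.0]])   # rank 1: the second row is twice the first

rank = np.linalg.matrix_rank(A)

# Count kernel directions directly: right-singular vectors that A sends to 0.
_, _, Vt = np.linalg.svd(A)
nullity = sum(np.allclose(A @ v, 0.0) for v in Vt)

# Rank-nullity: the dimension of the domain splits into rank + nullity.
check = (rank + nullity == A.shape[1])
```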

32.18 Geometric Interpretation

Linear transformations preserve the linear structure of space.

They send lines through the origin to lines through the origin or to the zero vector. They send planes through the origin to planes, lines, or the zero vector. More generally, they send subspaces to subspaces.

They may stretch, shrink, rotate, reflect, shear, project, or collapse dimensions. They cannot translate the origin away from itself. A translation such as

T(x)=x+b

with b \neq 0 is affine, not linear.

For example,

T \begin{bmatrix} x \\ y \end{bmatrix} = \begin{bmatrix} x+1 \\ y \end{bmatrix}

moves every point one unit to the right. It does not preserve the zero vector, so it is not linear.

32.19 Standard Transformations in the Plane

Several important transformations of \mathbb{R}^2 are linear.

A scaling transformation has matrix

\begin{bmatrix} a & 0 \\ 0 & b \end{bmatrix}.

It sends

\begin{bmatrix} x \\ y \end{bmatrix}

to

\begin{bmatrix} ax \\ by \end{bmatrix}.

A reflection across the x-axis has matrix

\begin{bmatrix} 1 & 0 \\ 0 & -1 \end{bmatrix}.

A projection onto the x-axis has matrix

\begin{bmatrix} 1 & 0 \\ 0 & 0 \end{bmatrix}.

A shear has matrix

\begin{bmatrix} 1 & k \\ 0 & 1 \end{bmatrix}.

It sends

\begin{bmatrix} x \\ y \end{bmatrix}

to

\begin{bmatrix} x+ky \\ y \end{bmatrix}.

Each example is linear because the output coordinates are linear expressions in the input coordinates.
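The four matrices above, applied to a sample vector (the parameter values a = 2, b = 5, k = 4 are our own):

```python
import numpy as np

v = np.array([2.0, 3.0])

scale = np.array([[2.0, 0.0], [0.0, 5.0]])       # a = 2, b = 5
reflect_x = np.array([[1.0, 0.0], [0.0, -1.0]])  # reflection across the x-axis
project_x = np.array([[1.0, 0.0], [0.0, 0.0]])   # projection onto the x-axis
shear = np.array([[1.0, 4.0], [0.0, 1.0]])       # shear with k = 4

scaled = scale @ v        # [2x, 5y]
reflected = reflect_x @ v # [x, -y]
projected = project_x @ v # [x, 0]
sheared = shear @ v       # [x + 4y, y]
```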

32.20 Testing Linearity

To test whether a transformation is linear, check the defining identities.

For a transformation T : V \to W, verify that

T(u+v)=T(u)+T(v)

and

T(cu)=cT(u)

for arbitrary vectors u, v and an arbitrary scalar c.

A faster test is to check the combined condition

T(au+bv)=aT(u)+bT(v).

Common signs of nonlinearity include:

Constant shift: T(x)=x+1
Powers of variables: T(x)=x^2
Products of variables: T(x,y)=xy
Absolute values: T(x)=|x|
Trigonometric functions: T(x)=\sin x

These transformations may be important in other parts of mathematics, but they are not linear transformations.
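The combined condition also suggests a quick randomized probe: sample random inputs and test T(au+bv) = aT(u)+bT(v). Passing such a probe is evidence, not a proof, of linearity. A sketch (the function name looks_linear and all parameters are our own):

```python
import numpy as np

def looks_linear(T, dim, trials=100, rng=np.random.default_rng(0)):
    """Heuristically test T : R^dim -> R^dim on random linear combinations."""
    for _ in range(trials):
        u, v = rng.normal(size=dim), rng.normal(size=dim)
        a, b = rng.normal(), rng.normal()
        if not np.allclose(T(a * u + b * v), a * T(u) + b * T(v)):
            return False
    return True

linear_ok = looks_linear(lambda x: 3 * x, 2)      # scaling: passes
square_ok = looks_linear(lambda x: x ** 2, 1)     # squaring: fails
shift_ok = looks_linear(lambda x: x + 1, 1)       # constant shift: fails
```

A single failing sample disproves linearity outright; repeated passes only make linearity plausible.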

32.21 Coordinate Form

Let

T : \mathbb{R}^n \to \mathbb{R}^m

be a transformation. If each component of T(x) is a linear expression in the coordinates of x, then T is linear.

For example,

T(x_1,x_2,x_3)=(2x_1-x_3,\; x_1+4x_2,\; -x_2+5x_3)

is linear.

Its matrix is

A = \begin{bmatrix} 2 & 0 & -1 \\ 1 & 4 & 0 \\ 0 & -1 & 5 \end{bmatrix}.

Then

T(x)=Ax.

The coefficients of the coordinate formulas become the entries of the matrix.
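This correspondence can be verified by applying T to the standard basis vectors and collecting the images as columns, which reproduces the matrix above:

```python
import numpy as np

# The map from the example, written coordinate by coordinate.
def T(x):
    x1, x2, x3 = x
    return np.array([2*x1 - x3, x1 + 4*x2, -x2 + 5*x3], dtype=float)

# Columns of A are the images of e1, e2, e3.
A = np.column_stack([T(e) for e in np.eye(3)])
```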

32.22 Linear Transformations as Structure-Preserving Maps

The word linear does not mean merely that a formula contains straight lines. It means that the transformation preserves the algebraic structure of a vector space.

It preserves sums:

T(u+v)=T(u)+T(v).

It preserves scalar multiples:

T(cv)=cT(v).

It preserves linear combinations:

T\left(\sum_{i=1}^{k} c_i v_i\right) = \sum_{i=1}^{k} c_i T(v_i).

It preserves the zero vector:

T(0)=0.

It preserves subspaces by sending them to subspaces of the codomain.

These properties make linear transformations the natural maps in linear algebra, just as continuous functions are natural maps in topology and homomorphisms are natural maps in algebra.

32.23 Summary

A linear transformation is a function between vector spaces that preserves addition and scalar multiplication. The two defining identities are

T(u+v)=T(u)+T(v)

and

T(cv)=cT(v).

Every matrix gives a linear transformation by T(x)=Ax. Conversely, every linear transformation between finite-dimensional coordinate spaces has a matrix representation. The columns of the matrix are the images of the standard basis vectors.

The kernel consists of all vectors sent to zero. The image consists of all vectors reached by the transformation. Injectivity is equivalent to having trivial kernel, and surjectivity is equivalent to having image equal to the codomain.

Linear transformations are central because they connect algebra, geometry, and computation. They describe the maps that preserve the structure of vector spaces.