Chapter 35. Composition of Transformations

Composition is the operation of applying one transformation after another. If

T:U\to V

and

S:V\to W

are transformations, then their composition is the transformation

S\circ T:U\to W

defined by

(S\circ T)(u)=S(T(u)).

The order matters. In S\circ T, the transformation T is applied first, and S is applied second. Composition of linear transformations corresponds to multiplication of their representing matrices. If T(x)=Ax and S(y)=By, then

(S\circ T)(x)=B(Ax)=(BA)x.

Thus the matrix of S\circ T is BA. This is the standard reason matrix multiplication is defined in its usual form: it encodes composition of linear maps.
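The identity above can be checked numerically. The following sketch uses NumPy (not something the text assumes), with two arbitrary matrices standing in for A and B:

```python
import numpy as np

# Check that applying T then S to a vector agrees with multiplying
# by the single matrix BA. The entries here are arbitrary examples.
A = np.array([[1., 2.], [3., 4.]])   # matrix of T
B = np.array([[0., 1.], [1., 1.]])   # matrix of S
x = np.array([5., -2.])

step_by_step = B @ (A @ x)   # S(T(x))
one_matrix = (B @ A) @ x     # (BA)x

assert np.allclose(step_by_step, one_matrix)
```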

35.1 Definition of Composition

Let U,V,W be vector spaces over the same field F. Suppose

T:U\to V

and

S:V\to W

are transformations. Their composition is

S\circ T:U\to W,

where

(S\circ T)(u)=S(T(u)).

The expression is read as “S after T.” The vector u first enters T. The output T(u) lies in V, so it can be used as an input for S. The final output S(T(u)) lies in W.

Composition is defined only when the codomain of the first transformation matches the domain of the second transformation, or at least lies inside it.

35.2 Composition of Linear Transformations

If both transformations are linear, then their composition is linear.

Let

T:U\to V

and

S:V\to W

be linear. We prove that

S\circ T:U\to W

is linear.

For u_1,u_2\in U,

(S\circ T)(u_1+u_2) = S(T(u_1+u_2)).

Since T is linear,

T(u_1+u_2)=T(u_1)+T(u_2).

Therefore

S(T(u_1+u_2)) = S(T(u_1)+T(u_2)).

Since S is linear,

S(T(u_1)+T(u_2)) = S(T(u_1))+S(T(u_2)).

Hence

(S\circ T)(u_1+u_2) = (S\circ T)(u_1)+(S\circ T)(u_2).

For scalar multiplication, let c\in F and u\in U. Then

(S\circ T)(cu) = S(T(cu)).

Since T is linear,

T(cu)=cT(u).

Since S is linear,

S(cT(u))=cS(T(u)).

Therefore

(S\circ T)(cu)=c(S\circ T)(u).

So S\circ T is linear.

35.3 Order of Application

The notation

S\circ T

means first apply T, then apply S. This order can be confusing because the symbol S appears on the left but acts second.

The reason is that functions act on what appears to their right:

(S\circ T)(u)=S(T(u)).

The innermost operation is done first. This is the same convention used in ordinary function composition.

For example, let

T(x)=2x

and

S(x)=x+3.

Then

(S\circ T)(x)=S(2x)=2x+3,

while

(T\circ S)(x)=T(x+3)=2x+6.

These are different functions. Composition usually depends on order.
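The order convention can be made concrete with plain functions. This sketch (not part of the text) uses the same T and S as above:

```python
# The text's example: T(x) = 2x and S(x) = x + 3.
def T(x):
    return 2 * x

def S(x):
    return x + 3

def compose(f, g):
    """Return the composition f after g: apply g first, then f."""
    return lambda x: f(g(x))

S_after_T = compose(S, T)   # (S ∘ T)(x) = 2x + 3
T_after_S = compose(T, S)   # (T ∘ S)(x) = 2x + 6

assert S_after_T(10) == 23
assert T_after_S(10) == 26
```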

35.4 Matrix Form

Let

T:F^n\to F^m

have matrix A, and let

S:F^m\to F^p

have matrix B. Thus

T(x)=Ax

and

S(y)=By.

Then

(S\circ T)(x)=S(T(x))=S(Ax)=B(Ax).

By associativity of matrix multiplication,

B(Ax)=(BA)x.

Therefore the matrix of S\circ T is

BA.

The order is important. The matrix for the first transformation appears on the right. The matrix for the second transformation appears on the left.

If

A \in F^{m\times n}

and

B \in F^{p\times m},

then

BA \in F^{p\times n}.

This matches the map

S\circ T:F^n\to F^p.

35.5 Example: Two Transformations in the Plane

Let

T:\mathbb{R}^2\to\mathbb{R}^2

be scaling by 2 in the x-direction and by 3 in the y-direction:

T\begin{bmatrix} x\\ y \end{bmatrix} = \begin{bmatrix} 2x\\ 3y \end{bmatrix}.

Its matrix is

A=\begin{bmatrix} 2 & 0\\ 0 & 3 \end{bmatrix}.

Let

S:\mathbb{R}^2\to\mathbb{R}^2

be the shear

S\begin{bmatrix} x\\ y \end{bmatrix} = \begin{bmatrix} x+y\\ y \end{bmatrix}.

Its matrix is

B=\begin{bmatrix} 1 & 1\\ 0 & 1 \end{bmatrix}.

The composition S\circ T means scale first, then shear. Its matrix is

BA = \begin{bmatrix} 1 & 1\\ 0 & 1 \end{bmatrix} \begin{bmatrix} 2 & 0\\ 0 & 3 \end{bmatrix} = \begin{bmatrix} 2 & 3\\ 0 & 3 \end{bmatrix}.

Thus

(S\circ T)\begin{bmatrix} x\\ y \end{bmatrix} = \begin{bmatrix} 2x+3y\\ 3y \end{bmatrix}.

Now reverse the order. The composition T\circ S means shear first, then scale. Its matrix is

AB = \begin{bmatrix} 2 & 0\\ 0 & 3 \end{bmatrix} \begin{bmatrix} 1 & 1\\ 0 & 1 \end{bmatrix} = \begin{bmatrix} 2 & 2\\ 0 & 3 \end{bmatrix}.

Thus

(T\circ S)\begin{bmatrix} x\\ y \end{bmatrix} = \begin{bmatrix} 2x+2y\\ 3y \end{bmatrix}.

The two results are different. Therefore

S\circ T \neq T\circ S

in this example.
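The two products can be verified numerically. A quick NumPy sketch (NumPy is an assumption, not part of the text) using the exact matrices above:

```python
import numpy as np

# The plane example: scaling matrix A, shear matrix B.
A = np.array([[2., 0.], [0., 3.]])   # scale x by 2, y by 3
B = np.array([[1., 1.], [0., 1.]])   # shear

BA = B @ A   # matrix of S ∘ T (scale first, then shear)
AB = A @ B   # matrix of T ∘ S (shear first, then scale)

assert np.array_equal(BA, np.array([[2., 3.], [0., 3.]]))
assert np.array_equal(AB, np.array([[2., 2.], [0., 3.]]))
assert not np.array_equal(BA, AB)   # the two orders differ
```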

35.6 Noncommutativity

Composition of transformations is generally not commutative.

This means

S\circ T

and

T\circ S

may both be defined and yet produce different transformations.

The same fact appears in matrix algebra as

BA \neq AB

in general.

There are special cases where two transformations commute. For example, two scalings along the same coordinate axes commute. Two rotations about the origin in the plane commute. A transformation always commutes with the identity transformation. But commutativity is a special property, not the default.

Because of this, the order of transformations must be recorded carefully. In geometry, “rotate then project” may differ from “project then rotate.” In computation, “normalize then transform” may differ from “transform then normalize.”

35.7 Associativity of Composition

Composition is associative.

If

R:W\to X, \qquad S:V\to W, \qquad T:U\to V,

then

R\circ(S\circ T)=(R\circ S)\circ T.

Both sides send u\in U to

R(S(T(u))).

Thus parentheses do not affect the final transformation, provided the order of the transformations remains the same.

In matrix form, if T,S,R have matrices A,B,C, respectively, then the matrix of

R\circ S\circ T

is

CBA.

Associativity of composition corresponds to associativity of matrix multiplication:

C(BA)=(CB)A.

This allows long chains of transformations to be grouped in convenient ways.
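Associativity can be spot-checked on random matrices of compatible sizes. A minimal NumPy sketch (the dimensions and entries are arbitrary assumptions):

```python
import numpy as np

# C(BA) = (CB)A for any matrices of compatible sizes.
rng = np.random.default_rng(0)
A = rng.standard_normal((4, 3))   # matrix of T : F^3 -> F^4
B = rng.standard_normal((5, 4))   # matrix of S : F^4 -> F^5
C = rng.standard_normal((2, 5))   # matrix of R : F^5 -> F^2

assert np.allclose(C @ (B @ A), (C @ B) @ A)
```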

35.8 Identity Transformation

For every vector space V, the identity transformation is the map

I_V:V\to V

defined by

I_V(v)=v.

It is linear because

I_V(u+v)=u+v=I_V(u)+I_V(v)

and

I_V(cv)=cv=cI_V(v).

If

T:U\to V

is linear, then

I_V\circ T=T

and

T\circ I_U=T.

Thus identity transformations behave like multiplicative identities under composition.

In coordinates, the identity transformation on F^n has matrix

I_n= \begin{bmatrix} 1 & 0 & \cdots & 0\\ 0 & 1 & \cdots & 0\\ \vdots & \vdots & \ddots & \vdots\\ 0 & 0 & \cdots & 1 \end{bmatrix}.

For any compatible matrix A,

I_mA=A

and

AI_n=A.

35.9 Inverses and Composition

A transformation T:V\to W is invertible if there exists a transformation

T^{-1}:W\to V

such that

T^{-1}\circ T=I_V

and

T\circ T^{-1}=I_W.

An invertible linear transformation always has a linear inverse.

If T(x)=Ax, then T is invertible exactly when A is invertible. In that case,

T^{-1}(x)=A^{-1}x.

The equations

T^{-1}\circ T=I

and

T\circ T^{-1}=I

become

A^{-1}A=I

and

AA^{-1}=I.

The inverse reverses the action of the original transformation.

35.10 Composition and Bases

Let

T:U\to V

and

S:V\to W

be linear. Choose ordered bases A for U, B for V, and C for W.

Then

[T]_{B\leftarrow A}

is the matrix of T, and

[S]_{C\leftarrow B}

is the matrix of S.

The composition

S\circ T:U\to W

has matrix

[S\circ T]_{C\leftarrow A} = [S]_{C\leftarrow B} [T]_{B\leftarrow A}.

This formula is the coordinate version of composition.

The middle basis B must match. The output coordinates of T are expressed in basis B, and these must be valid input coordinates for S. If the bases do not match, a change-of-coordinates matrix is needed.

35.11 Example with Abstract Bases

Let P_2 be the vector space of polynomials of degree at most 2, and let P_1 be the vector space of polynomials of degree at most 1.

Define

D:P_2\to P_1

by differentiation:

D(p)=p'.

Define

E:P_1\to \mathbb{R}

by evaluation at 1:

E(q)=q(1).

Both transformations are linear.

Use the bases

B=(1,x,x^2)

for P_2,

C=(1,x)

for P_1,

and

D_0=(1)

for \mathbb{R}.

The matrix of differentiation is

[D]_{C\leftarrow B} = \begin{bmatrix} 0 & 1 & 0\\ 0 & 0 & 2 \end{bmatrix}.

The matrix of evaluation at 1 is

[E]_{D_0\leftarrow C} = \begin{bmatrix} 1 & 1 \end{bmatrix},

since

E(a+bx)=a+b.

The composition

E\circ D:P_2\to\mathbb{R}

sends a polynomial to the value of its derivative at 1:

(E\circ D)(p)=p'(1).

Its matrix is

[E\circ D]_{D_0\leftarrow B} = [E]_{D_0\leftarrow C} [D]_{C\leftarrow B}.

Compute:

\begin{bmatrix} 1 & 1 \end{bmatrix} \begin{bmatrix} 0 & 1 & 0\\ 0 & 0 & 2 \end{bmatrix} = \begin{bmatrix} 0 & 1 & 2 \end{bmatrix}.

Thus, for

p(x)=a+bx+cx^2,

we have

(E\circ D)(p)=b+2c.

This agrees with direct computation:

p'(x)=b+2cx,

so

p'(1)=b+2c.
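The same matrix computation can be reproduced numerically. A NumPy sketch (NumPy and the sample coefficients are assumptions) using the matrices above:

```python
import numpy as np

# The differentiation and evaluation matrices from the text,
# in the bases B = (1, x, x^2) and C = (1, x).
D = np.array([[0., 1., 0.],
              [0., 0., 2.]])   # [D]_{C<-B}
E = np.array([[1., 1.]])       # [E]_{D0<-C}

ED = E @ D                     # matrix of E ∘ D
assert np.array_equal(ED, np.array([[0., 1., 2.]]))

# For p(x) = a + bx + cx^2 with (a, b, c) = (4, 5, 6):
# p'(1) = b + 2c = 5 + 12 = 17.
coeffs = np.array([4., 5., 6.])
assert (ED @ coeffs)[0] == 17.0
```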

35.12 Powers of a Linear Operator

When a linear transformation maps a vector space to itself,

T:V\to V,

we may compose it with itself.

Define

T^2=T\circ T, \qquad T^3=T\circ T\circ T,

and in general,

T^k=\underbrace{T\circ T\circ \cdots \circ T}_{k\text{ times}}.

Also define

T^0=I_V.

If T has matrix A, then T^k has matrix

A^k.

Powers of transformations occur in discrete dynamics, Markov chains, recurrence relations, graph walks, and iterative numerical methods.

For example, if

x_{k+1}=Ax_k,

then

x_k=A^kx_0.

The behavior of the system depends on the powers of A.
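The equivalence of step-by-step iteration and a single matrix power can be checked directly. A NumPy sketch with an arbitrary example matrix:

```python
import numpy as np

# Iterating x_{k+1} = A x_k is the same as applying the k-th power of A.
A = np.array([[0.5, 1.0],
              [0.0, 0.5]])              # an arbitrary example matrix
x0 = np.array([1.0, 1.0])

x = x0
for _ in range(6):
    x = A @ x                           # six individual steps

# One shot: x_6 = A^6 x_0
assert np.allclose(x, np.linalg.matrix_power(A, 6) @ x0)
```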

35.13 Nilpotent Transformations

A linear operator T:V\to V is nilpotent if some positive power of T is the zero transformation:

T^k=0

for some k\geq 1.

For example, define

T:\mathbb{R}^3\to\mathbb{R}^3

by

T\begin{bmatrix} x\\ y\\ z \end{bmatrix} = \begin{bmatrix} y\\ z\\ 0 \end{bmatrix}.

Then

T^2\begin{bmatrix} x\\ y\\ z \end{bmatrix} = T\begin{bmatrix} y\\ z\\ 0 \end{bmatrix} = \begin{bmatrix} z\\ 0\\ 0 \end{bmatrix},

and

T^3\begin{bmatrix} x\\ y\\ z \end{bmatrix} = T\begin{bmatrix} z\\ 0\\ 0 \end{bmatrix} = \begin{bmatrix} 0\\ 0\\ 0 \end{bmatrix}.

So

T^3=0.

Nilpotent transformations are important in canonical forms, especially Jordan form.
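The shift operator above has an easy matrix form, and its nilpotency can be confirmed numerically. A NumPy sketch:

```python
import numpy as np

# Matrix of the shift operator from the text: (x, y, z) -> (y, z, 0).
N = np.array([[0., 1., 0.],
              [0., 0., 1.],
              [0., 0., 0.]])

assert np.any(np.linalg.matrix_power(N, 2) != 0)   # N^2 is not yet zero
assert np.all(np.linalg.matrix_power(N, 3) == 0)   # but N^3 = 0
```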

35.14 Idempotent Transformations

A linear operator P:V\to V is idempotent if

P^2=P.

This means applying P twice gives the same result as applying it once.

Projections are the main examples. Let

P:\mathbb{R}^3\to\mathbb{R}^3

be projection onto the xy-plane:

P\begin{bmatrix} x\\ y\\ z \end{bmatrix} = \begin{bmatrix} x\\ y\\ 0 \end{bmatrix}.

Then

P^2\begin{bmatrix} x\\ y\\ z \end{bmatrix} = P\begin{bmatrix} x\\ y\\ 0 \end{bmatrix} = \begin{bmatrix} x\\ y\\ 0 \end{bmatrix}.

Thus

P^2=P.

Once a vector has been projected onto the plane, projecting it again changes nothing.
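In matrix form the projection is diagonal, and the idempotent identity is a one-line check. A NumPy sketch (the test vector is an arbitrary assumption):

```python
import numpy as np

# Projection onto the xy-plane, as in the text.
P = np.diag([1., 1., 0.])

assert np.array_equal(P @ P, P)             # P^2 = P

v = np.array([3., -1., 7.])
assert np.array_equal(P @ (P @ v), P @ v)   # projecting twice = projecting once
```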

35.15 Commuting Transformations

Two linear operators

S,T:V\to V

commute if

S\circ T=T\circ S.

In matrix form, if S has matrix B and T has matrix A, this condition becomes

BA=AB.

Commuting transformations can often be studied together. For example, two diagonal matrices of the same size commute, so two transformations that are diagonal with respect to the same basis commute.

However, most pairs of matrices do not commute. The failure of commutativity is a structural feature of matrix algebra.

35.16 Composition and Kernel

Composition affects kernels in a predictable way.

Let

T:U\to V

and

S:V\to W

be linear. If u\in\ker(T), then

T(u)=0.

Therefore

(S\circ T)(u)=S(T(u))=S(0)=0.

So

\ker(T)\subseteq \ker(S\circ T).

The composition may have a larger kernel than T. This happens when T(u) is nonzero but lies in \ker(S).

More precisely,

\ker(S\circ T)=\{u\in U:T(u)\in\ker(S)\}.

Thus the kernel of a composition consists of all vectors that T sends into the kernel of S.
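A hypothetical pair of projections (not from the text) illustrates how the kernel can grow: every output of T lands in the kernel of S, so the composition is the zero map even though neither factor is zero. A NumPy sketch:

```python
import numpy as np

# T projects onto the x-axis, S projects onto the y-axis.
A = np.array([[1., 0.], [0., 0.]])   # matrix of T
B = np.array([[0., 0.], [0., 1.]])   # matrix of S

assert np.all(B @ A == 0)            # S ∘ T is the zero map

e1 = np.array([1., 0.])
assert np.any(A @ e1 != 0)           # e1 is NOT in ker(T) ...
assert np.all(B @ (A @ e1) == 0)     # ... but it IS in ker(S ∘ T)
```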

35.17 Composition and Image

The image of a composition is contained in the image of the second transformation.

For

S\circ T:U\to W,

every output has the form

S(T(u)).

Since T(u)\in V, every such output lies in

\operatorname{im}(S).

Therefore

\operatorname{im}(S\circ T)\subseteq \operatorname{im}(S).

Also,

\operatorname{im}(S\circ T)=S(\operatorname{im}(T)).

This means that the image of the composition is obtained by first taking the image of T, then applying S to that subspace.

Consequently,

\operatorname{rank}(S\circ T)\leq \operatorname{rank}(S)

and

\operatorname{rank}(S\circ T)\leq \operatorname{rank}(T).

Composition cannot have rank larger than either factor.
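The rank inequalities can be probed numerically. A NumPy sketch with arbitrary random matrices (the sizes are assumptions; the inequalities hold whatever the entries are):

```python
import numpy as np

# rank(S ∘ T) can be smaller than both factors but never larger.
rng = np.random.default_rng(1)
A = rng.standard_normal((4, 6))   # matrix of T; rank 4 with probability 1
B = rng.standard_normal((5, 4))   # matrix of S; rank 4 with probability 1

r = np.linalg.matrix_rank(B @ A)
assert r <= np.linalg.matrix_rank(A)
assert r <= np.linalg.matrix_rank(B)
```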

35.18 Invertible Factors

If T:U\to V and S:V\to W are both isomorphisms, then

S\circ T:U\to W

is also an isomorphism.

Its inverse is

(S\circ T)^{-1}=T^{-1}\circ S^{-1}.

The order reverses. To undo S\circ T, one first undoes S, then undoes T.

In matrix form, if A and B are invertible, then

(BA)^{-1}=A^{-1}B^{-1}.

Again, the order reverses.
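The reversal rule is easy to confirm with two small invertible matrices (a sketch; the entries are arbitrary assumptions):

```python
import numpy as np

# (BA)^{-1} = A^{-1} B^{-1}: the order of the inverses reverses.
A = np.array([[2., 1.], [1., 1.]])   # invertible (det = 1)
B = np.array([[1., 3.], [0., 1.]])   # invertible (det = 1)

lhs = np.linalg.inv(B @ A)
rhs = np.linalg.inv(A) @ np.linalg.inv(B)
assert np.allclose(lhs, rhs)

# The wrong order fails in general:
assert not np.allclose(lhs, np.linalg.inv(B) @ np.linalg.inv(A))
```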

35.19 Composition in Geometry

Many geometric operations are naturally described as compositions.

In the plane, a rigid motion fixing the origin may be built from rotations and reflections. A general linear deformation may be built from shears, scalings, rotations, and projections. In three dimensions, transformations used in graphics and robotics are often long products of elementary transformations.

For example, suppose a transformation first scales, then rotates. If the scaling matrix is A and the rotation matrix is R, then the total matrix is

RA.

The rightmost matrix acts first.

This convention is essential in applications. Reversing the order usually gives a different geometric result.
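The scale-then-rotate example above can be sketched in NumPy (a 90-degree rotation and an x-scaling chosen as illustrative assumptions):

```python
import numpy as np

# Scale then rotate has total matrix RA; rotate then scale has AR.
theta = np.pi / 2
R = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])   # rotation by 90 degrees
A = np.diag([2., 1.])                              # scale x by 2

v = np.array([1., 0.])
scale_then_rotate = R @ A @ v                      # stretch to (2,0), then turn
rotate_then_scale = A @ R @ v                      # turn to (0,1), then stretch

assert np.allclose(scale_then_rotate, [0., 2.])
assert np.allclose(rotate_then_scale, [0., 1.])
```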

35.20 Composition in Computation

In numerical computation, composition appears whenever data passes through a sequence of linear stages.

A signal may be filtered, transformed, compressed, and reconstructed. A vector may be projected into a subspace, multiplied by a smaller matrix, and mapped back. An iterative method may repeatedly apply the same operator.

If each stage is linear, then the entire pipeline is linear. Its total matrix is the product of the matrices for the stages, in the reverse order of application.

This gives a compact representation of a chain of linear operations. It also allows analysis of stability, rank, null spaces, eigenvalues, and error propagation.

35.21 Summary

Composition applies transformations in sequence. If

T:U\to V

and

S:V\to W,

then

(S\circ T)(u)=S(T(u)).

If S and T are linear, then S\circ T is linear.

For matrix transformations,

T(x)=Ax, \qquad S(y)=By,

the composition has matrix

BA.

Thus

(S\circ T)(x)=(BA)x.

Composition is associative, but it is generally not commutative. The identity transformation acts as the neutral element, and invertible transformations have inverses under composition.

Kernels and images behave naturally under composition:

\ker(T)\subseteq \ker(S\circ T)

and

\operatorname{im}(S\circ T)\subseteq \operatorname{im}(S).

Composition is therefore not a secondary operation. It explains why matrix multiplication has its form and why linear transformations form an algebraic system under chaining.