Chapter 6. Matrices

A matrix is a rectangular array of entries arranged in rows and columns. In linear algebra, the entries are usually scalars from a field, most often $\mathbb{R}$ or $\mathbb{C}$. Matrices are used to store coefficients, write systems of equations compactly, represent linear transformations, and organize numerical data. The row-column structure is essential because it supports matrix-vector multiplication and matrix multiplication.

6.1 Definition of a Matrix

An $m\times n$ matrix is a rectangular array with $m$ rows and $n$ columns:

A= \begin{bmatrix} a_{11}&a_{12}&\cdots&a_{1n}\\ a_{21}&a_{22}&\cdots&a_{2n}\\ \vdots&\vdots&\ddots&\vdots\\ a_{m1}&a_{m2}&\cdots&a_{mn} \end{bmatrix}.

The entry in row $i$ and column $j$ is written as

a_{ij}.

The first index records the row. The second index records the column.

For example,

A= \begin{bmatrix} 2&-1&4\\ 0&3&5 \end{bmatrix}

is a $2\times 3$ matrix. It has two rows and three columns.
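
As a minimal sketch of this indexing convention, the example matrix can be stored in Python as a list of rows. The nested-list representation and the helper name `entry` are assumptions made for illustration and are not part of the text. Python counts from 0, so the entry $a_{ij}$ sits at index `[i - 1][j - 1]`.

```python
# The 2 x 3 example matrix, stored as a list of rows (an illustrative choice).
A = [
    [2, -1, 4],
    [0, 3, 5],
]

m = len(A)     # number of rows
n = len(A[0])  # number of columns


def entry(M, i, j):
    """Return a_ij using the 1-based indices of the text (hypothetical helper)."""
    return M[i - 1][j - 1]


print((m, n))          # size of A as (rows, columns)
print(entry(A, 1, 3))  # entry in row 1, column 3
```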

6.2 Rows and Columns

The rows of a matrix are horizontal lists. The columns are vertical lists.

For

A= \begin{bmatrix} 2&-1&4\\ 0&3&5 \end{bmatrix},

the rows are

\begin{bmatrix} 2&-1&4 \end{bmatrix}

and

\begin{bmatrix} 0&3&5 \end{bmatrix}.

The columns are

\begin{bmatrix} 2\\ 0 \end{bmatrix}, \qquad \begin{bmatrix} -1\\ 3 \end{bmatrix}, \qquad \begin{bmatrix} 4\\ 5 \end{bmatrix}.

Rows and columns give two different ways to read the same matrix. When a matrix encodes a linear system, each row corresponds to one equation, so row operations act on equations. Linear combinations of the columns describe the action of the matrix on vectors.
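
The two readings can be sketched in Python, again assuming a list-of-rows representation (an illustrative choice, not something the text prescribes). The rows are the stored lists themselves, and the columns are recovered by regrouping entries with the built-in `zip`.

```python
A = [
    [2, -1, 4],
    [0, 3, 5],
]

# Each stored list is one row of the matrix.
rows = [list(r) for r in A]

# zip(*A) groups the first entries of every row, then the second entries,
# and so on, which yields the columns.
columns = [list(c) for c in zip(*A)]

print(rows)
print(columns)
```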

6.3 Size and Shape

The size of a matrix is written as

m\times n.

The number $m$ is the number of rows. The number $n$ is the number of columns.

Matrix shape	Name
$m\times n$	General rectangular matrix
$n\times n$	Square matrix
$1\times n$	Row matrix
$m\times 1$	Column matrix
$1\times 1$	Single-entry matrix

A column vector is often treated as an $n\times 1$ matrix:

x= \begin{bmatrix} x_1\\ x_2\\ \vdots\\ x_n \end{bmatrix}.

A row vector is a $1\times n$ matrix:

r= \begin{bmatrix} r_1&r_2&\cdots&r_n \end{bmatrix}.

6.4 Matrix Equality

Two matrices are equal if they have the same size and the same entries in corresponding positions.

If $A=(a_{ij})$ and $B=(b_{ij})$ are both $m\times n$ matrices, then

A=B

means

a_{ij}=b_{ij}

for every row $i$ and every column $j$.

For example,

\begin{bmatrix} 1&2\\ 3&4 \end{bmatrix} = \begin{bmatrix} 1&2\\ 3&4 \end{bmatrix}.

But

\begin{bmatrix} 1&2\\ 3&4 \end{bmatrix} \ne \begin{bmatrix} 1&3\\ 2&4 \end{bmatrix}.

The entries are the same numbers, but they occur in different positions.

6.5 Zero Matrices

A zero matrix is a matrix whose entries are all zero.

The $m\times n$ zero matrix is written as

0_{m\times n}

or simply $0$ when the size is clear.

For example,

0_{2\times 3} = \begin{bmatrix} 0&0&0\\ 0&0&0 \end{bmatrix}.

The zero matrix is the additive identity for matrices of the same size. If $A$ is an $m\times n$ matrix, then

A+0_{m\times n}=A.

6.6 Matrix Addition

Matrices of the same size can be added entry by entry. If

A=(a_{ij})

and

B=(b_{ij})

are both $m\times n$ matrices, then

A+B=(a_{ij}+b_{ij}).

For example,

\begin{bmatrix} 1&2\\ 3&4 \end{bmatrix} + \begin{bmatrix} 5&6\\ 7&8 \end{bmatrix} = \begin{bmatrix} 6&8\\ 10&12 \end{bmatrix}.

Matrices of different sizes cannot be added.
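
The entrywise rule and the size restriction can be sketched as follows. The function name `add` and the list-of-rows representation are assumptions for illustration; the size check raises an error rather than guessing how to combine mismatched shapes.

```python
def add(A, B):
    """Entrywise sum of two matrices stored as lists of rows."""
    same_size = len(A) == len(B) and all(
        len(ra) == len(rb) for ra, rb in zip(A, B)
    )
    if not same_size:
        raise ValueError("matrices must have the same size")
    return [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]


# The worked example from the text.
S = add([[1, 2], [3, 4]], [[5, 6], [7, 8]])
print(S)
```

Calling `add` on a $1\times 2$ and a $2\times 1$ matrix raises `ValueError`, mirroring the fact that such a sum is undefined.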

6.7 Scalar Multiplication

A scalar multiplies a matrix by multiplying every entry.

If $c$ is a scalar and $A=(a_{ij})$, then

cA=(ca_{ij}).

For example,

3 \begin{bmatrix} 1&-2\\ 0&4 \end{bmatrix} = \begin{bmatrix} 3&-6\\ 0&12 \end{bmatrix}.

Scalar multiplication preserves the shape of the matrix.

6.8 Matrix Subtraction

Matrix subtraction is defined using additive inverses.

If $A$ and $B$ have the same size, then

A-B=A+(-1)B.

For example,

\begin{bmatrix} 7&5\\ 3&1 \end{bmatrix} - \begin{bmatrix} 2&4\\ 1&6 \end{bmatrix} = \begin{bmatrix} 5&1\\ 2&-5 \end{bmatrix}.

Subtraction is entrywise, just like addition: $A-B=(a_{ij}-b_{ij})$.
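
A short sketch, assuming matrices of matching size stored as lists of rows, shows that defining subtraction as $A+(-1)B$ gives the same result as subtracting entry by entry. The helper names `scale`, `add`, and `subtract` are hypothetical.

```python
def scale(c, A):
    """Multiply every entry of A by the scalar c."""
    return [[c * a for a in row] for row in A]


def add(A, B):
    """Entrywise sum; assumes A and B have the same size."""
    return [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]


def subtract(A, B):
    """A - B, defined as A + (-1)B."""
    return add(A, scale(-1, B))


A = [[7, 5], [3, 1]]
B = [[2, 4], [1, 6]]

D = subtract(A, B)
# Direct entrywise subtraction, for comparison.
entrywise = [[a - b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]

print(D)
print(D == entrywise)
```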

6.9 Matrix-Vector Multiplication

Let $A$ be an $m\times n$ matrix and let $x\in F^n$, where $F$ denotes the field of scalars. Then $Ax$ is a vector in $F^m$. If

A= \begin{bmatrix} a_{11}&a_{12}&\cdots&a_{1n}\\ a_{21}&a_{22}&\cdots&a_{2n}\\ \vdots&\vdots&\ddots&\vdots\\ a_{m1}&a_{m2}&\cdots&a_{mn} \end{bmatrix}

and

x= \begin{bmatrix} x_1\\ x_2\\ \vdots\\ x_n \end{bmatrix},

then

Ax= \begin{bmatrix} a_{11}x_1+a_{12}x_2+\cdots+a_{1n}x_n\\ a_{21}x_1+a_{22}x_2+\cdots+a_{2n}x_n\\ \vdots\\ a_{m1}x_1+a_{m2}x_2+\cdots+a_{mn}x_n \end{bmatrix}.

The $i$-th entry of $Ax$ is the dot product of row $i$ of $A$ with the vector $x$.

For example,

\begin{bmatrix} 2&1\\ 3&-1 \end{bmatrix} \begin{bmatrix} 4\\ 5 \end{bmatrix} = \begin{bmatrix} 2\cdot4+1\cdot5\\ 3\cdot4+(-1)\cdot5 \end{bmatrix} = \begin{bmatrix} 13\\ 7 \end{bmatrix}.
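
The row view translates directly into code. The sketch below assumes a list-of-rows matrix and a plain list for the vector; `dot` and `matvec` are illustrative names.

```python
def dot(u, v):
    """Sum of products of corresponding entries."""
    return sum(a * b for a, b in zip(u, v))


def matvec(A, x):
    """Compute Ax by taking the dot product of each row of A with x."""
    if any(len(row) != len(x) for row in A):
        raise ValueError("each row of A must have as many entries as x")
    return [dot(row, x) for row in A]


# The worked example from the text.
y = matvec([[2, 1], [3, -1]], [4, 5])
print(y)
```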

6.10 Column Combination View

Matrix-vector multiplication can also be read by columns. If

A= \begin{bmatrix} |&|&&|\\ a_1&a_2&\cdots&a_n\\ |&|&&| \end{bmatrix},

where $a_1,\ldots,a_n$ are the columns of $A$, then

Ax=x_1a_1+x_2a_2+\cdots+x_na_n.

Thus $Ax$ is a linear combination of the columns of $A$ .

For example,

\begin{bmatrix} 1&3\\ 2&4 \end{bmatrix} \begin{bmatrix} 5\\ 6 \end{bmatrix} = 5 \begin{bmatrix} 1\\ 2 \end{bmatrix} + 6 \begin{bmatrix} 3\\ 4 \end{bmatrix} = \begin{bmatrix} 23\\ 34 \end{bmatrix}.

This viewpoint is central. The equation

Ax=b

asks whether $b$ can be written as a linear combination of the columns of $A$ .
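
The column view can be sketched in the same style. The function below, with the hypothetical name `matvec_by_columns`, accumulates $x_j$ times column $j$ and is compared against the row-by-row computation on the example above.

```python
def matvec_by_columns(A, x):
    """Compute Ax as x_1 a_1 + ... + x_n a_n, where a_j is column j of A."""
    m = len(A)
    result = [0] * m
    for j, x_j in enumerate(x):
        column_j = [A[i][j] for i in range(m)]
        # Add x_j times column j to the running sum.
        result = [r + x_j * a for r, a in zip(result, column_j)]
    return result


A = [[1, 3], [2, 4]]
x = [5, 6]

by_columns = matvec_by_columns(A, x)
by_rows = [sum(a * b for a, b in zip(row, x)) for row in A]

print(by_columns)
print(by_columns == by_rows)
```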

6.11 Matrices as Coefficient Arrays

A system of linear equations can be encoded by a matrix.

The system

\begin{aligned} 2x-y+3z&=5,\\ x+4y-z&=2 \end{aligned}

has coefficient matrix

A= \begin{bmatrix} 2&-1&3\\ 1&4&-1 \end{bmatrix},

unknown vector

x= \begin{bmatrix} x\\ y\\ z \end{bmatrix},

and right-hand side vector

b= \begin{bmatrix} 5\\ 2 \end{bmatrix}.

The system is written as

Ax=b.

The matrix stores the coefficients. The vector multiplication reconstructs the left-hand sides of the equations.
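
As a small check, the sketch below multiplies the coefficient matrix by a candidate vector and compares the result with $b$. The candidate $(x,y,z)=(0,1,2)$ is not given in the text; it is one solution among many, chosen here only for illustration.

```python
A = [
    [2, -1, 3],
    [1, 4, -1],
]
b = [5, 2]

# Assumed candidate values for (x, y, z); not taken from the text.
candidate = [0, 1, 2]

# Each entry of A times the candidate reconstructs one left-hand side.
lhs = [sum(a * v for a, v in zip(row, candidate)) for row in A]
is_solution = lhs == b

print(lhs)
print(is_solution)
```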

6.12 Square Matrices

A square matrix has the same number of rows and columns. An $n\times n$ matrix is called a square matrix of order $n$. Square matrices play a special role because they can represent transformations from $F^n$ to itself, and because notions such as the determinant, trace, eigenvalues, and invertibility are defined for them.

For example,

\begin{bmatrix} 1&2\\ 3&4 \end{bmatrix}

is a $2\times 2$ square matrix.

The matrix

\begin{bmatrix} 1&2&3\\ 4&5&6 \end{bmatrix}

is not square.

6.13 Diagonal Entries

In a square matrix, the main diagonal consists of the entries

a_{11},a_{22},\ldots,a_{nn}.

For example, in

A= \begin{bmatrix} 2&5&1\\ 0&-3&4\\ 7&6&9 \end{bmatrix},

the diagonal entries are

2,\quad -3,\quad 9.

The diagonal is important because many special matrices are defined by conditions on entries off the diagonal or on the diagonal itself.

6.14 Diagonal Matrices

A diagonal matrix is a square matrix whose off-diagonal entries are all zero.

For example,

D= \begin{bmatrix} 2&0&0\\ 0&-1&0\\ 0&0&5 \end{bmatrix}

is diagonal.

Multiplying by a diagonal matrix scales coordinates independently. If

D= \begin{bmatrix} d_1&0&\cdots&0\\ 0&d_2&\cdots&0\\ \vdots&\vdots&\ddots&\vdots\\ 0&0&\cdots&d_n \end{bmatrix},

then

Dx= \begin{bmatrix} d_1x_1\\ d_2x_2\\ \vdots\\ d_nx_n \end{bmatrix}.

Diagonal matrices are among the simplest matrices to understand and compute with.
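
One way to see the simplicity is that $Dx$ needs only the diagonal entries. The sketch below, using the hypothetical helper `diag_matvec` and an arbitrarily chosen vector, compares this shortcut with the full row-by-row product.

```python
def diag_matvec(d, x):
    """Multiply the diagonal matrix with diagonal entries d by the vector x."""
    return [d_i * x_i for d_i, x_i in zip(d, x)]


D = [
    [2, 0, 0],
    [0, -1, 0],
    [0, 0, 5],
]
d = [D[i][i] for i in range(len(D))]  # diagonal entries of D
x = [1, 2, 3]                         # vector chosen for illustration

fast = diag_matvec(d, x)
full = [sum(a * b for a, b in zip(row, x)) for row in D]

print(fast)
print(fast == full)
```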

6.15 Identity Matrix

The $n\times n$ identity matrix is the diagonal matrix with all diagonal entries equal to $1$:

I_n= \begin{bmatrix} 1&0&\cdots&0\\ 0&1&\cdots&0\\ \vdots&\vdots&\ddots&\vdots\\ 0&0&\cdots&1 \end{bmatrix}.

It satisfies

I_nx=x

for every $x\in F^n$.

The identity matrix is the matrix form of the identity transformation. It leaves every vector unchanged.
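
A brief sketch, with the hypothetical helper `identity` and an arbitrarily chosen vector, builds $I_3$ and confirms on that one vector that multiplication leaves it unchanged.

```python
def identity(n):
    """Build the n x n identity matrix as a list of rows."""
    return [[1 if i == j else 0 for j in range(n)] for i in range(n)]


I3 = identity(3)
x = [4, -2, 7]  # vector chosen for illustration

Ix = [sum(a * b for a, b in zip(row, x)) for row in I3]

print(I3)
print(Ix == x)
```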

6.16 Triangular Matrices

A square matrix is upper triangular if all entries below the main diagonal are zero.

For example,

\begin{bmatrix} 2&1&4\\ 0&3&-1\\ 0&0&5 \end{bmatrix}

is upper triangular.

A square matrix is lower triangular if all entries above the main diagonal are zero.

For example,

\begin{bmatrix} 2&0&0\\ 1&3&0\\ 4&-1&5 \end{bmatrix}

is lower triangular.

Triangular matrices are important because systems involving them are easy to solve by substitution: back substitution for upper triangular systems and forward substitution for lower triangular ones.
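
A minimal sketch of back substitution for the upper triangular example follows. It assumes nonzero diagonal entries, and the right-hand side $b$ is a value chosen for illustration rather than taken from the text. The last equation involves only the last unknown, so the unknowns are found from the bottom up.

```python
def back_substitute(U, b):
    """Solve Ux = b for upper triangular U with nonzero diagonal entries."""
    n = len(U)
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        if U[i][i] == 0:
            raise ValueError("zero on the diagonal: no unique solution")
        # Contribution of the unknowns that have already been solved for.
        known = sum(U[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (b[i] - known) / U[i][i]
    return x


U = [
    [2, 1, 4],
    [0, 3, -1],
    [0, 0, 5],
]
b = [8, 5, 5]  # right-hand side chosen for illustration

x = back_substitute(U, b)
print(x)
```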

6.17 Transpose

The transpose of a matrix is obtained by turning rows into columns.

If $A=(a_{ij})$, then the transpose $A^T$ is defined by

(A^T)_{ij}=a_{ji}.

For example,

A= \begin{bmatrix} 1&2&3\\ 4&5&6 \end{bmatrix}

has transpose

A^T= \begin{bmatrix} 1&4\\ 2&5\\ 3&6 \end{bmatrix}.

If $A$ is $m\times n$, then $A^T$ is $n\times m$.
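
The defining rule $(A^T)_{ij}=a_{ji}$ can be sketched directly, assuming a nonempty list-of-rows matrix. The function name `transpose` is an illustrative choice.

```python
def transpose(A):
    """Return the transpose: entry (j, i) of the result is entry (i, j) of A."""
    m, n = len(A), len(A[0])
    return [[A[i][j] for i in range(m)] for j in range(n)]


A = [
    [1, 2, 3],
    [4, 5, 6],
]
At = transpose(A)

print(At)
print((len(At), len(At[0])))  # size of the transpose as (rows, columns)
```

Transposing twice returns the original matrix, which is a quick sanity check on any implementation.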

6.18 Symmetric Matrices

A square matrix is symmetric if

A^T=A.

For example,

\begin{bmatrix} 2&3&-1\\ 3&5&4\\ -1&4&0 \end{bmatrix}

is symmetric.

Symmetric matrices occur throughout geometry, optimization, statistics, and spectral theory. They have especially strong eigenvalue properties, studied later in the book.
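
The condition $A^T=A$ amounts to $a_{ij}=a_{ji}$ for all $i$ and $j$, so it is enough to compare entries above the diagonal with their mirror images. The sketch below uses the hypothetical name `is_symmetric` and exact comparison, which suits integer entries; floating-point data would typically need a tolerance.

```python
def is_symmetric(A):
    """Return True if A is square and a_ij == a_ji for all i, j."""
    n = len(A)
    if any(len(row) != n for row in A):
        return False  # only square matrices can be symmetric
    return all(A[i][j] == A[j][i] for i in range(n) for j in range(i + 1, n))


S = [
    [2, 3, -1],
    [3, 5, 4],
    [-1, 4, 0],
]

print(is_symmetric(S))
print(is_symmetric([[1, 2], [3, 4]]))
```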

6.19 Sparse Matrices

A sparse matrix is a matrix whose entries are mostly zero. Sparse matrices occur in graphs, finite difference methods, finite element methods, optimization, and large-scale data problems. Algorithms often exploit sparsity to reduce memory use and computation.

For example,

\begin{bmatrix} 0&0&5&0\\ 0&0&0&0\\ 3&0&0&0\\ 0&0&0&7 \end{bmatrix}

is sparse.

A dense matrix, by contrast, has a large proportion of nonzero entries. Dense and sparse matrices may represent the same kind of mathematical object, but they require different computational methods.
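
One simple way to exploit sparsity, sketched here as an assumption rather than a recommended format, is to store only the nonzero entries in a dictionary keyed by position. A matrix-vector product then touches three entries instead of sixteen for the example above. The vector `x` is chosen for illustration.

```python
M = [
    [0, 0, 5, 0],
    [0, 0, 0, 0],
    [3, 0, 0, 0],
    [0, 0, 0, 7],
]

# Keep only the nonzero entries, keyed by 0-based (row, column).
nonzeros = {
    (i, j): v
    for i, row in enumerate(M)
    for j, v in enumerate(row)
    if v != 0
}


def sparse_matvec(nonzeros, num_rows, x):
    """Compute the product using only the stored nonzero entries."""
    y = [0] * num_rows
    for (i, j), v in nonzeros.items():
        y[i] += v * x[j]
    return y


x = [1, 2, 3, 4]
y = sparse_matvec(nonzeros, len(M), x)

print(len(nonzeros))
print(y)
```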

6.20 Matrices as Linear Transformations

Every $m\times n$ matrix $A$ defines a function

T:F^n\to F^m

given by

T(x)=Ax.

This function is linear because

A(u+v)=Au+Av

and

A(cv)=cAv.

Thus matrices represent linear transformations. This is one of their most important roles.

For example,

A= \begin{bmatrix} 2&0\\ 0&1 \end{bmatrix}

maps

\begin{bmatrix} x\\ y \end{bmatrix}

to

\begin{bmatrix} 2x\\ y \end{bmatrix}.

Geometrically, it stretches the plane by a factor of $2$ in the first coordinate direction and leaves the second coordinate unchanged.
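
The two linearity identities can be spot-checked numerically for this matrix. The vectors `u`, `v` and the scalar `c` below are arbitrary values chosen for illustration; agreement on a few inputs illustrates the identities but does not prove them.

```python
def matvec(A, x):
    """Compute Ax row by row."""
    return [sum(a * b for a, b in zip(row, x)) for row in A]


A = [[2, 0], [0, 1]]
u = [1, 3]
v = [4, -2]
c = 5

# Additivity: A(u + v) versus Au + Av.
lhs_add = matvec(A, [a + b for a, b in zip(u, v)])
rhs_add = [p + q for p, q in zip(matvec(A, u), matvec(A, v))]

# Homogeneity: A(cv) versus c(Av).
lhs_scale = matvec(A, [c * t for t in v])
rhs_scale = [c * t for t in matvec(A, v)]

print(lhs_add == rhs_add)
print(lhs_scale == rhs_scale)
```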

6.21 Data Matrices

Matrices also organize data.

A data matrix may have observations as rows and features as columns:

X= \begin{bmatrix} \text{observation } 1\\ \text{observation } 2\\ \vdots\\ \text{observation } m \end{bmatrix}.

For example,

X= \begin{bmatrix} 170&65\\ 180&80\\ 160&55 \end{bmatrix}

might record height and weight for three observations.

In this setting, matrix methods are used to transform, compress, compare, and model data. Least squares, covariance matrices, principal component analysis, and many machine learning algorithms use this view.
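
As a small illustration of treating columns as features, the sketch below computes the mean of each column of the example data matrix and subtracts it from every row. Centering of this kind is a common first step before forming a covariance matrix. The variable names are assumptions, and no units are assumed for the data.

```python
X = [
    [170, 65],
    [180, 80],
    [160, 55],
]
m = len(X)  # number of observations

# zip(*X) iterates over columns, so each mean is taken over one feature.
column_means = [sum(col) / m for col in zip(*X)]

# Subtract the feature mean from every observation.
centered = [
    [value - mean for value, mean in zip(row, column_means)]
    for row in X
]

print(column_means)
print(centered)
```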

6.22 Summary

A matrix is a rectangular array of scalars. Its entries are indexed by row and column. Its size is written $m\times n$. Matrices of the same size can be added and subtracted entry by entry, and any matrix can be multiplied by a scalar entry by entry.

The main concepts are:

Concept	Meaning
Entry $a_{ij}$	Entry in row $i$, column $j$
Row	Horizontal list of entries
Column	Vertical list of entries
Size $m\times n$	$m$ rows and $n$ columns
Zero matrix	Matrix with all entries zero
Square matrix	Matrix with equal numbers of rows and columns
Diagonal matrix	Square matrix with zero off-diagonal entries
Identity matrix	Matrix representing the identity transformation
Transpose	Matrix obtained by interchanging rows and columns
Sparse matrix	Matrix with mostly zero entries

Matrices have several simultaneous meanings. They are arrays of numbers, coefficient tables for systems of equations, rules for transforming vectors, and structured containers for data. Subsequent chapters develop their operations in detail.