Chapter 87. Error Analysis

Error analysis studies how numerical error enters, propagates, and affects a computation. In numerical linear algebra, the exact problem is usually stated over real or complex numbers, while the computed problem is solved in floating point arithmetic. The difference between these two worlds is the source of numerical error.

The purpose of error analysis is not merely to say that an answer is inaccurate. It provides methods for estimating how inaccurate the answer may be, for explaining why the inaccuracy occurs, and for identifying which part of the computation controls it.

For linear algebra, error analysis is especially important because matrix algorithms contain many arithmetic operations. A small local rounding error may remain harmless, or it may be amplified by the structure of the problem or the algorithm. The main distinction is between error caused by the mathematical problem and error caused by the computational method. This distinction is usually expressed through conditioning, forward error, and backward error.

87.1 Sources of Error

Several kinds of error appear in numerical computation.

| Source | Meaning |
|---|---|
| Data error | The input values are measured, estimated, rounded, or truncated |
| Rounding error | Real arithmetic is replaced by floating point arithmetic |
| Truncation error | An infinite or continuous process is replaced by a finite process |
| Discretization error | A continuous model is replaced by a finite-dimensional model |
| Iteration error | An iterative method is stopped before exact convergence |
| Modeling error | The mathematical model approximates a real system |

In numerical linear algebra, the most common sources are data error, rounding error, and iteration error.

For example, when solving

$$Ax = b,$$

the matrix $A$ and the vector $b$ may already contain measurement error. The elimination process then adds rounding error. If the system is solved iteratively, the method may stop before the exact solution is reached.

The computed answer therefore reflects several layers of approximation.

87.2 Exact Solution and Computed Solution

Let $x$ be the exact solution of a problem, and let $\hat{x}$ be the computed approximation.

The goal of error analysis is to compare $x$ and $\hat{x}$.

In exact arithmetic, a direct method such as Gaussian elimination would produce the exact solution, assuming no division by zero occurs. In floating point arithmetic, each arithmetic operation is rounded. The computed solution is therefore the output of a nearby but different arithmetic process.

The basic question is:

$$\text{How close is } \hat{x} \text{ to } x?$$

This question leads to forward error.

87.3 Absolute Error

The absolute error is

$$\|x - \hat{x}\|.$$

It measures the direct distance between the exact solution and the computed solution.

For a scalar value, this becomes

$$|x - \hat{x}|.$$

Absolute error is useful when the physical scale of the problem is fixed.

For example, if a length is measured in meters, then an absolute error of $10^{-3}$ means an error of one millimeter.

But absolute error alone can be misleading. An error of $10^{-3}$ is small when the true value is $10^6$, but large when the true value is $10^{-6}$.

87.4 Relative Error

The relative error is

$$\frac{\|x - \hat{x}\|}{\|x\|}.$$

It measures error relative to the size of the exact solution.

For a scalar value, it becomes

$$\frac{|x - \hat{x}|}{|x|}.$$

Relative error is the standard measure in numerical linear algebra because it respects scale.

For example:

| Exact value | Approximation | Absolute error | Relative error |
|---|---|---|---|
| $10^6$ | $10^6 + 1$ | $1$ | $10^{-6}$ |
| $10^{-6}$ | $2 \cdot 10^{-6}$ | $10^{-6}$ | $1$ |

The same absolute error may have very different meaning depending on the magnitude of the true value.
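The table can be reproduced directly in a few lines of Python (the helper names `abs_err` and `rel_err` are our own, not standard library functions):

```python
# Absolute and relative error for the scalar rows of the table above.
def abs_err(x, xhat):
    return abs(x - xhat)

def rel_err(x, xhat):
    return abs(x - xhat) / abs(x)

print(abs_err(1e6, 1e6 + 1), rel_err(1e6, 1e6 + 1))   # 1.0 1e-06
print(abs_err(1e-6, 2e-6), rel_err(1e-6, 2e-6))       # 1e-06 1.0
```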

87.5 Forward Error

Forward error measures error in the output.

If a problem maps input data $d$ to solution $x = f(d)$, and an algorithm returns $\hat{x}$, then the forward error is

$$\|x - \hat{x}\|.$$

The relative forward error is

$$\frac{\|x - \hat{x}\|}{\|x\|}.$$

Forward error answers the most direct question: how close is the computed answer to the exact answer?

However, forward error is often difficult to evaluate directly because the exact answer is usually unknown. For this reason, forward error is more often bounded than directly computed.

87.6 Backward Error

Backward error asks a different question.

Instead of asking how far the computed answer is from the exact answer, it asks:

What nearby input data would make the computed answer exact?

For a problem $x = f(d)$, the backward error is the smallest perturbation $\Delta d$ such that

$$\hat{x} = f(d + \Delta d).$$

For a linear system, this means finding small perturbations $\Delta A$ and $\Delta b$ such that

$$(A + \Delta A)\hat{x} = b + \Delta b.$$

Backward error is often easier to analyze than forward error because it treats the algorithm as exactly solving a nearby problem. This is the standard viewpoint in much of numerical linear algebra.
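For a linear system, the backward error can in fact be computed from the residual alone: a standard result (due to Rigal and Gaches) gives the smallest normwise relative backward error as $\|r\| / (\|A\|\,\|\hat{x}\| + \|b\|)$. A sketch in the infinity norm (the helper name `backward_error` is our own):

```python
# Normwise relative backward error of an approximate solution xhat,
# computed from the residual via eta = ||r|| / (||A|| ||xhat|| + ||b||)
# in the infinity norm. A small eta means xhat exactly solves a nearby
# system (A + dA) xhat = b + db with relative perturbations of size eta.
def backward_error(A, xhat, b):
    r = [bi - sum(a * t for a, t in zip(row, xhat))
         for row, bi in zip(A, b)]
    norm_v = lambda v: max(abs(t) for t in v)            # vector inf-norm
    norm_A = max(sum(abs(a) for a in row) for row in A)  # matrix inf-norm
    return norm_v(r) / (norm_A * norm_v(xhat) + norm_v(b))

A = [[2.0, 1.0], [1.0, 3.0]]
b = [3.0, 4.0]
print(backward_error(A, [1.0, 1.0], b))     # 0.0: the exact solution
print(backward_error(A, [1.0, 1.001], b))   # small but nonzero
```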

87.7 Residual Error

For the linear system

$$Ax = b,$$

the residual of an approximate solution $\hat{x}$ is

$$r = b - A\hat{x}.$$

The residual measures how much the computed vector fails to satisfy the equation.

If $r = 0$, then $\hat{x}$ exactly solves the original system.

If $r$ is small, then $\hat{x}$ nearly satisfies the equation.

But a small residual does not always imply a small solution error. If $A$ is ill-conditioned, a tiny residual may correspond to a large forward error.
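A tiny concrete instance (the numbers are our own illustration) makes this vivid: for a nearly singular $2 \times 2$ system, a badly wrong vector can still leave an almost negligible residual.

```python
import math

# A nearly singular system whose exact solution is x = (1, 1):
A = [[1.0, 1.0],
     [1.0, 1.0001]]
b = [2.0, 2.0001]
x = [1.0, 1.0]

def matvec(A, v):
    return [sum(a * t for a, t in zip(row, v)) for row in A]

def norm(v):
    return math.sqrt(sum(t * t for t in v))

xhat = [2.0, 0.0]                       # far from the true solution
r = [bi - yi for bi, yi in zip(b, matvec(A, xhat))]
err = [xi - xh for xi, xh in zip(x, xhat)]

print(norm(r))    # about 1e-4: the residual looks excellent
print(norm(err))  # about 1.41: the solution is badly wrong
```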

87.8 Residual and Error Equation

Let $x$ be the exact solution of

$$Ax = b.$$

Let $\hat{x}$ be an approximate solution, and define the error

$$e = x - \hat{x}.$$

Since $Ax = b$, we have

$$Ae = A(x - \hat{x}) = Ax - A\hat{x} = b - A\hat{x} = r.$$

Therefore,

$$Ae = r.$$

If $A$ is invertible, then

$$e = A^{-1} r.$$

This identity connects residual error to forward error.

Taking norms gives

$$\|e\| \le \|A^{-1}\|\,\|r\|.$$

Thus the residual is amplified by at most $\|A^{-1}\|$.

87.9 Relative Error Bound for Linear Systems

For $Ax = b$, the relative error satisfies an important bound.

Since

$$e = A^{-1} r,$$

we have

$$\|e\| \le \|A^{-1}\|\,\|r\|.$$

Also,

$$\|b\| = \|Ax\| \le \|A\|\,\|x\|.$$

Therefore,

$$\frac{\|e\|}{\|x\|} \le \|A\|\,\|A^{-1}\|\,\frac{\|r\|}{\|b\|}.$$

Using the condition number

$$\kappa(A) = \|A\|\,\|A^{-1}\|,$$

we get

$$\frac{\|x - \hat{x}\|}{\|x\|} \le \kappa(A)\,\frac{\|b - A\hat{x}\|}{\|b\|}.$$

This bound explains why residuals must be interpreted together with conditioning.
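The bound can also be checked numerically. The sketch below (all numbers are our own choices) uses a diagonal matrix and the infinity norm, where $\kappa(A)$ reduces to the ratio of the largest to smallest diagonal entry; with the perturbation placed in the "small" direction, the bound is attained up to rounding.

```python
import math

# Verify ||x - xhat|| / ||x|| <= kappa(A) ||r|| / ||b|| for A = diag(d),
# in the infinity norm, where kappa(A) = max|d_i| / min|d_i|.
d = [1.0, 1e-3]
kappa = max(d) / min(d)                  # about 1000

x = [1.0, 1.0]                           # exact solution of diag(d) x = b
b = [di * xi for di, xi in zip(d, x)]
xhat = [1.0, 1.0 + 1e-4]                 # error in the "small" direction

r = [bi - di * xh for bi, di, xh in zip(b, d, xhat)]
norm_inf = lambda v: max(abs(t) for t in v)

lhs = norm_inf([xi - xh for xi, xh in zip(x, xhat)]) / norm_inf(x)
rhs = kappa * norm_inf(r) / norm_inf(b)
print(lhs, rhs)   # the two sides agree up to rounding in this setup
```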

87.10 Error Propagation

Error propagation describes how local errors affect later quantities.

Suppose a computation consists of steps

$$x_0 \to x_1 \to x_2 \to \cdots \to x_k.$$

If each step introduces a perturbation, then the final error depends on how each step amplifies previous perturbations.

Some algorithms damp errors. Some preserve them. Some amplify them severely.

In matrix algorithms, amplification often occurs when:

| Cause | Effect |
|---|---|
| Small pivots | large multipliers |
| Nearly dependent columns | unstable basis representation |
| Subtracting close quantities | cancellation |
| Squaring a condition number | loss of accuracy |
| Non-normal matrices | sensitive eigenvalues |

Error analysis identifies these mechanisms.

87.11 Local Error and Global Error

A local error is introduced at one step of an algorithm.

A global error is the final accumulated error after all steps.

An algorithm may have small local errors but large global error if the local errors are repeatedly amplified.

Iterative methods make this distinction especially clear. Each iteration introduces rounding error and leaves some iteration error. The final error is a combination of both effects.

For a convergent iteration,

$$x_{k+1} = M x_k + c,$$

the error satisfies

$$e_{k+1} = M e_k.$$

If powers of $M$ shrink, error decreases. If powers of $M$ grow, error increases.
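The contraction $e_{k+1} = M e_k$ can be watched directly in a short simulation. The matrix $M$, the vector $c$, and the starting guess below are our own illustration; $\|M\|_\infty = 0.6$ here, so the error shrinks by roughly that factor per step.

```python
# A convergent fixed-point iteration x_{k+1} = M x_k + c.
M = [[0.5, 0.1],
     [0.0, 0.4]]
c = [1.0, 1.0]

def step(v):
    return [sum(m * t for m, t in zip(row, v)) + ci
            for row, ci in zip(M, c)]

# Approximate the fixed point x = M x + c by iterating well past convergence.
x = [0.0, 0.0]
for _ in range(200):
    x = step(x)

xk = [10.0, -10.0]                 # a poor starting guess
errs = []
for _ in range(5):
    errs.append(max(abs(a - t) for a, t in zip(x, xk)))
    xk = step(xk)

print(errs)                        # strictly decreasing error norms
```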

87.12 Rounding Error Model

Floating point arithmetic is commonly modeled by

$$\operatorname{fl}(a \circ b) = (a \circ b)(1+\delta), \qquad |\delta| \le u,$$

where:

| Symbol | Meaning |
|---|---|
| $\operatorname{fl}$ | computed floating point result |
| $\circ$ | one arithmetic operation |
| $u$ | unit roundoff |
| $\delta$ | relative rounding error |

For IEEE double precision, $u = 2^{-53} \approx 1.1 \cdot 10^{-16}$.

This model allows error analysis to treat rounding as a sequence of small multiplicative perturbations.
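The size of $u$ is easy to inspect. In Python, `sys.float_info.epsilon` is the spacing between $1.0$ and the next representable double, which equals $2u$:

```python
import sys

# u = 2**-53 is the unit roundoff for IEEE double precision;
# sys.float_info.epsilon is the gap between 1.0 and the next double,
# which equals 2u.
u = 2.0 ** -53
print(sys.float_info.epsilon == 2 * u)   # True
print(1.0 + u == 1.0)                    # True: u is rounded away
print(1.0 + 2 * u == 1.0)                # False: 2u survives the addition
```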

87.13 Error in Dot Products

A dot product has the form

$$s = x^T y = \sum_{i=1}^n x_i y_i.$$

Computing it requires $n$ multiplications and $n-1$ additions.

Each multiplication and addition introduces rounding error.

A typical bound has the form

$$|\widehat{s} - s| \le \gamma_n \sum_{i=1}^n |x_i|\,|y_i|,$$

where

$$\gamma_n = \frac{nu}{1 - nu},$$

provided $nu < 1$.

This shows that dot product error grows with the dimension and with cancellation in the sum.
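The growth of dot product error, and the bound above, can be observed numerically. Here `math.fsum`, which returns a correctly rounded sum, serves as the reference value (the data is an arbitrary illustration):

```python
import math

# Accumulated rounding in a long dot product, checked against the
# a priori bound gamma_n * sum(|x_i||y_i|) from the text.
n = 10**6
x = [0.1] * n
y = [0.3] * n

naive = 0.0
for xi, yi in zip(x, y):
    naive += xi * yi

ref = math.fsum(xi * yi for xi, yi in zip(x, y))   # correctly rounded

u = 2.0 ** -53                                     # unit roundoff
gamma = n * u / (1 - n * u)
bound = gamma * math.fsum(abs(xi * yi) for xi, yi in zip(x, y))

print(abs(naive - ref))                  # accumulated rounding error
print(abs(naive - ref) <= bound)         # True: the bound holds
```

In practice the observed error sits well below the worst-case bound, because individual rounding errors partially cancel.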

87.14 Error in Matrix-Vector Products

A matrix-vector product computes

$$y = Ax.$$

Each component is a dot product:

$$y_i = \sum_{j=1}^n a_{ij} x_j.$$

Thus the error in $Ax$ is controlled by dot product error.

A standard componentwise form is

$$|\widehat{y} - y| \le \gamma_n\, |A|\,|x|.$$

Here $|A|$ and $|x|$ denote entrywise absolute values.

This bound is useful because it tracks how signs and magnitudes influence cancellation.

87.15 Error in Matrix Multiplication

Matrix multiplication computes

$$C = AB.$$

Each entry is a dot product:

$$c_{ij} = \sum_{k=1}^n a_{ik} b_{kj}.$$

Therefore the computed product $\widehat{C}$ satisfies a bound of the form

$$|\widehat{C} - C| \le \gamma_n\, |A|\,|B|.$$

Matrix multiplication is usually considered numerically stable in the backward sense. The main error grows predictably with dimension and unit roundoff.

87.16 Error in Gaussian Elimination

Gaussian elimination produces rounding errors during elimination and back substitution.

Without pivoting, small pivots may cause large multipliers, and large multipliers may amplify errors.

With partial pivoting, the algorithm is usually stable in practice. The computed solution $\hat{x}$ can often be interpreted as the exact solution of a nearby system

$$(A + \Delta A)\hat{x} = b.$$

For many practical matrices,

$$\frac{\|\Delta A\|}{\|A\|}$$

is on the order of machine precision, up to moderate factors depending on dimension and growth.
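A minimal implementation helps make the elimination and back substitution steps concrete. This is a teaching sketch of Gaussian elimination with partial pivoting, not a replacement for a library solver:

```python
# Solve A x = b by Gaussian elimination with partial pivoting.
# Teaching sketch: no singularity checks, O(n^3) pure Python.
def solve(A, b):
    n = len(A)
    M = [row[:] + [bi] for row, bi in zip(A, b)]   # augmented copy [A | b]
    for k in range(n):
        # Partial pivoting: bring the largest entry in column k to row k.
        p = max(range(k, n), key=lambda i: abs(M[i][k]))
        M[k], M[p] = M[p], M[k]
        for i in range(k + 1, n):
            m = M[i][k] / M[k][k]                  # multiplier, |m| <= 1
            for j in range(k, n + 1):
                M[i][j] -= m * M[k][j]
    # Back substitution on the upper triangular system.
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x

print(solve([[2.0, 1.0], [1.0, 3.0]], [3.0, 4.0]))   # [1.0, 1.0]
```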

87.17 Growth Factor

The growth factor measures how large entries become during elimination.

If entries grow very large compared with entries of the original matrix, rounding error may be amplified.

For Gaussian elimination, the growth factor is often denoted by $\rho$.

A simplified stability estimate has the form

$$\frac{\|\Delta A\|}{\|A\|} \lesssim \rho\, n\, u.$$

Thus partial pivoting helps by controlling pivots, but extreme element growth can still damage accuracy.

87.18 Error in Least Squares Problems

In least squares, the problem is

$$\min_x \|Ax - b\|_2.$$

Several algorithms may solve this problem.

| Method | Numerical behavior |
|---|---|
| Normal equations | may lose accuracy |
| QR factorization | stable |
| SVD | most robust |

The normal equations are

$$A^T A x = A^T b.$$

They square the condition number:

$$\kappa(A^T A) = \kappa(A)^2.$$

Thus, if $A$ is ill-conditioned, normal equations may lose many digits of accuracy.

QR factorization avoids this squaring and is usually preferred.
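The squaring effect is already visible for a diagonal matrix, where the 2-norm condition number is simply the ratio of largest to smallest diagonal entry (the numbers below are our own illustration):

```python
import math

# Condition-number squaring for A = diag(d):
# kappa_2(diag(d)) = max|d_i| / min|d_i|, and A^T A = diag(d_i^2).
d = [1.0, 1e-6]                      # A = diag(1, 1e-6)
kappa_A = max(d) / min(d)            # about 1e6
dd = [di * di for di in d]           # diagonal of A^T A
kappa_AtA = max(dd) / min(dd)        # about 1e12

print(kappa_A, kappa_AtA)
print(math.isclose(kappa_AtA, kappa_A ** 2))   # True
```

With $u \approx 10^{-16}$, a condition number of $10^{12}$ leaves only about four reliable digits, where $10^6$ would have left about ten.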

87.19 Error in Eigenvalue Computations

Eigenvalue error depends strongly on matrix structure.

For symmetric or Hermitian matrices, eigenvalues are well-behaved under perturbation. A small perturbation in the matrix produces a small perturbation in eigenvalues.

For non-normal matrices, eigenvalues may be highly sensitive.

A computed eigenvalue may have a small residual

$$\|A\hat{v} - \hat{\lambda}\hat{v}\|$$

but still be sensitive if the eigenvalue is ill-conditioned.

Thus eigenvalue error analysis must consider both residuals and spectral conditioning.

87.20 Normwise and Componentwise Error

Normwise error measures the size of an error vector or matrix using a norm.

For example,

$$\frac{\|\Delta A\|}{\|A\|}.$$

Componentwise error measures each entry relative to its own scale:

$$\left|\frac{\Delta a_{ij}}{a_{ij}}\right|.$$

Normwise analysis is simpler and widely used.

Componentwise analysis may be sharper when matrix entries have very different magnitudes.

For sparse matrices, scaled systems, and structured problems, componentwise error often gives better practical information.

87.21 A Posteriori Error Estimates

An a posteriori estimate uses quantities computed after the approximate solution is available.

For a linear system, the residual gives a basic a posteriori estimate:

$$r = b - A\hat{x}.$$

If an estimate of $\|A^{-1}\|$ is available, then

$$\|x - \hat{x}\| \le \|A^{-1}\|\,\|r\|.$$

A posteriori estimates are useful because they do not require knowing the exact solution.

They are common in iterative solvers, finite element methods, and adaptive numerical algorithms.

87.22 A Priori Error Estimates

An a priori estimate predicts error before or during computation from known properties of the problem and algorithm.

For example, one may estimate that a backward stable algorithm for $Ax = b$ should produce relative forward error roughly bounded by

$$\kappa(A)\, u.$$

This estimate uses the condition number and unit roundoff, not the actual computed residual.

A priori estimates are useful for choosing algorithms and precision before computation begins.

87.23 Mixed Error Analysis

Mixed error analysis combines forward and backward viewpoints.

An algorithm may not be purely backward stable, but it may produce a result close to the exact solution of a nearby problem.

This kind of analysis is useful when strict backward stability is too strong or unavailable.

Many practical algorithms are understood through a mixture of:

| Type | Measures |
|---|---|
| Forward analysis | output error |
| Backward analysis | input perturbation |
| Mixed analysis | both output and input perturbation |

87.24 Error Bounds and Practical Accuracy

An error bound is often conservative.

Worst-case analysis protects against all possible rounding directions, but real rounding errors may partially cancel.

Thus theoretical bounds may overestimate actual error.

Still, error bounds are valuable because they identify the correct controlling quantities:

  • unit roundoff,
  • problem dimension,
  • condition number,
  • growth factor,
  • residual,
  • spectral gaps,
  • cancellation.

A good bound explains both when an algorithm should work and when it may fail.

87.25 Error Analysis Workflow

A typical error analysis follows these steps.

| Step | Question |
|---|---|
| Define exact problem | What mathematical problem is being solved? |
| Define computed result | What does the algorithm actually return? |
| Identify local errors | Where does rounding enter? |
| Propagate errors | How are errors amplified? |
| Bound backward error | What nearby problem was solved? |
| Relate to conditioning | What forward error follows? |
| Interpret residuals | Does the output satisfy the equations? |
| State limits | What accuracy is realistically possible? |

This workflow separates numerical facts from algorithmic artifacts.

87.26 Example: Residual Versus Solution Error

Consider a system

$$Ax = b$$

with

$$\kappa(A) = 10^{10}.$$

Suppose a computed solution has relative residual

$$\frac{\|b - A\hat{x}\|}{\|b\|} = 10^{-14}.$$

The relative solution error may be bounded by approximately

$$10^{10} \cdot 10^{-14} = 10^{-4}.$$

Thus the residual appears extremely small, but the solution may have only about four correct decimal digits.

This example shows why residuals alone are insufficient.

87.27 Error Analysis and Algorithm Design

Error analysis directly influences algorithm design.

For example:

| Problem | Error-aware choice |
|---|---|
| Linear systems | use pivoting |
| Least squares | prefer QR or SVD over normal equations |
| Orthogonalization | prefer Householder or modified Gram-Schmidt |
| Eigenvalues | reduce by orthogonal similarity transformations |
| Summation | use pairwise or compensated summation |
| Scaling | equilibrate poorly scaled matrices |

Stable algorithms are designed so that unavoidable rounding errors are not unnecessarily amplified.
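As one example from the table, compensated (Kahan) summation carries a small correction term that recaptures the low-order bits lost in each addition. A standard sketch:

```python
import math

# Kahan (compensated) summation: the correction c recovers the
# low-order bits that ordinary addition discards at each step.
def kahan_sum(xs):
    s = 0.0
    c = 0.0                      # running compensation
    for x in xs:
        y = x - c                # apply the pending correction
        t = s + y                # low-order bits of y may be lost here
        c = (t - s) - y          # recover exactly what was lost
        s = t
    return s

# Compare against math.fsum, which is correctly rounded.
xs = [0.1] * 10**5
naive_err = abs(sum(xs) - math.fsum(xs))
kahan_err = abs(kahan_sum(xs) - math.fsum(xs))
print(naive_err, kahan_err)      # compensated error is far smaller
```

The error of compensated summation is bounded by roughly $2u \sum |x_i|$, independent of $n$, whereas naive summation degrades as $n$ grows.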

87.28 Summary

Error analysis explains how computed answers differ from exact mathematical answers.

The main quantities are:

| Concept | Meaning |
|---|---|
| Absolute error | direct size of error |
| Relative error | error measured against scale |
| Forward error | error in the computed output |
| Backward error | input perturbation that makes the output exact |
| Residual | failure to satisfy the original equation |
| Unit roundoff | basic scale of floating point rounding |
| Error bound | theoretical estimate of possible error |

For linear systems, a central relation is

$$\frac{\|x - \hat{x}\|}{\|x\|} \le \kappa(A)\,\frac{\|b - A\hat{x}\|}{\|b\|}.$$

Error analysis connects floating point arithmetic, algorithm stability, and mathematical conditioning. It gives the language needed to explain why an algorithm succeeds, why it fails, and how much trust should be placed in its result.