
Chapter 90. Gauss-Seidel Method

The Gauss-Seidel method is a stationary iterative method for solving a linear system

Ax = b.

It is closely related to the Jacobi method. Both methods solve each equation for one unknown and repeat the process. The difference is that Gauss-Seidel uses the newest available values immediately. When the method computes x_i^{(k+1)}, it uses the already updated values x_1^{(k+1)}, \ldots, x_{i-1}^{(k+1)} and the old values x_{i+1}^{(k)}, \ldots, x_n^{(k)}.

This small change often improves convergence. It also makes the method less naturally parallel than Jacobi.

Gauss-Seidel is guaranteed to converge for important matrix classes, including strictly diagonally dominant matrices and symmetric positive definite matrices. More generally, convergence is controlled by the spectral radius of its iteration matrix.

90.1 The Basic System

Let

A= \begin{bmatrix} a_{11} & a_{12} & \cdots & a_{1n}\\ a_{21} & a_{22} & \cdots & a_{2n}\\ \vdots & \vdots & \ddots & \vdots\\ a_{n1} & a_{n2} & \cdots & a_{nn} \end{bmatrix}, \qquad x= \begin{bmatrix} x_1\\ x_2\\ \vdots\\ x_n \end{bmatrix}, \qquad b= \begin{bmatrix} b_1\\ b_2\\ \vdots\\ b_n \end{bmatrix}.

The system

Ax = b

means

a_{i1}x_1+a_{i2}x_2+\cdots+a_{in}x_n=b_i, \qquad i=1,\ldots,n.

Assume each diagonal entry is nonzero:

a_{ii}\ne 0.

Solving the i-th equation for x_i gives

x_i = \frac{1}{a_{ii}} \left( b_i-\sum_{j<i}a_{ij}x_j-\sum_{j>i}a_{ij}x_j \right).

The Gauss-Seidel method uses this identity as an update rule.

90.2 Gauss-Seidel Update Formula

Starting with an initial guess

x^{(0)},

the Gauss-Seidel method defines

x_i^{(k+1)} = \frac{1}{a_{ii}} \left( b_i - \sum_{j<i}a_{ij}x_j^{(k+1)} - \sum_{j>i}a_{ij}x_j^{(k)} \right), \qquad i=1,\ldots,n.

The first sum uses new values from the current iteration.

The second sum uses old values from the previous iteration.

This is the defining feature of Gauss-Seidel.

90.3 Comparison with Jacobi

Jacobi computes all new components from old components:

x_i^{(k+1)} = \frac{1}{a_{ii}} \left( b_i-\sum_{j\ne i}a_{ij}x_j^{(k)} \right).

Gauss-Seidel computes components sequentially and immediately reuses updated values:

x_i^{(k+1)} = \frac{1}{a_{ii}} \left( b_i - \sum_{j<i}a_{ij}x_j^{(k+1)} - \sum_{j>i}a_{ij}x_j^{(k)} \right).

| Feature | Jacobi | Gauss-Seidel |
| --- | --- | --- |
| Update style | simultaneous | sequential |
| Uses newest values | no | yes |
| Needs separate new vector | yes | no |
| Parallelism | high | lower |
| Typical convergence | slower | faster |
| Iteration matrix | -D^{-1}(L+U) | -(D+L)^{-1}U |

Gauss-Seidel may be viewed as a Jacobi-like method with immediate feedback.

90.4 Example

Consider

\begin{aligned} 4x_1+x_2 &= 9,\\ x_1+3x_2 &= 7. \end{aligned}

Solving each equation for its diagonal variable gives

x_1=\frac{9-x_2}{4}, \qquad x_2=\frac{7-x_1}{3}.

The Gauss-Seidel iteration is

x_1^{(k+1)} = \frac{9-x_2^{(k)}}{4}, \qquad x_2^{(k+1)} = \frac{7-x_1^{(k+1)}}{3}.

Start with

x^{(0)} = \begin{bmatrix} 0\\ 0 \end{bmatrix}.

First update:

x_1^{(1)}=\frac{9-0}{4}=2.25.

Then use this new value immediately:

x_2^{(1)}=\frac{7-2.25}{3} = 1.5833\ldots

Thus

x^{(1)} = \begin{bmatrix} 2.25\\ 1.5833 \end{bmatrix}.

Second update:

x_1^{(2)} = \frac{9-1.5833}{4} = 1.8542\ldots, \qquad x_2^{(2)} = \frac{7-1.8542}{3} = 1.7153\ldots

The exact solution is

x= \begin{bmatrix} 2\\ 5/3 \end{bmatrix}.

The iterates move toward the exact solution.
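These two sweeps can be verified numerically. The following Python sketch simply hard-codes the two solved equations of this example:

```python
# Numerical check of the worked 2x2 example (a small sketch, not library code).
def gs_step(x1, x2):
    x1 = (9 - x2) / 4          # solve the first equation for x1
    x2 = (7 - x1) / 3          # reuse the new x1 immediately
    return x1, x2

x1, x2 = 0.0, 0.0              # x^(0) = (0, 0)
x1, x2 = gs_step(x1, x2)       # x^(1) ≈ (2.25, 1.5833)
print(x1, x2)
x1, x2 = gs_step(x1, x2)       # x^(2) ≈ (1.8542, 1.7153)
print(x1, x2)
```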

90.5 Matrix Splitting

Write

A = D + L + U,

where:

| Symbol | Meaning |
| --- | --- |
| D | diagonal part of A |
| L | strictly lower triangular part of A |
| U | strictly upper triangular part of A |

Then

Ax=b

becomes

(D+L+U)x=b.

Move the strictly upper triangular part to the right:

(D+L)x=-Ux+b.

The Gauss-Seidel iteration is

(D+L)x^{(k+1)} = -Ux^{(k)}+b.

Since D+L is lower triangular, each iteration requires one triangular solve.
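One iteration in this split form amounts to forward substitution through D+L with -Ux^{(k)}+b on the right-hand side. A minimal Python sketch (dense storage as plain lists) makes this explicit:

```python
# One Gauss-Seidel iteration written as the triangular solve
# (D + L) x_new = -U x_old + b (dense rows as plain lists; a sketch).
def gs_matrix_form(A, b, x_old):
    n = len(b)
    x_new = [0.0] * n
    for i in range(n):
        # right-hand side entry: b_i minus the strictly upper part times x_old
        rhs = b[i] - sum(A[i][j] * x_old[j] for j in range(i + 1, n))
        # forward substitution through D + L uses already computed x_new entries
        rhs -= sum(A[i][j] * x_new[j] for j in range(i))
        x_new[i] = rhs / A[i][i]
    return x_new

A = [[4.0, 1.0], [1.0, 3.0]]
b = [9.0, 7.0]
print(gs_matrix_form(A, b, [0.0, 0.0]))  # same result as the componentwise sweep
```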

90.6 Fixed-Point Form

The iteration may be written as

x^{(k+1)} = -(D+L)^{-1}Ux^{(k)} + (D+L)^{-1}b.

Thus the iteration matrix is

G_{GS}=-(D+L)^{-1}U.

The fixed-point form is

x^{(k+1)}=G_{GS}x^{(k)}+c,

where

c=(D+L)^{-1}b.

If the sequence converges to a vector x, then

x=G_{GS}x+c.

Multiplying through by D+L, we recover

(D+L+U)x=b.

Hence the limit solves the original system.

90.7 Error Recurrence

Let x be the exact solution. Since

x=G_{GS}x+c

and

x^{(k+1)}=G_{GS}x^{(k)}+c,

subtracting gives

x-x^{(k+1)} = G_{GS}(x-x^{(k)}).

With

e^{(k)}=x-x^{(k)},

we obtain

e^{(k+1)}=G_{GS}e^{(k)}.

Therefore,

e^{(k)}=G_{GS}^{k}e^{(0)}.

Convergence depends on whether powers of G_{GS} approach zero.

90.8 Convergence Criterion

Gauss-Seidel converges for every initial guess if and only if

\rho(G_{GS})<1,

where \rho(G_{GS}) is the spectral radius of the iteration matrix. For a general splitting A=M-N, the corresponding stationary iteration converges when the spectral radius of M^{-1}N is less than one.

For Gauss-Seidel,

M=D+L, \qquad N=-U,

so

M^{-1}N=-(D+L)^{-1}U.

Thus the convergence condition becomes

\rho\left(-(D+L)^{-1}U\right)<1.
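For the 2x2 example of Section 90.4 the spectral radius of G_{GS} can be computed directly. The sketch below stays dependency-free by using explicit 2x2 algebra (trace and determinant, real eigenvalues assumed for this example); with NumPy one would instead call np.linalg.eigvals:

```python
# Spectral-radius check of G_GS = -(D+L)^{-1} U for the 2x2 example.
A = [[4.0, 1.0], [1.0, 3.0]]

# For 2x2, (D+L) = [[a,0],[c,d]] has inverse [[1/a,0],[-c/(a*d),1/d]].
a, d, c = A[0][0], A[1][1], A[1][0]
inv_DL = [[1/a, 0.0], [-c/(a*d), 1/d]]

U = [[0.0, A[0][1]], [0.0, 0.0]]  # strictly upper triangular part

# G = -(D+L)^{-1} U
G = [[-sum(inv_DL[i][k] * U[k][j] for k in range(2)) for j in range(2)]
     for i in range(2)]

# Eigenvalues of a 2x2 matrix from trace and determinant.
tr = G[0][0] + G[1][1]
det = G[0][0] * G[1][1] - G[0][1] * G[1][0]
disc = (tr * tr - 4 * det) ** 0.5
rho = max(abs((tr + disc) / 2), abs((tr - disc) / 2))
print(rho)  # well below 1, so Gauss-Seidel converges for this system
```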

90.9 Strict Diagonal Dominance

A matrix is strictly row diagonally dominant if

|a_{ii}| > \sum_{j\ne i}|a_{ij}|

for every row i.

If A is strictly diagonally dominant, then the Gauss-Seidel method converges. This is a standard sufficient condition.

Strict diagonal dominance means that each equation is controlled mainly by its own variable. The off-diagonal coupling is not strong enough to prevent contraction of the iteration error.

This condition is sufficient, but not necessary. Gauss-Seidel may converge even when strict diagonal dominance fails.
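A row-wise check of strict diagonal dominance is straightforward. The helper below is an illustrative sketch:

```python
# Check |a_ii| > sum_{j != i} |a_ij| for every row (strict row dominance).
def is_strictly_diagonally_dominant(A):
    return all(
        abs(row[i]) > sum(abs(v) for j, v in enumerate(row) if j != i)
        for i, row in enumerate(A)
    )

print(is_strictly_diagonally_dominant([[4.0, 1.0], [1.0, 3.0]]))  # True
print(is_strictly_diagonally_dominant([[1.0, 2.0], [2.0, 1.0]]))  # False
```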

90.10 Symmetric Positive Definite Matrices

Another important convergence result applies to symmetric positive definite matrices.

If A is symmetric positive definite, then Gauss-Seidel converges.

This class appears frequently in applications, including:

| Source | Matrix type |
| --- | --- |
| finite difference discretizations | sparse SPD matrices |
| finite element methods | sparse SPD matrices |
| quadratic minimization | Hessian matrices |
| graph Laplacians with constraints | positive definite reductions |

The SPD case is especially important because it also supports conjugate gradient methods.

90.11 Geometric Interpretation

Gauss-Seidel may be interpreted as solving one coordinate direction at a time.

At each step, the method adjusts x_i so that the i-th equation is satisfied using the newest known values.

After updating x_1, the first equation is satisfied relative to the current state. After updating x_2, the second equation is satisfied relative to the updated x_1, and so on.

The method sweeps through the equations. Each sweep reduces error when the coupling structure is favorable.

90.12 Residual Form

The residual at step k is

r^{(k)}=b-Ax^{(k)}.

Gauss-Seidel can be interpreted as applying a lower-triangular correction.

From

(D+L)x^{(k+1)} = -Ux^{(k)}+b,

subtract

(D+L)x^{(k)}

from both sides:

(D+L)(x^{(k+1)}-x^{(k)}) = b-(D+L+U)x^{(k)}.

Thus

(D+L)(x^{(k+1)}-x^{(k)}) = r^{(k)}.

Therefore,

x^{(k+1)} = x^{(k)}+(D+L)^{-1}r^{(k)}.

Gauss-Seidel updates the current approximation by applying a lower-triangular solve to the residual.
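The following sketch performs one sweep in exactly this residual-correction form: compute the residual, solve the lower triangular system for the correction by forward substitution, then update the approximation:

```python
# Residual-correction form of one sweep: solve (D+L) dx = r by forward
# substitution, then add the correction (dense rows as lists; a sketch).
def gs_residual_step(A, b, x):
    n = len(b)
    r = [b[i] - sum(A[i][j] * x[j] for j in range(n)) for i in range(n)]
    dx = [0.0] * n
    for i in range(n):                      # forward substitution through D + L
        s = sum(A[i][j] * dx[j] for j in range(i))
        dx[i] = (r[i] - s) / A[i][i]
    return [x[i] + dx[i] for i in range(n)]

x = gs_residual_step([[4.0, 1.0], [1.0, 3.0]], [9.0, 7.0], [0.0, 0.0])
print(x)  # identical to the componentwise sweep started from x^(0) = 0
```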

90.13 Algorithm

Input:

A,\quad b,\quad x^{(0)},\quad \tau,\quad k_{\max}.

For

k=0,1,\ldots,k_{\max}-1:

update each component in order:

x_i \leftarrow \frac{1}{a_{ii}} \left( b_i - \sum_{j<i}a_{ij}x_j - \sum_{j>i}a_{ij}x_j \right).

Here the same vector x is overwritten in place.

After a full sweep, compute the residual

r=b-Ax.

Stop if

\frac{\|r\|}{\|b\|} \le \tau.

Return the current x.

90.14 Pseudocode

gauss_seidel(A, b, x, tol, max_iter):
    n = length(b)

    for k = 0 to max_iter - 1:
        for i = 1 to n:
            s1 = 0
            s2 = 0

            for j = 1 to i - 1:
                s1 = s1 + A[i,j] * x[j]

            for j = i + 1 to n:
                s2 = s2 + A[i,j] * x[j]

            x[i] = (b[i] - s1 - s2) / A[i,i]

        r = b - A * x

        if norm(r) / norm(b) <= tol:
            return x

    return x

Unlike Jacobi, the method updates x in place.

This is why Gauss-Seidel needs less vector storage but has less natural parallelism.

90.15 Storage Requirements

Gauss-Seidel needs:

| Object | Storage |
| --- | --- |
| Matrix A | problem-dependent |
| Right-hand side b | n |
| Current vector x | n |
| Residual (optional) | n |

Jacobi usually stores both x^{(k)} and x^{(k+1)}.

Gauss-Seidel can overwrite x, so it often needs one fewer vector.

For very large sparse systems, this memory difference may matter.

90.16 Computational Cost

For a dense n \times n matrix, one sweep costs

O(n^2).

For a sparse matrix, one sweep costs approximately

O(\operatorname{nnz}(A)).

The cost per iteration is similar to Jacobi.

The difference lies in convergence and parallelism.

Gauss-Seidel usually reduces error faster per sweep, but Jacobi is easier to parallelize.

90.17 Sparse Implementation

For a sparse matrix stored by rows, one Gauss-Seidel update has the form:

for i = 1 to n:
    s = 0
    diag = 0

    for each nonzero (j, aij) in row i:
        if j == i:
            diag = aij
        else:
            s = s + aij * x[j]

    x[i] = (b[i] - s) / diag

This works because the values x[j] for j<i have already been updated during the current sweep, while the values for j>i still come from the previous sweep.

The ordering of unknowns therefore affects the method.
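The same loop can be written in Python with each row stored as a list of (column, value) pairs, a stand-in for CSR row traversal (an illustrative sketch, not a tuned implementation):

```python
# One sparse sweep; each row is a list of (column, value) nonzero pairs.
def gs_sparse_sweep(rows, b, x):
    for i, row in enumerate(rows):
        s, diag = 0.0, 0.0
        for j, aij in row:
            if j == i:
                diag = aij          # remember the diagonal entry
            else:
                s += aij * x[j]     # x[j] is new for j < i, old for j > i
        x[i] = (b[i] - s) / diag
    return x

rows = [[(0, 4.0), (1, 1.0)], [(0, 1.0), (1, 3.0)]]   # the 2x2 example
x = gs_sparse_sweep(rows, [9.0, 7.0], [0.0, 0.0])
print(x)  # one sweep: same values as the dense version
```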

90.18 Dependence on Ordering

Gauss-Seidel depends on the order in which variables are updated.

Changing the order changes the splitting of A into lower and upper parts.

For some matrices, one ordering may converge faster than another.

Ordering matters especially for sparse matrices from grids or graphs.

Common orderings include:

| Ordering | Use |
| --- | --- |
| natural ordering | simplest implementation |
| red-black ordering | exposes parallelism on grids |
| bandwidth-reducing ordering | improves locality |
| graph coloring | parallel Gauss-Seidel variants |

This dependence is both a weakness and a tool.

90.19 Parallel Gauss-Seidel

Basic Gauss-Seidel is sequential because each update depends on previous updates in the same sweep.

Parallel versions are possible when variables can be grouped so that updates within a group do not depend on each other.

For example, red-black Gauss-Seidel splits grid points into two sets. All red points are updated first, then all black points.

Graph coloring generalizes this idea.

If two variables are not coupled by a nonzero matrix entry, they may be updated simultaneously.

90.20 Symmetric Gauss-Seidel

Symmetric Gauss-Seidel performs two sweeps:

  1. a forward sweep,
  2. a backward sweep.

The forward sweep uses the usual order:

1,2,\ldots,n.

The backward sweep uses the reverse order:

n,n-1,\ldots,1.

This produces a more symmetric operation.

Symmetric Gauss-Seidel is often used as a preconditioner, especially for symmetric systems.
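A forward sweep followed by a backward sweep can be sketched as follows (dense storage assumed):

```python
# Symmetric Gauss-Seidel: one forward sweep, then one backward sweep.
def symmetric_gs_sweep(A, b, x):
    n = len(b)
    for order in (range(n), range(n - 1, -1, -1)):   # forward, then backward
        for i in order:
            s = sum(A[i][j] * x[j] for j in range(n) if j != i)
            x[i] = (b[i] - s) / A[i][i]
    return x

x = symmetric_gs_sweep([[4.0, 1.0], [1.0, 3.0]], [9.0, 7.0], [0.0, 0.0])
print(x)  # after the backward pass, x_1 has been refreshed a second time
```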

90.21 Gauss-Seidel as Preconditioner

From the residual form,

x^{(k+1)} = x^{(k)}+(D+L)^{-1}r^{(k)}.

This suggests the preconditioner

P=D+L.

Applying the preconditioner means solving a lower triangular system.

This is more expensive than Jacobi preconditioning but may be stronger because it includes lower-triangular coupling.

For symmetric methods, symmetric Gauss-Seidel or SSOR is often preferred.

90.22 Relation to Coordinate Descent

For symmetric positive definite A, solving

Ax=b

is equivalent to minimizing the quadratic function

\phi(x)=\frac{1}{2}x^TAx-b^Tx.

Gauss-Seidel may be interpreted as exact coordinate minimization of this quadratic.

Each update chooses one coordinate x_i while holding the others fixed, minimizing \phi along that coordinate direction.

This interpretation explains why Gauss-Seidel converges for symmetric positive definite systems.
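This can be observed numerically: each sweep never increases the quadratic, and on the SPD example from Section 90.4 it decreases toward the minimum. A small sketch:

```python
# phi(x) = 0.5 x^T A x - b^T x decreases under Gauss-Seidel sweeps
# when A is SPD (dense sketch on the 2x2 example).
def phi(A, b, x):
    n = len(b)
    quad = sum(x[i] * A[i][j] * x[j] for i in range(n) for j in range(n))
    return 0.5 * quad - sum(b[i] * x[i] for i in range(n))

def sweep(A, b, x):
    n = len(b)
    for i in range(n):
        s = sum(A[i][j] * x[j] for j in range(n) if j != i)
        x[i] = (b[i] - s) / A[i][i]
    return x

A, b = [[4.0, 1.0], [1.0, 3.0]], [9.0, 7.0]
x = [0.0, 0.0]
vals = [phi(A, b, x)]
for _ in range(3):
    vals.append(phi(A, b, sweep(A, b, x)))
print(vals)  # decreasing toward phi evaluated at the exact solution
```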

90.23 Failure Modes

Gauss-Seidel may fail or perform poorly for several reasons.

| Failure mode | Cause |
| --- | --- |
| Divergence | \rho(G_{GS})\ge 1 |
| Slow convergence | spectral radius close to 1 |
| Division by zero | zero diagonal entry |
| Poor ordering | unfavorable triangular splitting |
| Stagnation | ill-conditioning or weak smoothing |
| Misleading residual | ill-conditioned matrix |

A robust implementation should check diagonal entries and enforce a maximum iteration count.

90.24 Residual Versus Error

As with all iterative solvers, the residual does not directly equal the error.

For

r^{(k)}=b-Ax^{(k)}

and

e^{(k)}=x-x^{(k)},

we have

Ae^{(k)}=r^{(k)}.

If A is invertible,

e^{(k)}=A^{-1}r^{(k)}.

Thus

\|e^{(k)}\| \le \|A^{-1}\|\|r^{(k)}\|.

If A is ill-conditioned, a small residual may still correspond to a significant solution error.
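A small numeric illustration of this gap (the matrix and numbers are chosen only for demonstration): an ill-conditioned diagonal matrix for which the residual of a bad approximation is tiny while the error is not.

```python
# Tiny residual, large error: e = A^{-1} r can be far larger than r.
A = [[1.0, 0.0], [0.0, 1e-6]]          # ill-conditioned diagonal matrix
b = [1.0, 1e-6]                        # exact solution x = (1, 1)
x_approx = [1.0, 0.0]                  # badly wrong in the second component

r = [b[i] - sum(A[i][j] * x_approx[j] for j in range(2)) for i in range(2)]
e = [1.0 - x_approx[0], 1.0 - x_approx[1]]

print(r)  # residual (0, 1e-6): tiny
print(e)  # error (0, 1): large
```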

90.25 Practical Use

Gauss-Seidel is simple, but it is rarely the best standalone method for large difficult systems.

It remains useful because it is:

| Role | Reason |
| --- | --- |
| teaching method | shows sequential stationary iteration |
| smoother | useful in multigrid |
| preconditioner | cheap triangular correction |
| baseline solver | simple reference method |
| local relaxation method | natural for grid equations |

In modern large-scale linear algebra, Gauss-Seidel is often used inside larger algorithms rather than as the final solver.

90.26 Summary

The Gauss-Seidel method solves

Ax=b

by the iteration

x_i^{(k+1)} = \frac{1}{a_{ii}} \left( b_i - \sum_{j<i}a_{ij}x_j^{(k+1)} - \sum_{j>i}a_{ij}x_j^{(k)} \right).

In matrix form,

(D+L)x^{(k+1)}=-Ux^{(k)}+b.

The iteration matrix is

G_{GS}=-(D+L)^{-1}U.

The method converges for every initial guess exactly when

\rho(G_{GS})<1.

Strict diagonal dominance and symmetric positive definiteness are important sufficient conditions for convergence. Gauss-Seidel often converges faster than Jacobi, but it is more sequential and depends on variable ordering.