
Chapter 90. Gauss-Seidel Method

The Gauss-Seidel method is a stationary iterative method for solving a linear system

Ax = b.

It is closely related to the Jacobi method. Both methods solve each equation for one unknown and repeat the process. The difference is that Gauss-Seidel uses the newest available values immediately. When the method computes x_i^{(k+1)}, it uses the already updated values x_1^{(k+1)}, \ldots, x_{i-1}^{(k+1)} and the old values x_{i+1}^{(k)}, \ldots, x_n^{(k)}.

This small change often improves convergence. It also makes the method less naturally parallel than Jacobi.

Gauss-Seidel is guaranteed to converge for important matrix classes, including strictly diagonally dominant matrices and symmetric positive definite matrices. More generally, convergence is controlled by the spectral radius of its iteration matrix.

90.1 The Basic System

Let

A= \begin{bmatrix} a_{11} & a_{12} & \cdots & a_{1n}\\ a_{21} & a_{22} & \cdots & a_{2n}\\ \vdots & \vdots & \ddots & \vdots\\ a_{n1} & a_{n2} & \cdots & a_{nn} \end{bmatrix}, \qquad x= \begin{bmatrix} x_1\\ x_2\\ \vdots\\ x_n \end{bmatrix}, \qquad b= \begin{bmatrix} b_1\\ b_2\\ \vdots\\ b_n \end{bmatrix}.

The system

Ax = b

means

a_{i1}x_1+a_{i2}x_2+\cdots+a_{in}x_n=b_i, \qquad i=1,\ldots,n.

Assume each diagonal entry is nonzero:

a_{ii}\ne 0.

Solving the i-th equation for x_i gives

x_i = \frac{1}{a_{ii}} \left( b_i-\sum_{j<i}a_{ij}x_j-\sum_{j>i}a_{ij}x_j \right).

The Gauss-Seidel method uses this identity as an update rule.

90.2 Gauss-Seidel Update Formula

Starting with an initial guess

x^{(0)},

the Gauss-Seidel method defines

x_i^{(k+1)} = \frac{1}{a_{ii}} \left( b_i - \sum_{j<i}a_{ij}x_j^{(k+1)} - \sum_{j>i}a_{ij}x_j^{(k)} \right), \qquad i=1,\ldots,n.

The first sum uses new values from the current iteration.

The second sum uses old values from the previous iteration.

This is the defining feature of Gauss-Seidel.

90.3 Comparison with Jacobi

Jacobi computes all new components from old components:

x_i^{(k+1)} = \frac{1}{a_{ii}} \left( b_i-\sum_{j\ne i}a_{ij}x_j^{(k)} \right).

Gauss-Seidel computes components sequentially and immediately reuses updated values:

x_i^{(k+1)} = \frac{1}{a_{ii}} \left( b_i - \sum_{j<i}a_{ij}x_j^{(k+1)} - \sum_{j>i}a_{ij}x_j^{(k)} \right).

| Feature | Jacobi | Gauss-Seidel |
| --- | --- | --- |
| Update style | simultaneous | sequential |
| Uses newest values | no | yes |
| Needs separate new vector | yes | no |
| Parallelism | high | lower |
| Typical convergence | slower | faster |
| Iteration matrix | -D^{-1}(L+U) | -(D+L)^{-1}U |

Gauss-Seidel may be viewed as a Jacobi-like method with immediate feedback.

90.4 Example

Consider

\begin{aligned} 4x_1+x_2 &= 9,\\ x_1+3x_2 &= 7. \end{aligned}

Solving each equation for its diagonal variable gives

x_1=\frac{9-x_2}{4}, \qquad x_2=\frac{7-x_1}{3}.

The Gauss-Seidel iteration is

x_1^{(k+1)} = \frac{9-x_2^{(k)}}{4}, \qquad x_2^{(k+1)} = \frac{7-x_1^{(k+1)}}{3}.

Start with

x^{(0)} = \begin{bmatrix} 0\\ 0 \end{bmatrix}.

First update:

x_1^{(1)}=\frac{9-0}{4}=2.25.

Then use this new value immediately:

x_2^{(1)}=\frac{7-2.25}{3} = 1.5833\ldots

Thus

x^{(1)} = \begin{bmatrix} 2.25\\ 1.5833 \end{bmatrix}.

Second update:

x_1^{(2)} = \frac{9-1.5833}{4} = 1.8542\ldots, \qquad x_2^{(2)} = \frac{7-1.8542}{3} = 1.7153\ldots

The exact solution is

x= \begin{bmatrix} 2\\ 5/3 \end{bmatrix}.

The iterates move toward the exact solution.
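These two sweeps can be verified numerically. The following Python sketch simply hard-codes the two solved equations of this example:

```python
# Numerical check of the worked 2x2 example (a small sketch, not library code).
def gs_step(x1, x2):
    x1 = (9 - x2) / 4          # solve the first equation for x1
    x2 = (7 - x1) / 3          # reuse the new x1 immediately
    return x1, x2

x1, x2 = 0.0, 0.0              # x^(0) = (0, 0)
x1, x2 = gs_step(x1, x2)       # x^(1) ≈ (2.25, 1.5833)
print(x1, x2)
x1, x2 = gs_step(x1, x2)       # x^(2) ≈ (1.8542, 1.7153)
print(x1, x2)
```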

90.5 Matrix Splitting

Write

A = D + L + U,

where:

| Symbol | Meaning |
| --- | --- |
| D | diagonal part of A |
| L | strictly lower triangular part of A |
| U | strictly upper triangular part of A |

Then

Ax=b

becomes

(D+L+U)x=b.

Move the strictly upper triangular part to the right:

(D+L)x=-Ux+b.

The Gauss-Seidel iteration is

(D+L)x^{(k+1)} = -Ux^{(k)}+b.

Since D+L is lower triangular, each iteration requires one triangular solve.
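One iteration in this split form amounts to forward substitution through D+L with -Ux^{(k)}+b on the right-hand side. A minimal Python sketch (dense storage as plain lists) makes this explicit:

```python
# One Gauss-Seidel iteration written as the triangular solve
# (D + L) x_new = -U x_old + b (dense rows as plain lists; a sketch).
def gs_matrix_form(A, b, x_old):
    n = len(b)
    x_new = [0.0] * n
    for i in range(n):
        # right-hand side entry: b_i minus the strictly upper part times x_old
        rhs = b[i] - sum(A[i][j] * x_old[j] for j in range(i + 1, n))
        # forward substitution through D + L uses already computed x_new entries
        rhs -= sum(A[i][j] * x_new[j] for j in range(i))
        x_new[i] = rhs / A[i][i]
    return x_new

A = [[4.0, 1.0], [1.0, 3.0]]
b = [9.0, 7.0]
print(gs_matrix_form(A, b, [0.0, 0.0]))  # same result as the componentwise sweep
```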

90.6 Fixed-Point Form

The iteration may be written as

x^{(k+1)} = -(D+L)^{-1}Ux^{(k)} + (D+L)^{-1}b.

Thus the iteration matrix is

G_{GS}=-(D+L)^{-1}U.

The fixed-point form is

x^{(k+1)}=G_{GS}x^{(k)}+c,

where

c=(D+L)^{-1}b.

If the sequence converges to a vector x, then

x=G_{GS}x+c.

Multiplying through by D+L, we recover

(D+L+U)x=b.

Hence the limit solves the original system.

90.7 Error Recurrence

Let x be the exact solution. Since

x=G_{GS}x+c

and

x^{(k+1)}=G_{GS}x^{(k)}+c,

subtracting gives

x-x^{(k+1)} = G_{GS}(x-x^{(k)}).

With

e^{(k)}=x-x^{(k)},

we obtain

e^{(k+1)}=G_{GS}e^{(k)}.

Therefore,

e^{(k)}=G_{GS}^{k}e^{(0)}.

Convergence depends on whether powers of G_{GS} approach zero.

90.8 Convergence Criterion

Gauss-Seidel converges for every initial guess if and only if

\rho(G_{GS})<1,

where \rho(G_{GS}) is the spectral radius of the iteration matrix. For a general splitting A=M-N, the corresponding stationary iteration converges when the spectral radius of M^{-1}N is less than one.

For Gauss-Seidel,

M=D+L, \qquad N=-U,

so

M^{-1}N=-(D+L)^{-1}U.

Thus the convergence condition becomes

\rho\left(-(D+L)^{-1}U\right)<1.
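For the 2x2 example of Section 90.4 the spectral radius of G_{GS} can be computed directly. The sketch below stays dependency-free by using explicit 2x2 algebra (trace and determinant, real eigenvalues assumed for this example); with NumPy one would instead call np.linalg.eigvals:

```python
# Spectral-radius check of G_GS = -(D+L)^{-1} U for the 2x2 example.
A = [[4.0, 1.0], [1.0, 3.0]]

# For 2x2, (D+L) = [[a,0],[c,d]] has inverse [[1/a,0],[-c/(a*d),1/d]].
a, d, c = A[0][0], A[1][1], A[1][0]
inv_DL = [[1/a, 0.0], [-c/(a*d), 1/d]]

U = [[0.0, A[0][1]], [0.0, 0.0]]  # strictly upper triangular part

# G = -(D+L)^{-1} U
G = [[-sum(inv_DL[i][k] * U[k][j] for k in range(2)) for j in range(2)]
     for i in range(2)]

# Eigenvalues of a 2x2 matrix from trace and determinant.
tr = G[0][0] + G[1][1]
det = G[0][0] * G[1][1] - G[0][1] * G[1][0]
disc = (tr * tr - 4 * det) ** 0.5
rho = max(abs((tr + disc) / 2), abs((tr - disc) / 2))
print(rho)  # well below 1, so Gauss-Seidel converges for this system
```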

90.9 Strict Diagonal Dominance

A matrix is strictly row diagonally dominant if

|a_{ii}| > \sum_{j\ne i}|a_{ij}|

for every row i.

If A is strictly diagonally dominant, then the Gauss-Seidel method converges. This is a standard sufficient condition.

Strict diagonal dominance means that each equation is controlled mainly by its own variable. The off-diagonal coupling is not strong enough to prevent contraction of the iteration error.

This condition is sufficient, but not necessary. Gauss-Seidel may converge even when strict diagonal dominance fails.
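A row-wise check of strict diagonal dominance is straightforward. The helper below is an illustrative sketch:

```python
# Check |a_ii| > sum_{j != i} |a_ij| for every row (strict row dominance).
def is_strictly_diagonally_dominant(A):
    return all(
        abs(row[i]) > sum(abs(v) for j, v in enumerate(row) if j != i)
        for i, row in enumerate(A)
    )

print(is_strictly_diagonally_dominant([[4.0, 1.0], [1.0, 3.0]]))  # True
print(is_strictly_diagonally_dominant([[1.0, 2.0], [2.0, 1.0]]))  # False
```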

90.10 Symmetric Positive Definite Matrices

Another important convergence result applies to symmetric positive definite matrices.

If A is symmetric positive definite, then Gauss-Seidel converges.

This class appears frequently in applications, including:

| Source | Matrix type |
| --- | --- |
| finite difference discretizations | sparse SPD matrices |
| finite element methods | sparse SPD matrices |
| quadratic minimization | Hessian matrices |
| graph Laplacians with constraints | positive definite reductions |

The SPD case is especially important because it also supports conjugate gradient methods.

90.11 Geometric Interpretation

Gauss-Seidel may be interpreted as solving one coordinate direction at a time.

At each step, the method adjusts x_i so that the i-th equation is satisfied using the newest known values.

After updating x_1, the first equation is satisfied relative to the current state. After updating x_2, the second equation is satisfied relative to the updated x_1, and so on.

The method sweeps through the equations. Each sweep reduces error when the coupling structure is favorable.

90.12 Residual Form

The residual at step k is

r^{(k)}=b-Ax^{(k)}.

Gauss-Seidel can be interpreted as applying a lower-triangular correction.

From

(D+L)x^{(k+1)} = -Ux^{(k)}+b,

subtract

(D+L)x^{(k)}

from both sides:

(D+L)(x^{(k+1)}-x^{(k)}) = b-(D+L+U)x^{(k)}.

Thus

(D+L)(x^{(k+1)}-x^{(k)}) = r^{(k)}.

Therefore,

x^{(k+1)} = x^{(k)}+(D+L)^{-1}r^{(k)}.

Gauss-Seidel updates the current approximation by applying a lower-triangular solve to the residual.
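The following sketch performs one sweep in exactly this residual-correction form: compute the residual, solve the lower triangular system for the correction by forward substitution, then update the approximation:

```python
# Residual-correction form of one sweep: solve (D+L) dx = r by forward
# substitution, then add the correction (dense rows as lists; a sketch).
def gs_residual_step(A, b, x):
    n = len(b)
    r = [b[i] - sum(A[i][j] * x[j] for j in range(n)) for i in range(n)]
    dx = [0.0] * n
    for i in range(n):                      # forward substitution through D + L
        s = sum(A[i][j] * dx[j] for j in range(i))
        dx[i] = (r[i] - s) / A[i][i]
    return [x[i] + dx[i] for i in range(n)]

x = gs_residual_step([[4.0, 1.0], [1.0, 3.0]], [9.0, 7.0], [0.0, 0.0])
print(x)  # identical to the componentwise sweep started from x^(0) = 0
```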

90.13 Algorithm

Input:

A,\quad b,\quad x^{(0)},\quad \tau,\quad k_{\max}.

For

k=0,1,\ldots,k_{\max}-1:

update each component in order:

x_i \leftarrow \frac{1}{a_{ii}} \left( b_i - \sum_{j<i}a_{ij}x_j - \sum_{j>i}a_{ij}x_j \right).

Here the same vector x is overwritten in place.

After a full sweep, compute the residual

r=b-Ax.

Stop if

\frac{\|r\|}{\|b\|} \le \tau.

Return the current x.

90.14 Pseudocode

gauss_seidel(A, b, x, tol, max_iter):
    n = length(b)

    for k = 0 to max_iter - 1:
        for i = 1 to n:
            s1 = 0
            s2 = 0

            for j = 1 to i - 1:
                s1 = s1 + A[i,j] * x[j]

            for j = i + 1 to n:
                s2 = s2 + A[i,j] * x[j]

            x[i] = (b[i] - s1 - s2) / A[i,i]

        r = b - A * x

        if norm(r) / norm(b) <= tol:
            return x

    return x

Unlike Jacobi, the method updates x in place.

This is why Gauss-Seidel needs less vector storage but has less natural parallelism.

90.15 Storage Requirements

Gauss-Seidel needs:

| Object | Storage |
| --- | --- |
| Matrix A | problem-dependent |
| Right-hand side b | n |
| Current vector x | n |
| Residual (optional) | n |

Jacobi usually stores both x^{(k)} and x^{(k+1)}.

Gauss-Seidel can overwrite x, so it often needs one fewer vector.

For very large sparse systems, this memory difference may matter.

90.16 Computational Cost

For a dense n \times n matrix, one sweep costs

O(n^2).

For a sparse matrix, one sweep costs approximately

O(\operatorname{nnz}(A)).

The cost per iteration is similar to Jacobi.

The difference lies in convergence and parallelism.

Gauss-Seidel usually reduces error faster per sweep, but Jacobi is easier to parallelize.

90.17 Sparse Implementation

For a sparse matrix stored by rows, one Gauss-Seidel update has the form:

for i = 1 to n:
    s = 0
    diag = 0

    for each nonzero (j, aij) in row i:
        if j == i:
            diag = aij
        else:
            s = s + aij * x[j]

    x[i] = (b[i] - s) / diag

This works because the values x[j] for j<i have already been updated during the current sweep, while the values for j>i still come from the previous sweep.

The ordering of unknowns therefore affects the method.
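The same loop can be written in Python with each row stored as a list of (column, value) pairs, a stand-in for CSR row traversal (an illustrative sketch, not a tuned implementation):

```python
# One sparse sweep; each row is a list of (column, value) nonzero pairs.
def gs_sparse_sweep(rows, b, x):
    for i, row in enumerate(rows):
        s, diag = 0.0, 0.0
        for j, aij in row:
            if j == i:
                diag = aij          # remember the diagonal entry
            else:
                s += aij * x[j]     # x[j] is new for j < i, old for j > i
        x[i] = (b[i] - s) / diag
    return x

rows = [[(0, 4.0), (1, 1.0)], [(0, 1.0), (1, 3.0)]]   # the 2x2 example
x = gs_sparse_sweep(rows, [9.0, 7.0], [0.0, 0.0])
print(x)  # one sweep: same values as the dense version
```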

90.18 Dependence on Ordering

Gauss-Seidel depends on the order in which variables are updated.

Changing the order changes the splitting of A into lower and upper parts.

For some matrices, one ordering may converge faster than another.

Ordering matters especially for sparse matrices from grids or graphs.

Common orderings include:

| Ordering | Use |
| --- | --- |
| natural ordering | simplest implementation |
| red-black ordering | exposes parallelism on grids |
| bandwidth-reducing ordering | improves locality |
| graph coloring | parallel Gauss-Seidel variants |

This dependence is both a weakness and a tool.

90.19 Parallel Gauss-Seidel

Basic Gauss-Seidel is sequential because each update depends on previous updates in the same sweep.

Parallel versions are possible when variables can be grouped so that updates within a group do not depend on each other.

For example, red-black Gauss-Seidel splits grid points into two sets. All red points are updated first, then all black points.

Graph coloring generalizes this idea.

If two variables are not coupled by a nonzero matrix entry, they may be updated simultaneously.

90.20 Symmetric Gauss-Seidel

Symmetric Gauss-Seidel performs two sweeps:

  1. a forward sweep,
  2. a backward sweep.

The forward sweep uses the usual order:

1,2,\ldots,n.

The backward sweep uses the reverse order:

n,n-1,\ldots,1.

This produces a more symmetric operation.

Symmetric Gauss-Seidel is often used as a preconditioner, especially for symmetric systems.
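A forward sweep followed by a backward sweep can be sketched as follows (dense storage assumed):

```python
# Symmetric Gauss-Seidel: one forward sweep, then one backward sweep.
def symmetric_gs_sweep(A, b, x):
    n = len(b)
    for order in (range(n), range(n - 1, -1, -1)):   # forward, then backward
        for i in order:
            s = sum(A[i][j] * x[j] for j in range(n) if j != i)
            x[i] = (b[i] - s) / A[i][i]
    return x

x = symmetric_gs_sweep([[4.0, 1.0], [1.0, 3.0]], [9.0, 7.0], [0.0, 0.0])
print(x)  # after the backward pass, x_1 has been refreshed a second time
```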

90.21 Gauss-Seidel as Preconditioner

From the residual form,

x^{(k+1)} = x^{(k)}+(D+L)^{-1}r^{(k)}.

This suggests the preconditioner

P=D+L.

Applying the preconditioner means solving a lower triangular system.

This is more expensive than Jacobi preconditioning but may be stronger because it includes lower-triangular coupling.

For symmetric methods, symmetric Gauss-Seidel or SSOR is often preferred.

90.22 Relation to Coordinate Descent

For symmetric positive definite A, solving

Ax=b

is equivalent to minimizing the quadratic function

\phi(x)=\frac{1}{2}x^TAx-b^Tx.

Gauss-Seidel may be interpreted as exact coordinate minimization of this quadratic.

Each update chooses one coordinate x_i while holding the others fixed, minimizing \phi along that coordinate direction.

This interpretation explains why Gauss-Seidel converges for symmetric positive definite systems.
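This can be observed numerically: each sweep never increases the quadratic, and on the SPD example from Section 90.4 it decreases toward the minimum. A small sketch:

```python
# phi(x) = 0.5 x^T A x - b^T x decreases under Gauss-Seidel sweeps
# when A is SPD (dense sketch on the 2x2 example).
def phi(A, b, x):
    n = len(b)
    quad = sum(x[i] * A[i][j] * x[j] for i in range(n) for j in range(n))
    return 0.5 * quad - sum(b[i] * x[i] for i in range(n))

def sweep(A, b, x):
    n = len(b)
    for i in range(n):
        s = sum(A[i][j] * x[j] for j in range(n) if j != i)
        x[i] = (b[i] - s) / A[i][i]
    return x

A, b = [[4.0, 1.0], [1.0, 3.0]], [9.0, 7.0]
x = [0.0, 0.0]
vals = [phi(A, b, x)]
for _ in range(3):
    vals.append(phi(A, b, sweep(A, b, x)))
print(vals)  # decreasing toward phi evaluated at the exact solution
```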

90.23 Failure Modes

Gauss-Seidel may fail or perform poorly for several reasons.

| Failure mode | Cause |
| --- | --- |
| Divergence | \rho(G_{GS})\ge 1 |
| Slow convergence | spectral radius close to 1 |
| Division by zero | zero diagonal entry |
| Poor ordering | unfavorable triangular splitting |
| Stagnation | ill-conditioning or weak smoothing |
| Misleading residual | ill-conditioned matrix |

A robust implementation should check diagonal entries and enforce a maximum iteration count.

90.24 Residual Versus Error

As with all iterative solvers, the residual does not directly equal the error.

For

r^{(k)}=b-Ax^{(k)}

and

e^{(k)}=x-x^{(k)},

we have

Ae^{(k)}=r^{(k)}.

If A is invertible,

e^{(k)}=A^{-1}r^{(k)}.

Thus

\|e^{(k)}\| \le \|A^{-1}\|\|r^{(k)}\|.

If A is ill-conditioned, a small residual may still correspond to a significant solution error.
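A small numeric illustration of this gap (the matrix and numbers are chosen only for demonstration): an ill-conditioned diagonal matrix for which the residual of a bad approximation is tiny while the error is not.

```python
# Tiny residual, large error: e = A^{-1} r can be far larger than r.
A = [[1.0, 0.0], [0.0, 1e-6]]          # ill-conditioned diagonal matrix
b = [1.0, 1e-6]                        # exact solution x = (1, 1)
x_approx = [1.0, 0.0]                  # badly wrong in the second component

r = [b[i] - sum(A[i][j] * x_approx[j] for j in range(2)) for i in range(2)]
e = [1.0 - x_approx[0], 1.0 - x_approx[1]]

print(r)  # residual (0, 1e-6): tiny
print(e)  # error (0, 1): large
```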

90.25 Practical Use

Gauss-Seidel is simple, but it is rarely the best standalone method for large difficult systems.

It remains useful because it is:

| Role | Reason |
| --- | --- |
| teaching method | shows sequential stationary iteration |
| smoother | useful in multigrid |
| preconditioner | cheap triangular correction |
| baseline solver | simple reference method |
| local relaxation method | natural for grid equations |

In modern large-scale linear algebra, Gauss-Seidel is often used inside larger algorithms rather than as the final solver.

90.26 Summary

The Gauss-Seidel method solves

Ax=b

by the iteration

x_i^{(k+1)} = \frac{1}{a_{ii}} \left( b_i - \sum_{j<i}a_{ij}x_j^{(k+1)} - \sum_{j>i}a_{ij}x_j^{(k)} \right).

In matrix form,

(D+L)x^{(k+1)}=-Ux^{(k)}+b.

The iteration matrix is

G_{GS}=-(D+L)^{-1}U.

The method converges for every initial guess exactly when

\rho(G_{GS})<1.

Strict diagonal dominance and symmetric positive definiteness are important sufficient conditions for convergence. Gauss-Seidel often converges faster than Jacobi, but it is more sequential and depends on variable ordering.