# Chapter 90. Gauss-Seidel Method

The Gauss-Seidel method is a stationary iterative method for solving a linear system

$$
Ax=b.
$$

It is closely related to the Jacobi method. Both methods solve each equation for one unknown and repeat the process. The difference is that Gauss-Seidel uses the newest available values immediately. When the method computes \(x_i^{(k+1)}\), it uses already updated values \(x_1^{(k+1)}, \ldots, x_{i-1}^{(k+1)}\) and old values \(x_{i+1}^{(k)}, \ldots, x_n^{(k)}\).

This small change often improves convergence. It also makes the method less naturally parallel than Jacobi.

Gauss-Seidel is guaranteed to converge for important matrix classes, including strictly diagonally dominant matrices and symmetric positive definite matrices. More generally, convergence is controlled by the spectral radius of its iteration matrix.

## 90.1 The Basic System

Let

$$
A=
\begin{bmatrix}
a_{11} & a_{12} & \cdots & a_{1n}\\
a_{21} & a_{22} & \cdots & a_{2n}\\
\vdots & \vdots & \ddots & \vdots\\
a_{n1} & a_{n2} & \cdots & a_{nn}
\end{bmatrix},
\qquad
x=
\begin{bmatrix}
x_1\\
x_2\\
\vdots\\
x_n
\end{bmatrix},
\qquad
b=
\begin{bmatrix}
b_1\\
b_2\\
\vdots\\
b_n
\end{bmatrix}.
$$

The system

$$
Ax=b
$$

means

$$
a_{i1}x_1+a_{i2}x_2+\cdots+a_{in}x_n=b_i,
\qquad i=1,\ldots,n.
$$

Assume each diagonal entry is nonzero:

$$
a_{ii}\ne 0.
$$

Solving the \(i\)-th equation for \(x_i\) gives

$$
x_i =
\frac{1}{a_{ii}}
\left(
b_i-\sum_{j<i}a_{ij}x_j-\sum_{j>i}a_{ij}x_j
\right).
$$

The Gauss-Seidel method uses this identity as an update rule.

## 90.2 Gauss-Seidel Update Formula

Starting with an initial guess

$$
x^{(0)},
$$

the Gauss-Seidel method defines

$$
x_i^{(k+1)} =
\frac{1}{a_{ii}}
\left(
b_i -
\sum_{j<i}a_{ij}x_j^{(k+1)} -
\sum_{j>i}a_{ij}x_j^{(k)}
\right),
\qquad i=1,\ldots,n.
$$

The first sum uses new values from the current iteration.

The second sum uses old values from the previous iteration.

This is the defining feature of Gauss-Seidel.
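
To make the update concrete, the following minimal NumPy sketch performs one sweep over all components in place; the function name and interface are illustrative, not fixed by the method:

```python
import numpy as np

def gauss_seidel_sweep(A, b, x):
    """One in-place Gauss-Seidel sweep; x is overwritten component by component."""
    n = len(b)
    for i in range(n):
        # x[:i] already holds values from the current iteration (k+1),
        # x[i+1:] still holds values from the previous iteration (k).
        s_new = A[i, :i] @ x[:i]
        s_old = A[i, i+1:] @ x[i+1:]
        x[i] = (b[i] - s_new - s_old) / A[i, i]
    return x
```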

## 90.3 Comparison with Jacobi

Jacobi computes all new components from old components:

$$
x_i^{(k+1)} =
\frac{1}{a_{ii}}
\left(
b_i-\sum_{j\ne i}a_{ij}x_j^{(k)}
\right).
$$

Gauss-Seidel computes components sequentially and immediately reuses updated values:

$$
x_i^{(k+1)} =
\frac{1}{a_{ii}}
\left(
b_i -
\sum_{j<i}a_{ij}x_j^{(k+1)} -
\sum_{j>i}a_{ij}x_j^{(k)}
\right).
$$

| Feature | Jacobi | Gauss-Seidel |
|---|---|---|
| Update style | simultaneous | sequential |
| Uses newest values | no | yes |
| Needs separate new vector | yes | no |
| Parallelism | high | lower |
| Typical convergence | slower | faster |
| Iteration matrix | \(-D^{-1}(L+U)\) | \(-(D+L)^{-1}U\) |

Gauss-Seidel may be viewed as a Jacobi-like method with immediate feedback.
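
For contrast, a Jacobi sweep under the same conventions must keep a copy of the previous iterate, since no component may see a new value; again a sketch, not code from a particular library:

```python
import numpy as np

def jacobi_sweep(A, b, x):
    """One Jacobi sweep; every component reads only the previous iterate."""
    x_old = x.copy()  # Jacobi needs the full old vector
    for i in range(len(b)):
        s = A[i, :] @ x_old - A[i, i] * x_old[i]  # sum over j != i
        x[i] = (b[i] - s) / A[i, i]
    return x
```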

## 90.4 Example

Consider

$$
\begin{aligned}
4x_1+x_2 &= 9,\\
x_1+3x_2 &= 7.
\end{aligned}
$$

Solving each equation for its diagonal variable gives

$$
x_1=\frac{9-x_2}{4},
\qquad
x_2=\frac{7-x_1}{3}.
$$

The Gauss-Seidel iteration is

$$
x_1^{(k+1)} =
\frac{9-x_2^{(k)}}{4},
$$

$$
x_2^{(k+1)} =
\frac{7-x_1^{(k+1)}}{3}.
$$

Start with

$$
x^{(0)} =
\begin{bmatrix}
0\\
0
\end{bmatrix}.
$$

First update:

$$
x_1^{(1)}=\frac{9-0}{4}=2.25.
$$

Then use this new value immediately:

$$
x_2^{(1)}=\frac{7-2.25}{3} =
1.5833\ldots.
$$

Thus

$$
x^{(1)} =
\begin{bmatrix}
2.25\\
1.5833
\end{bmatrix}.
$$

Second update:

$$
x_1^{(2)} =
\frac{9-1.5833}{4} =
1.8542\ldots,
$$

$$
x_2^{(2)} =
\frac{7-1.8542}{3} =
1.7153\ldots.
$$

The exact solution is

$$
x=
\begin{bmatrix}
20/11\\
19/11
\end{bmatrix}
\approx
\begin{bmatrix}
1.8182\\
1.7273
\end{bmatrix}.
$$

The iterates move toward the exact solution.
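
The following short script (a sketch, assuming NumPy) reproduces these iterates numerically:

```python
import numpy as np

x = np.zeros(2)
for k in range(5):
    x[0] = (9 - x[1]) / 4  # uses x2 from the previous sweep
    x[1] = (7 - x[0]) / 3  # immediately reuses the new x1
    print(k + 1, x)
# Prints 2.25, 1.5833..., then 1.8542..., 1.7153..., approaching (20/11, 19/11).
```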

## 90.5 Matrix Splitting

Write

$$
A=D+L+U,
$$

where:

| Symbol | Meaning |
|---|---|
| \(D\) | diagonal part of \(A\) |
| \(L\) | strictly lower triangular part of \(A\) |
| \(U\) | strictly upper triangular part of \(A\) |

Then

$$
Ax=b
$$

becomes

$$
(D+L+U)x=b.
$$

Move the strictly upper triangular part to the right:

$$
(D+L)x=-Ux+b.
$$

The Gauss-Seidel iteration is

$$
(D+L)x^{(k+1)} =
-Ux^{(k)}+b.
$$

Since \(D+L\) is lower triangular, each iteration requires one triangular solve.
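
In code, the splitting view gives a one-line step built from a triangular solve; the sketch below assumes dense NumPy/SciPy arrays and an illustrative function name:

```python
import numpy as np
from scipy.linalg import solve_triangular

def gs_step_splitting(A, b, x):
    """One Gauss-Seidel step via (D+L) x_new = -U x + b."""
    DL = np.tril(A)      # D + L: lower triangle including the diagonal
    U = np.triu(A, k=1)  # strictly upper triangular part
    return solve_triangular(DL, b - U @ x, lower=True)
```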

## 90.6 Fixed-Point Form

The iteration may be written as

$$
x^{(k+1)} =
-(D+L)^{-1}Ux^{(k)}
+
(D+L)^{-1}b.
$$

Thus the iteration matrix is

$$
G_{GS}=-(D+L)^{-1}U.
$$

The fixed-point form is

$$
x^{(k+1)}=G_{GS}x^{(k)}+c,
$$

where

$$
c=(D+L)^{-1}b.
$$

If the sequence converges to a vector \(x\), then

$$
x=G_{GS}x+c.
$$

Multiplying by \(D+L\), we recover

$$
(D+L+U)x=b.
$$

Hence the limit solves the original system.

## 90.7 Error Recurrence

Let \(x\) be the exact solution. Since

$$
x=G_{GS}x+c
$$

and

$$
x^{(k+1)}=G_{GS}x^{(k)}+c,
$$

subtracting gives

$$
x-x^{(k+1)} =
G_{GS}(x-x^{(k)}).
$$

With

$$
e^{(k)}=x-x^{(k)},
$$

we obtain

$$
e^{(k+1)}=G_{GS}e^{(k)}.
$$

Therefore,

$$
e^{(k)}=G_{GS}^{k}e^{(0)}.
$$

Convergence depends on whether powers of \(G_{GS}\) approach zero.

## 90.8 Convergence Criterion

Gauss-Seidel converges for every initial guess if and only if

$$
\rho(G_{GS})<1,
$$

where \(\rho(G_{GS})\) is the spectral radius of the iteration matrix. More generally, for a splitting \(A=M-N\) with invertible \(M\), the stationary iteration \(Mx^{(k+1)}=Nx^{(k)}+b\) converges for every initial guess exactly when \(\rho(M^{-1}N)<1\).

For Gauss-Seidel,

$$
M=D+L,
\qquad
N=-U,
$$

so

$$
M^{-1}N=-(D+L)^{-1}U.
$$

Thus the convergence condition becomes

$$
\rho\left(-(D+L)^{-1}U\right)<1.
$$
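
The spectral radius can be checked numerically for small matrices; the sketch below (assuming NumPy) forms \(G_{GS}\) explicitly, which is viable only for modest \(n\):

```python
import numpy as np

def gs_spectral_radius(A):
    """Spectral radius of the Gauss-Seidel iteration matrix -(D+L)^{-1} U."""
    G = -np.linalg.solve(np.tril(A), np.triu(A, k=1))
    return max(abs(np.linalg.eigvals(G)))

A = np.array([[4.0, 1.0], [1.0, 3.0]])
print(gs_spectral_radius(A))  # 1/12, about 0.083, far below 1
```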

## 90.9 Strict Diagonal Dominance

A matrix is strictly row diagonally dominant if

$$
|a_{ii}|
>
\sum_{j\ne i}|a_{ij}|
$$

for every row \(i\).

If \(A\) is strictly diagonally dominant, then the Gauss-Seidel method converges. This is a standard sufficient condition.

Strict diagonal dominance means that each equation is controlled mainly by its own variable. The off-diagonal coupling is not strong enough to prevent contraction of the iteration error.

This condition is sufficient, but not necessary. Gauss-Seidel may converge even when strict diagonal dominance fails.
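
The condition is easy to test programmatically; a minimal sketch assuming NumPy:

```python
import numpy as np

def is_strictly_row_diagonally_dominant(A):
    """Check |a_ii| > sum of |a_ij| over j != i for every row."""
    diag = np.abs(np.diag(A))
    off_diag = np.abs(A).sum(axis=1) - diag
    return bool(np.all(diag > off_diag))
```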

## 90.10 Symmetric Positive Definite Matrices

Another important convergence result applies to symmetric positive definite matrices.

If \(A\) is symmetric positive definite, then Gauss-Seidel converges.

This class appears frequently in applications, including:

| Source | Matrix type |
|---|---|
| finite difference discretizations | sparse SPD matrices |
| finite element methods | sparse SPD matrices |
| quadratic minimization | Hessian matrices |
| graph Laplacians with constraints | positive definite reductions |

The SPD case is especially important because it also supports conjugate gradient methods.
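
One practical way to test membership in this class is a Cholesky attempt; the helper below is an illustrative sketch, not a library routine:

```python
import numpy as np

def is_spd(A):
    """Heuristic SPD test: symmetric, and Cholesky factorization succeeds."""
    if not np.allclose(A, A.T):
        return False
    try:
        np.linalg.cholesky(A)
        return True
    except np.linalg.LinAlgError:
        return False
```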

## 90.11 Geometric Interpretation

Gauss-Seidel may be interpreted as solving one coordinate direction at a time.

At each step, the method adjusts \(x_i\) so that the \(i\)-th equation is satisfied using the newest known values.

After updating \(x_1\), the first equation is satisfied relative to the current state. After updating \(x_2\), the second equation is satisfied relative to the updated \(x_1\), and so on.

The method sweeps through the equations. Each sweep reduces error when the coupling structure is favorable.

## 90.12 Residual Form

The residual at step \(k\) is

$$
r^{(k)}=b-Ax^{(k)}.
$$

Gauss-Seidel can be interpreted as applying a lower-triangular correction.

From

$$
(D+L)x^{(k+1)} =
-Ux^{(k)}+b,
$$

subtract

$$
(D+L)x^{(k)}
$$

from both sides:

$$
(D+L)(x^{(k+1)}-x^{(k)}) =
b-(D+L+U)x^{(k)}.
$$

Thus

$$
(D+L)(x^{(k+1)}-x^{(k)}) =
r^{(k)}.
$$

Therefore,

$$
x^{(k+1)} =
x^{(k)}+(D+L)^{-1}r^{(k)}.
$$

Gauss-Seidel updates the current approximation by applying a lower-triangular solve to the residual.
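
This residual form translates directly into code: compute the residual, then apply one forward substitution with \(D+L\) (a sketch assuming SciPy):

```python
import numpy as np
from scipy.linalg import solve_triangular

def gs_step_residual(A, b, x):
    """Gauss-Seidel step in residual form: x <- x + (D+L)^{-1} r."""
    r = b - A @ x
    return x + solve_triangular(np.tril(A), r, lower=True)
```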

## 90.13 Algorithm

Input:

$$
A,\quad b,\quad x^{(0)},\quad \tau,\quad k_{\max}.
$$

For

$$
k=0,1,\ldots,k_{\max}-1:
$$

update each component in order:

$$
x_i
\leftarrow
\frac{1}{a_{ii}}
\left(
b_i -
\sum_{j<i}a_{ij}x_j -
\sum_{j>i}a_{ij}x_j
\right).
$$

Here the same vector \(x\) is overwritten in place.

After a full sweep, compute the residual

$$
r=b-Ax.
$$

Stop if

$$
\frac{\|r\|}{\|b\|}
\le \tau.
$$

Return the current \(x\).

## 90.14 Pseudocode

```text
gauss_seidel(A, b, x, tol, max_iter):
    n = length(b)

    for k = 0 to max_iter - 1:
        for i = 1 to n:
            s1 = 0   # accumulates A[i,j] * x[j] for j < i (already updated this sweep)
            s2 = 0   # accumulates A[i,j] * x[j] for j > i (still from the previous sweep)

            for j = 1 to i - 1:
                s1 = s1 + A[i,j] * x[j]

            for j = i + 1 to n:
                s2 = s2 + A[i,j] * x[j]

            x[i] = (b[i] - s1 - s2) / A[i,i]

        r = b - A * x

        if norm(r) / norm(b) <= tol:
            return x

    return x
```

Unlike Jacobi, the method updates \(x\) in place.

This is why Gauss-Seidel needs less vector storage but has less natural parallelism.
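
A direct, runnable translation into NumPy might look as follows; the tolerance default and names are illustrative:

```python
import numpy as np

def gauss_seidel(A, b, x0, tol=1e-10, max_iter=1000):
    """Gauss-Seidel with a relative-residual stopping test (assumes b != 0)."""
    x = x0.astype(float).copy()
    b_norm = np.linalg.norm(b)
    for _ in range(max_iter):
        for i in range(len(b)):
            s1 = A[i, :i] @ x[:i]      # new values, j < i
            s2 = A[i, i+1:] @ x[i+1:]  # old values, j > i
            x[i] = (b[i] - s1 - s2) / A[i, i]
        if np.linalg.norm(b - A @ x) <= tol * b_norm:
            break
    return x
```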

## 90.15 Storage Requirements

Gauss-Seidel needs:

| Object | Storage |
|---|---:|
| Matrix \(A\) | problem-dependent |
| Right-hand side \(b\) | \(n\) |
| Current vector \(x\) | \(n\) |
| Residual, optional | \(n\) |

Jacobi usually stores both \(x^{(k)}\) and \(x^{(k+1)}\).

Gauss-Seidel can overwrite \(x\), so it often needs one fewer vector.

For very large sparse systems, this memory difference may matter.

## 90.16 Computational Cost

For a dense \(n\times n\) matrix, one sweep costs

$$
O(n^2).
$$

For a sparse matrix, one sweep costs approximately

$$
O(\operatorname{nnz}(A)).
$$

The cost per iteration is similar to Jacobi.

The difference lies in convergence and parallelism.

Gauss-Seidel usually reduces error faster per sweep, but Jacobi is easier to parallelize.

## 90.17 Sparse Implementation

For a sparse matrix stored by rows, one Gauss-Seidel update has the form:

```text
for i = 1 to n:
    s = 0
    diag = 0

    for each nonzero (j, aij) in row i:
        if j == i:
            diag = aij
        else:
            s = s + aij * x[j]

    x[i] = (b[i] - s) / diag
```

This works because values \(x[j]\) for \(j<i\) have already been updated during the current sweep, while values for \(j>i\) still come from the previous sweep.

The ordering of unknowns therefore affects the method.
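
A concrete version for SciPy's CSR format might look as follows; this is an illustrative sketch, and a production kernel would typically be written in a compiled language:

```python
import numpy as np
from scipy.sparse import csr_matrix

def gauss_seidel_sweep_csr(A, b, x):
    """One Gauss-Seidel sweep over a CSR matrix, updating x in place."""
    for i in range(A.shape[0]):
        s, diag = 0.0, 0.0
        for k in range(A.indptr[i], A.indptr[i + 1]):
            j = A.indices[k]
            if j == i:
                diag = A.data[k]
            else:
                s += A.data[k] * x[j]  # new if j < i, old if j > i
        x[i] = (b[i] - s) / diag
    return x

A = csr_matrix(np.array([[4.0, 1.0], [1.0, 3.0]]))
x = gauss_seidel_sweep_csr(A, np.array([9.0, 7.0]), np.zeros(2))
```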

## 90.18 Dependence on Ordering

Gauss-Seidel depends on the order in which variables are updated.

Changing the order changes the splitting of \(A\) into lower and upper parts.

For some matrices, one ordering may converge faster than another.

Ordering matters especially for sparse matrices from grids or graphs.

Common orderings include:

| Ordering | Use |
|---|---|
| natural ordering | simplest implementation |
| red-black ordering | exposes parallelism on grids |
| bandwidth-reducing ordering | improves locality |
| graph coloring | parallel Gauss-Seidel variants |

This dependence is both a weakness and a tool.

## 90.19 Parallel Gauss-Seidel

Basic Gauss-Seidel is sequential because each update depends on previous updates in the same sweep.

Parallel versions are possible when variables can be grouped so that updates within a group do not depend on each other.

For example, red-black Gauss-Seidel splits grid points into two sets. All red points are updated first, then all black points.

Graph coloring generalizes this idea.

If two variables are not coupled by a nonzero matrix entry, they may be updated simultaneously.
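
As an illustration, a red-black sweep for the one-dimensional Poisson stencil can update each color with a single vectorized statement; the stencil and boundary handling below are assumptions chosen for this example:

```python
import numpy as np

def red_black_sweep(b, x, h):
    """One red-black Gauss-Seidel sweep for (2x[i] - x[i-1] - x[i+1])/h^2 = b[i].
    x[0] and x[-1] hold fixed boundary values."""
    for start in (1, 2):  # odd-indexed (red) points first, then even (black)
        idx = np.arange(start, len(x) - 1, 2)
        # Points of one color depend only on the other color, so this
        # vectorized update is a legitimate parallel Gauss-Seidel step.
        x[idx] = 0.5 * (x[idx - 1] + x[idx + 1] + h * h * b[idx])
    return x
```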

## 90.20 Symmetric Gauss-Seidel

Symmetric Gauss-Seidel performs two sweeps:

1. a forward sweep,
2. a backward sweep.

The forward sweep uses the usual order:

$$
1,2,\ldots,n.
$$

The backward sweep uses the reverse order:

$$
n,n-1,\ldots,1.
$$

This produces a symmetric overall operation: the combined sweep corresponds to the splitting matrix \((D+L)D^{-1}(D+U)\), which is symmetric whenever \(A\) is.

Symmetric Gauss-Seidel is often used as a preconditioner, especially for symmetric systems.
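
In code, a symmetric sweep simply runs the in-place update forward and then backward (a sketch assuming NumPy):

```python
import numpy as np

def symmetric_gauss_seidel_sweep(A, b, x):
    """One symmetric Gauss-Seidel iteration: forward sweep, then backward sweep."""
    n = len(b)
    for i in range(n):                 # forward sweep: 1, 2, ..., n
        s = A[i, :] @ x - A[i, i] * x[i]  # sum over j != i, current values
        x[i] = (b[i] - s) / A[i, i]
    for i in reversed(range(n)):       # backward sweep: n, n-1, ..., 1
        s = A[i, :] @ x - A[i, i] * x[i]
        x[i] = (b[i] - s) / A[i, i]
    return x
```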

## 90.21 Gauss-Seidel as Preconditioner

From the residual form,

$$
x^{(k+1)} =
x^{(k)}+(D+L)^{-1}r^{(k)}.
$$

This suggests the preconditioner

$$
P=D+L.
$$

Applying the preconditioner means solving a lower triangular system.

This is more expensive than Jacobi preconditioning but may be stronger because it includes lower-triangular coupling.

For symmetric methods, symmetric Gauss-Seidel or SSOR is often preferred.
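
As a sketch of how this plugs into a Krylov solver, the snippet below wraps the triangular solve as a SciPy `LinearOperator` and passes it to GMRES; the matrix and right-hand side are placeholders:

```python
import numpy as np
from scipy.linalg import solve_triangular
from scipy.sparse.linalg import LinearOperator, gmres

A = np.array([[4.0, 1.0, 0.0],
              [1.0, 3.0, 1.0],
              [0.0, 1.0, 2.0]])
b = np.array([1.0, 2.0, 3.0])

# Applying the preconditioner means one forward substitution with P = D + L.
M = LinearOperator(A.shape,
                   matvec=lambda r: solve_triangular(np.tril(A), r, lower=True))

x, info = gmres(A, b, M=M)  # info == 0 signals convergence
```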

## 90.22 Relation to Coordinate Descent

For symmetric positive definite \(A\), solving

$$
Ax=b
$$

is equivalent to minimizing the quadratic function

$$
\phi(x)=\frac{1}{2}x^TAx-b^Tx.
$$

Gauss-Seidel may be interpreted as exact coordinate minimization of this quadratic.

Each update chooses one coordinate \(x_i\) while holding the others fixed, minimizing \(\phi\) along that coordinate direction.
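
Indeed, for symmetric \(A\), setting the partial derivative of \(\phi\) with respect to \(x_i\) to zero recovers the update rule:

$$
\frac{\partial \phi}{\partial x_i}
=\sum_{j} a_{ij}x_j-b_i=0
\quad\Longrightarrow\quad
x_i=\frac{1}{a_{ii}}\left(b_i-\sum_{j\ne i}a_{ij}x_j\right),
$$

which is exactly the Gauss-Seidel formula evaluated with the current values of the other coordinates.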

This interpretation explains why Gauss-Seidel converges for symmetric positive definite systems.

## 90.23 Failure Modes

Gauss-Seidel may fail or perform poorly for several reasons.

| Failure mode | Cause |
|---|---|
| Divergence | \(\rho(G_{GS})\ge 1\) |
| Slow convergence | spectral radius close to \(1\) |
| Division by zero | zero diagonal entry |
| Poor ordering | unfavorable triangular splitting |
| Stagnation | ill-conditioning or weak smoothing |
| Misleading residual | ill-conditioned matrix |

A robust implementation should check diagonal entries and enforce a maximum iteration count.

## 90.24 Residual Versus Error

As with all iterative solvers, the residual does not directly equal the error.

For

$$
r^{(k)}=b-Ax^{(k)}
$$

and

$$
e^{(k)}=x-x^{(k)},
$$

we have

$$
Ae^{(k)}=r^{(k)}.
$$

If \(A\) is invertible,

$$
e^{(k)}=A^{-1}r^{(k)}.
$$

Thus

$$
\|e^{(k)}\|
\le
\|A^{-1}\|\|r^{(k)}\|.
$$

If \(A\) is ill-conditioned, a small residual may still correspond to a significant solution error.

## 90.25 Practical Use

Gauss-Seidel is simple, but it is rarely the best standalone method for large difficult systems.

It remains useful because it is:

| Role | Reason |
|---|---|
| teaching method | shows sequential stationary iteration |
| smoother | useful in multigrid |
| preconditioner | cheap triangular correction |
| baseline solver | simple reference method |
| local relaxation method | natural for grid equations |

In modern large-scale linear algebra, Gauss-Seidel is often used inside larger algorithms rather than as the final solver.

## 90.26 Summary

The Gauss-Seidel method solves

$$
Ax=b
$$

by the iteration

$$
x_i^{(k+1)} =
\frac{1}{a_{ii}}
\left(
b_i -
\sum_{j<i}a_{ij}x_j^{(k+1)} -
\sum_{j>i}a_{ij}x_j^{(k)}
\right).
$$

In matrix form,

$$
(D+L)x^{(k+1)}=-Ux^{(k)}+b.
$$

The iteration matrix is

$$
G_{GS}=-(D+L)^{-1}U.
$$

The method converges for every initial guess exactly when

$$
\rho(G_{GS})<1.
$$

Strict diagonal dominance and symmetric positive definiteness are important sufficient conditions for convergence. Gauss-Seidel often converges faster than Jacobi, but it is more sequential and depends on variable ordering.
