# Chapter 125. Finite Element Methods

# Chapter 125. Finite Element Methods

Finite element methods are numerical methods for approximating solutions of differential equations.

They are used when an exact solution is unavailable, when the domain has a complicated shape, or when the coefficients and boundary conditions make analytic methods impractical. Finite element methods appear in structural mechanics, heat transfer, electromagnetics, fluid flow, acoustics, geophysics, biomechanics, and scientific computing.

Linear algebra enters because the method turns a differential equation into a finite system of algebraic equations. The final computational problem usually has the form

$$
Ku=f.
$$

Here \(K\) is a sparse matrix, \(u\) is a vector of unknown coefficients, and \(f\) is a vector of known data. The matrix is assembled from many small local element matrices. This assembly process is one of the defining features of the finite element method. Stanford notes describe the weak formulation as a variational boundary-value problem, and ETH notes emphasize elementwise integration and assembly into sparse linear systems.

## 125.1 Boundary Value Problems

A boundary value problem asks for an unknown function satisfying a differential equation inside a domain and boundary conditions on the boundary.

Let

$$
\Omega \subset \mathbb{R}^d
$$

be a domain.

A simple model problem is the Poisson equation:

$$
-\Delta u=f
\quad \text{in } \Omega,
$$

with boundary condition

$$
u=0
\quad \text{on } \partial \Omega.
$$

Here:

| Symbol | Meaning |
|---|---|
| \(\Omega\) | Physical domain |
| \(\partial\Omega\) | Boundary of the domain |
| \(u\) | Unknown function |
| \(f\) | Given source term |
| \(\Delta\) | Laplace operator |

The goal is to approximate \(u\).

## 125.2 Classical and Weak Forms

A classical solution must have enough derivatives for the differential equation to hold pointwise.

For the Poisson equation,

$$
-\Delta u=f,
$$

a classical solution requires second derivatives.

Finite element methods usually use a weak form instead. The weak form lowers differentiability requirements and expresses the equation through integrals.

Multiply the equation by a test function \(v\), integrate over \(\Omega\), and use integration by parts:

$$
\int_{\Omega} \nabla u \cdot \nabla v \, dx =
\int_{\Omega} f v \, dx.
$$

This is the weak form.

The unknown \(u\) and test function \(v\) belong to a suitable function space. ETH notes state that this weak problem may have a weak solution that lacks enough smoothness to be classical, while classical solutions satisfy the weak form as well.

## 125.3 Trial and Test Spaces

The weak form is written as:

Find \(u \in V\) such that

$$
a(u,v)=\ell(v)
\quad \text{for all } v\in V.
$$

Here

$$
a(u,v)=\int_{\Omega}\nabla u\cdot\nabla v\,dx
$$

is a bilinear form, and

$$
\ell(v)=\int_{\Omega}fv\,dx
$$

is a linear functional.

The space \(V\) contains functions satisfying the essential boundary conditions. For the zero boundary condition \(u=0\) on \(\partial\Omega\), a common choice is

$$
V=H_0^1(\Omega).
$$

The finite element method replaces \(V\) by a finite-dimensional subspace.

## 125.4 Discretization

The domain \(\Omega\) is divided into small pieces called elements.

In one dimension, elements are intervals.

In two dimensions, common elements are triangles and quadrilaterals.

In three dimensions, common elements are tetrahedra and hexahedra.

This division is called a mesh.

A mesh has:

| Object | Meaning |
|---|---|
| Nodes | Points where degrees of freedom are placed |
| Elements | Small subdomains |
| Edges | One-dimensional element boundaries |
| Faces | Two-dimensional element boundaries |
| Connectivity | Which nodes belong to which elements |

The mesh converts a continuous geometric problem into a finite combinatorial structure.

## 125.5 Finite Element Space

A finite element space is a finite-dimensional space of functions built from local polynomial pieces.

For example, on a triangular mesh, one may use continuous piecewise linear functions.

Each basis function \(\phi_i\) is associated with a node \(i\).

It satisfies

$$
\phi_i(x_j)=\delta_{ij}.
$$

Thus \(\phi_i\) equals \(1\) at node \(i\) and \(0\) at other nodes.

The finite element approximation has the form

$$
u_h(x)=\sum_{j=1}^N U_j\phi_j(x).
$$

The unknowns are the coefficients

$$
U_1,\ldots,U_N.
$$

These coefficients often represent the approximate values of the solution at mesh nodes.

## 125.6 Galerkin Method

The Galerkin method uses the same finite-dimensional space for trial and test functions.

Substitute

$$
u_h=\sum_{j=1}^N U_j\phi_j
$$

into the weak form and require

$$
a(u_h,\phi_i)=\ell(\phi_i)
$$

for every basis function \(\phi_i\).

Then

$$
a\left(\sum_{j=1}^N U_j\phi_j,\phi_i\right) =
\ell(\phi_i).
$$

By linearity,

$$
\sum_{j=1}^N U_j a(\phi_j,\phi_i) =
\ell(\phi_i).
$$

This is a linear system.

## 125.7 Stiffness Matrix

Define the stiffness matrix \(K\) by

$$
K_{ij}=a(\phi_j,\phi_i).
$$

For the Poisson problem,

$$
K_{ij} =
\int_{\Omega}\nabla \phi_j\cdot\nabla \phi_i\,dx.
$$

Define the load vector \(F\) by

$$
F_i=\ell(\phi_i) =
\int_{\Omega} f\phi_i\,dx.
$$

Then the finite element equations are

$$
KU=F.
$$

This is the main linear algebraic form of the finite element method.

## 125.8 Sparsity

Finite element matrices are sparse.

Each basis function \(\phi_i\) is nonzero only on elements touching node \(i\). Therefore

$$
K_{ij}=0
$$

unless the supports of \(\phi_i\) and \(\phi_j\) overlap.

ETH notes highlight this point: the integrals defining the system matrix are nonzero only when basis-function supports overlap, so most entries of the matrix are zero.

Sparsity is essential for large-scale computation.

A three-dimensional finite element model may have millions or billions of degrees of freedom. Dense matrix storage would be impossible. Sparse storage and iterative solvers make the computation feasible.

## 125.9 Element Matrices

The global stiffness matrix is assembled from local element matrices.

For each element \(e\), define a local matrix

$$
K^{(e)}_{ab} =
\int_{\Omega_e}
\nabla \phi_b^{(e)}\cdot\nabla \phi_a^{(e)}\,dx.
$$

Here \(\Omega_e\) is the element domain, and \(a,b\) are local node indices.

Each local matrix is small. For a linear triangular element, it is \(3\times 3\). For a linear tetrahedral element, it is \(4\times 4\).

The local matrix contributes to entries of the global matrix according to the element connectivity.

This assembly process is a structured sum of small matrices into one large sparse matrix.

## 125.10 Assembly

Let an element \(e\) have local nodes

$$
a=1,\ldots,m.
$$

Let

$$
I_e(a)
$$

be the global node number corresponding to local node \(a\).

Then assembly adds

$$
K^{(e)}_{ab}
$$

to the global entry

$$
K_{I_e(a),I_e(b)}.
$$

Likewise, the element load vector contributes to the global load vector.

In pseudocode:

```text
for each element e:
    compute local matrix Ke
    compute local vector Fe
    for local nodes a, b:
        K[global(a), global(b)] += Ke[a,b]
    for local node a:
        F[global(a)] += Fe[a]
```

This pattern is central in finite element implementation.

## 125.11 Reference Elements

Element integrals are usually computed by mapping from a reference element.

For a triangular element, the reference triangle may be

$$
\hat{T} =
\{(\xi,\eta): \xi\geq 0,\ \eta\geq 0,\ \xi+\eta\leq 1\}.
$$

The reference basis functions for a linear triangle are

$$
\hat{\phi}_1=1-\xi-\eta,
$$

$$
\hat{\phi}_2=\xi,
$$

$$
\hat{\phi}_3=\eta.
$$

Each physical triangle is obtained from the reference triangle by an affine map.

ETH notes give this standard mapping from the reference triangle to a physical triangle using the three reference shape functions.

## 125.12 Element Mapping

Let the physical triangle have vertices

$$
x_1,\quad x_2,\quad x_3.
$$

The map from reference coordinates to physical coordinates is

$$
x(\xi,\eta) =
x_1\hat{\phi}_1(\xi,\eta)
+
x_2\hat{\phi}_2(\xi,\eta)
+
x_3\hat{\phi}_3(\xi,\eta).
$$

For an affine triangular element, the Jacobian matrix is constant.

It records how the reference element is stretched, rotated, and sheared to form the physical element.

Integrals transform according to

$$
\int_{\Omega_e} g(x)\,dx =
\int_{\hat{\Omega}} g(x(\xi)) |\det J|\,d\xi.
$$

This change of variables is how one formula on the reference element can be reused for every physical element.

## 125.13 Shape Functions

Shape functions are local basis functions on an element.

For a linear triangle, the shape functions are the barycentric coordinate functions.

They satisfy:

| Property | Meaning |
|---|---|
| \(\phi_a(x_b)=\delta_{ab}\) | Nodal interpolation |
| \(\phi_1+\phi_2+\phi_3=1\) | Partition of unity |
| Each \(\phi_a\) is linear | First-order approximation |
| Gradients are constant | Simple stiffness integrals |

Higher-order elements use higher-degree polynomials.

For example, quadratic triangular elements have six nodes: three at vertices and three at edge midpoints.

Higher-order elements can improve accuracy when the solution is smooth.

## 125.14 Mass Matrix

In time-dependent problems, another important matrix appears: the mass matrix.

It is defined by

$$
M_{ij} =
\int_{\Omega}\phi_j\phi_i\,dx.
$$

For example, the heat equation

$$
u_t-\Delta u=f
$$

leads to a system of ordinary differential equations:

$$
M\dot{U}+KU=F.
$$

Here \(M\) comes from the time derivative term, and \(K\) comes from the diffusion term.

The mass matrix is sparse for the same reason as the stiffness matrix: basis functions have local support.

## 125.15 Boundary Conditions

Boundary conditions must be incorporated into the algebraic system.

Common types are:

| Type | Mathematical form | FEM treatment |
|---|---|---|
| Dirichlet | \(u=g\) on boundary | Fix degrees of freedom |
| Neumann | \(\frac{\partial u}{\partial n}=h\) | Add boundary integral |
| Robin | \(\alpha u+\beta \frac{\partial u}{\partial n}=r\) | Add matrix and vector terms |

Dirichlet conditions are called essential boundary conditions because they constrain the trial space.

Neumann conditions are called natural boundary conditions because they appear naturally in the weak form after integration by parts.

## 125.16 Error and Mesh Size

Let \(h\) denote the mesh size, often the largest element diameter.

The finite element solution \(u_h\) should approach the exact solution \(u\) as

$$
h\to 0.
$$

This process is called mesh refinement.

For smooth solutions and suitable polynomial spaces, smaller elements give better approximations.

The error depends on:

| Factor | Effect |
|---|---|
| Mesh size | Smaller elements improve resolution |
| Polynomial degree | Higher degree improves local approximation |
| Solution smoothness | Smooth solutions converge faster |
| Element quality | Distorted elements may hurt accuracy |
| Solver accuracy | Algebraic error affects final result |

Finite element analysis studies precise error bounds.

## 125.17 Energy Interpretation

For many elliptic problems, the weak solution minimizes an energy functional.

For example, the Poisson problem with zero boundary condition corresponds to minimizing

$$
J(v) =
\frac12\int_{\Omega}|\nabla v|^2\,dx -
\int_{\Omega}fv\,dx.
$$

The finite element solution minimizes this energy over the finite element space.

Thus the method is a constrained approximation problem:

$$
u_h =
\arg\min_{v_h\in V_h} J(v_h).
$$

This connects finite element methods with projection, least squares, optimization, and variational calculus.

## 125.18 Linear Elasticity

Finite element methods are widely used in structural mechanics.

In linear elasticity, the unknown is a displacement field

$$
u(x)\in\mathbb{R}^d.
$$

The strain is derived from gradients of \(u\), and the stress is related to strain by material laws.

The finite element system again has the form

$$
KU=F.
$$

Here:

| Object | Meaning |
|---|---|
| \(K\) | Structural stiffness matrix |
| \(U\) | Nodal displacement vector |
| \(F\) | Applied force vector |

The entries of \(K\) encode element geometry, material stiffness, and basis-function gradients.

## 125.19 Heat Equation

The heat equation is

$$
u_t-\nabla\cdot(k\nabla u)=f.
$$

After finite element discretization in space, it becomes

$$
M\dot{U}+KU=F.
$$

A time-stepping method then converts this into algebraic systems.

For backward Euler with time step \(\Delta t\),

$$
(M+\Delta t K)U^{n+1} =
MU^n+\Delta t F^{n+1}.
$$

Thus each time step requires solving a sparse linear system.

## 125.20 Eigenvalue Problems

Finite element methods also approximate eigenvalue problems.

For example, vibration modes may satisfy

$$
Ku=\lambda Mu.
$$

Here \(K\) is a stiffness matrix and \(M\) is a mass matrix.

The eigenvectors represent mode shapes. The eigenvalues relate to natural frequencies.

This generalized eigenvalue problem appears in structural dynamics, acoustics, quantum mechanics, and wave propagation.

It is a finite-dimensional approximation of an operator eigenvalue problem.

## 125.21 Nonlinear Problems

Many physical models are nonlinear.

A nonlinear finite element problem has the form

$$
R(U)=0,
$$

where \(R\) is a residual vector.

Newton's method linearizes the problem at each iteration:

$$
J(U_k)\Delta U=-R(U_k),
$$

$$
U_{k+1}=U_k+\Delta U.
$$

Here \(J(U_k)\) is the Jacobian matrix of the residual.

Thus nonlinear finite element methods produce a sequence of sparse linear systems.

Linear algebra remains central even when the original model is nonlinear.

## 125.22 Conditioning

Finite element matrices can be ill-conditioned.

The condition number often grows as the mesh is refined.

For elliptic problems, a typical behavior is

$$
\kappa(K) \sim h^{-2},
$$

for standard discretizations.

Poor conditioning slows iterative solvers and amplifies numerical error.

Preconditioning is therefore essential in large-scale finite element computation.

Common preconditioners include incomplete factorizations, multigrid methods, domain decomposition, and algebraic multigrid.

## 125.23 Iterative Solvers

Large finite element systems are usually solved by iterative methods.

For symmetric positive definite systems, the conjugate gradient method is common.

For nonsymmetric systems, GMRES or BiCGSTAB may be used.

The choice depends on the matrix structure.

| Matrix property | Common solver |
|---|---|
| Symmetric positive definite | Conjugate gradient |
| Symmetric indefinite | MINRES |
| Nonsymmetric | GMRES |
| Saddle point | Block preconditioned Krylov methods |
| Very large elliptic | Multigrid |

Sparse matrix-vector multiplication is the main operation in many of these methods.

## 125.24 Adaptivity

Adaptive finite element methods refine the mesh where more accuracy is needed.

The usual loop is:

| Step | Operation |
|---|---|
| Solve | Compute \(u_h\) |
| Estimate | Estimate local error |
| Mark | Choose elements to refine |
| Refine | Improve the mesh |
| Repeat | Solve again |

Adaptivity is useful when the solution has local features such as singularities, boundary layers, shocks, cracks, or sharp gradients.

It puts computational effort where it matters most.

## 125.25 Finite Element Methods and Linear Algebra

The dictionary is direct.

| FEM concept | Linear algebra object |
|---|---|
| Mesh node | Degree of freedom |
| Basis function | Coordinate basis vector in function space |
| Approximate solution | Coefficient vector |
| Weak form | Bilinear form equation |
| Stiffness matrix | Matrix of bilinear form |
| Load vector | Vector of linear functional values |
| Assembly | Sparse matrix accumulation |
| Boundary condition | Constraint on unknowns |
| Time-dependent problem | Matrix ODE |
| Vibration problem | Generalized eigenvalue problem |
| Nonlinear FEM | Newton linear systems |
| Error reduction | Projection and refinement |

Finite element methods are finite-dimensional linear algebra applied to function spaces and geometry.

## 125.26 Summary

Finite element methods approximate differential equations by replacing an infinite-dimensional problem with a finite-dimensional one.

The process begins with a weak form:

$$
a(u,v)=\ell(v).
$$

A finite element space \(V_h\) is chosen using a mesh and local polynomial basis functions. The approximate solution is written as

$$
u_h=\sum_{j=1}^N U_j\phi_j.
$$

Testing against each basis function produces the linear system

$$
KU=F.
$$

The stiffness matrix and load vector are computed element by element and assembled into global sparse arrays.

The central principle is localization. A complicated global problem is built from simple local polynomial approximations on small elements. Linear algebra then combines these local contributions into a sparse system that can be solved computationally.
