
Chapter 125. Finite Element Methods

Finite element methods are numerical methods for approximating solutions of differential equations.

They are used when an exact solution is unavailable, when the domain has a complicated shape, or when the coefficients and boundary conditions make analytic methods impractical. Finite element methods appear in structural mechanics, heat transfer, electromagnetics, fluid flow, acoustics, geophysics, biomechanics, and scientific computing.

Linear algebra enters because the method turns a differential equation into a finite system of algebraic equations. The final computational problem usually has the form

KU = F.

Here K is a sparse matrix, U is a vector of unknown coefficients, and F is a vector of known data. The matrix is assembled from many small local element matrices. This assembly process is one of the defining features of the finite element method. Stanford notes describe the weak formulation as a variational boundary-value problem, and ETH notes emphasize elementwise integration and assembly into sparse linear systems.

125.1 Boundary Value Problems

A boundary value problem asks for an unknown function satisfying a differential equation inside a domain and boundary conditions on the boundary.

Let

\Omega \subset \mathbb{R}^d

be a domain.

A simple model problem is the Poisson equation:

-\Delta u = f \quad \text{in } \Omega,

with boundary condition

u = 0 \quad \text{on } \partial\Omega.

Here:

| Symbol | Meaning |
| --- | --- |
| Ω | Physical domain |
| ∂Ω | Boundary of the domain |
| u | Unknown function |
| f | Given source term |
| Δ | Laplace operator |

The goal is to approximate u.

125.2 Classical and Weak Forms

A classical solution must have enough derivatives for the differential equation to hold pointwise.

For the Poisson equation,

-\Delta u = f,

a classical solution requires second derivatives.

Finite element methods usually use a weak form instead. The weak form lowers differentiability requirements and expresses the equation through integrals.

Multiply the equation by a test function v, integrate over Ω, and use integration by parts:

\int_{\Omega} \nabla u \cdot \nabla v \, dx = \int_{\Omega} f v \, dx.

This is the weak form.

The unknown u and the test function v belong to a suitable function space. ETH notes point out that a weak solution may lack the smoothness needed to be classical, while every classical solution also satisfies the weak form.
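Spelled out, the integration-by-parts step reads:

```latex
\int_{\Omega} (-\Delta u)\, v \, dx
  = \int_{\Omega} \nabla u \cdot \nabla v \, dx
  - \int_{\partial\Omega} \frac{\partial u}{\partial n}\, v \, ds,
```

and the boundary integral vanishes because the test function v is zero on ∂Ω, which leaves exactly the weak form.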

125.3 Trial and Test Spaces

The weak form is written as:

Find u ∈ V such that

a(u,v) = \ell(v) \quad \text{for all } v \in V.

Here

a(u,v) = \int_{\Omega} \nabla u \cdot \nabla v \, dx

is a bilinear form, and

\ell(v) = \int_{\Omega} f v \, dx

is a linear functional.

The space V contains functions satisfying the essential boundary conditions. For the zero boundary condition u = 0 on ∂Ω, a common choice is

V = H_0^1(\Omega).

The finite element method replaces V by a finite-dimensional subspace.

125.4 Discretization

The domain Ω is divided into small pieces called elements.

In one dimension, elements are intervals.

In two dimensions, common elements are triangles and quadrilaterals.

In three dimensions, common elements are tetrahedra and hexahedra.

This division is called a mesh.

A mesh has:

| Object | Meaning |
| --- | --- |
| Nodes | Points where degrees of freedom are placed |
| Elements | Small subdomains |
| Edges | One-dimensional element boundaries |
| Faces | Two-dimensional element boundaries |
| Connectivity | Which nodes belong to which elements |

The mesh converts a continuous geometric problem into a finite combinatorial structure.

125.5 Finite Element Space

A finite element space is a finite-dimensional space of functions built from local polynomial pieces.

For example, on a triangular mesh, one may use continuous piecewise linear functions.

Each basis function ϕ_i is associated with a node i.

It satisfies

\phi_i(x_j) = \delta_{ij}.

Thus ϕ_i equals 1 at node i and 0 at every other node.

The finite element approximation has the form

u_h(x) = \sum_{j=1}^N U_j \phi_j(x).

The unknowns are the coefficients

U_1, \ldots, U_N.

These coefficients often represent the approximate values of the solution at mesh nodes.
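A minimal Python sketch makes the nodal property concrete for the 1D case, assuming a uniform mesh of [0, 1] (the helper name `hat` is illustrative, not a standard API):

```python
# Piecewise linear "hat" basis functions on a uniform 1D mesh of [0, 1].
# Node i sits at x_i = i*h; phi_i is 1 at x_i, 0 at every other node,
# and linear on each element.

def hat(i, x, n):
    """Value of basis function phi_i at point x, with n+1 nodes on [0, 1]."""
    h = 1.0 / n
    xi = i * h
    return max(0.0, 1.0 - abs(x - xi) / h)

n = 4                                    # 4 elements, 5 nodes
nodes = [j / n for j in range(n + 1)]

# Nodal interpolation property: phi_i(x_j) = delta_ij
print([hat(2, xj, n) for xj in nodes])   # [0.0, 0.0, 1.0, 0.0, 0.0]

# Partition of unity: the basis functions sum to 1 everywhere
print(sum(hat(i, 0.3, n) for i in range(n + 1)))  # 1.0 up to rounding
```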

125.6 Galerkin Method

The Galerkin method uses the same finite-dimensional space for trial and test functions.

Substitute

u_h = \sum_{j=1}^N U_j \phi_j

into the weak form and require

a(u_h, \phi_i) = \ell(\phi_i)

for every basis function ϕ_i.

Then

a\left(\sum_{j=1}^N U_j \phi_j, \phi_i\right) = \ell(\phi_i).

By linearity,

\sum_{j=1}^N U_j \, a(\phi_j, \phi_i) = \ell(\phi_i).

This is a linear system.

125.7 Stiffness Matrix

Define the stiffness matrix K by

K_{ij} = a(\phi_j, \phi_i).

For the Poisson problem,

K_{ij} = \int_{\Omega} \nabla\phi_j \cdot \nabla\phi_i \, dx.

Define the load vector F by

F_i = \ell(\phi_i) = \int_{\Omega} f \phi_i \, dx.

Then the finite element equations are

KU = F.

This is the main linear algebraic form of the finite element method.
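The whole pipeline can be sketched for the 1D model problem −u″ = 1 on (0, 1) with zero boundary values, where the stiffness matrix is tridiagonal and the system can be solved directly (an illustrative sketch, assuming exact integration of the load; in 1D with exact integration the nodal values come out exact):

```python
# Sketch: assemble and solve KU = F for the 1D Poisson problem
# -u'' = 1 on (0, 1) with u(0) = u(1) = 0, using linear elements
# on a uniform mesh. Exact solution: u(x) = x(1 - x)/2.

def solve_poisson_1d(n):
    """Return interior nodal values U_1..U_{n-1} on a mesh with n elements."""
    h = 1.0 / n
    m = n - 1                      # number of interior nodes
    # Stiffness matrix: K_ii = 2/h, K_{i,i+1} = K_{i+1,i} = -1/h
    main = [2.0 / h] * m
    off = [-1.0 / h] * (m - 1)
    # Load vector for f = 1: F_i = integral of phi_i = h
    F = [h] * m

    # Thomas algorithm for the symmetric tridiagonal system
    for i in range(1, m):
        w = off[i - 1] / main[i - 1]
        main[i] -= w * off[i - 1]
        F[i] -= w * F[i - 1]
    U = [0.0] * m
    U[-1] = F[-1] / main[-1]
    for i in range(m - 2, -1, -1):
        U[i] = (F[i] - off[i] * U[i + 1]) / main[i]
    return U

U = solve_poisson_1d(8)
exact = [x * (1 - x) / 2 for x in (i / 8 for i in range(1, 8))]
print(max(abs(a - b) for a, b in zip(U, exact)))  # ~0: nodally exact in 1D
```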

125.8 Sparsity

Finite element matrices are sparse.

Each basis function ϕ_i is nonzero only on elements touching node i. Therefore

K_{ij} = 0

unless the supports of ϕ_i and ϕ_j overlap.

ETH notes highlight this point: the integrals defining the system matrix are nonzero only when basis-function supports overlap, so most entries of the matrix are zero.

Sparsity is essential for large-scale computation.

A three-dimensional finite element model may have millions or billions of degrees of freedom. Dense matrix storage would be impossible. Sparse storage and iterative solvers make the computation feasible.

125.9 Element Matrices

The global stiffness matrix is assembled from local element matrices.

For each element e, define a local matrix

K^{(e)}_{ab} = \int_{\Omega_e} \nabla\phi_b^{(e)} \cdot \nabla\phi_a^{(e)} \, dx.

Here Ω_e is the element domain, and a, b are local node indices.

Each local matrix is small. For a linear triangular element, it is 3 × 3. For a linear tetrahedral element, it is 4 × 4.

The local matrix contributes to entries of the global matrix according to the element connectivity.

This assembly process is a structured sum of small matrices into one large sparse matrix.
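For a linear triangle the local matrix can be computed in closed form, since the barycentric gradients are constant (an illustrative sketch; `triangle_stiffness` is a hypothetical helper name):

```python
# Sketch: local stiffness matrix of a linear (P1) triangle.
# The barycentric shape functions have constant gradients, so
# K^(e)_ab = area * (grad(phi_a) . grad(phi_b)).

def triangle_stiffness(p1, p2, p3):
    (x1, y1), (x2, y2), (x3, y3) = p1, p2, p3
    det = (x2 - x1) * (y3 - y1) - (x3 - x1) * (y2 - y1)  # 2 * signed area
    area = abs(det) / 2.0
    # Gradients of the three barycentric coordinate functions
    grads = [((y2 - y3) / det, (x3 - x2) / det),
             ((y3 - y1) / det, (x1 - x3) / det),
             ((y1 - y2) / det, (x2 - x1) / det)]
    return [[area * (ga[0] * gb[0] + ga[1] * gb[1]) for gb in grads]
            for ga in grads]

Ke = triangle_stiffness((0, 0), (1, 0), (0, 1))
for row in Ke:
    print(row)
# Each row sums to zero: constant functions lie in the kernel of the gradient.
```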

125.10 Assembly

Let an element e have local nodes

a = 1, \ldots, m.

Let

I_e(a)

be the global node number corresponding to local node a.

Then assembly adds

K^{(e)}_{ab}

to the global entry

K_{I_e(a),\, I_e(b)}.

Likewise, the element load vector contributes to the global load vector.

In pseudocode:

for each element e:
    compute local matrix Ke
    compute local vector Fe
    for local nodes a, b:
        K[global(a), global(b)] += Ke[a,b]
    for local node a:
        F[global(a)] += Fe[a]

This pattern is central in finite element implementation.
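The loop above can be written out as runnable code for 1D linear elements, with the global matrix stored dense purely for clarity (a sketch, assuming a uniform mesh of [0, 1] and f = 1):

```python
# Sketch of the assembly loop for 1D linear elements on [0, 1].
# Each element of length h contributes the local stiffness matrix
# (1/h) * [[1, -1], [-1, 1]] into the global matrix.

n = 4                              # number of elements
h = 1.0 / n
K = [[0.0] * (n + 1) for _ in range(n + 1)]
F = [0.0] * (n + 1)

for e in range(n):
    conn = [e, e + 1]              # global node numbers of this element
    Ke = [[1 / h, -1 / h], [-1 / h, 1 / h]]
    Fe = [h / 2, h / 2]            # load for f = 1: each node gets half
    for a in range(2):
        for b in range(2):
            K[conn[a]][conn[b]] += Ke[a][b]
        F[conn[a]] += Fe[a]

print(K[1])   # interior row: [-4.0, 8.0, -4.0, 0.0, 0.0]
```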

125.11 Reference Elements

Element integrals are usually computed by mapping from a reference element.

For a triangular element, the reference triangle may be

\hat{T} = \{(\xi,\eta) : \xi \geq 0,\ \eta \geq 0,\ \xi + \eta \leq 1\}.

The reference basis functions for a linear triangle are

\hat{\phi}_1 = 1 - \xi - \eta, \quad \hat{\phi}_2 = \xi, \quad \hat{\phi}_3 = \eta.

Each physical triangle is obtained from the reference triangle by an affine map.

ETH notes give this standard mapping from the reference triangle to a physical triangle using the three reference shape functions.

125.12 Element Mapping

Let the physical triangle have vertices

x_1, \quad x_2, \quad x_3.

The map from reference coordinates to physical coordinates is

x(\xi,\eta) = x_1 \hat{\phi}_1(\xi,\eta) + x_2 \hat{\phi}_2(\xi,\eta) + x_3 \hat{\phi}_3(\xi,\eta).

For an affine triangular element, the Jacobian matrix is constant.

It records how the reference element is stretched, rotated, and sheared to form the physical element.

Integrals transform according to

\int_{\Omega_e} g(x)\,dx = \int_{\hat{\Omega}} g(x(\xi))\,|\det J|\,d\xi.

This change of variables is how one formula on the reference element can be reused for every physical element.
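For an affine triangle the map and the Jacobian follow directly from the vertex coordinates (an illustrative sketch; `affine_map` is a hypothetical helper name):

```python
# Sketch: affine map from the reference triangle to a physical triangle,
# and the change-of-variables factor |det J|.

def affine_map(p1, p2, p3):
    """Return the map x(xi, eta) and |det J| for a physical triangle."""
    (x1, y1), (x2, y2), (x3, y3) = p1, p2, p3
    # Jacobian columns are the edge vectors p2 - p1 and p3 - p1
    J = [[x2 - x1, x3 - x1],
         [y2 - y1, y3 - y1]]
    detJ = J[0][0] * J[1][1] - J[0][1] * J[1][0]

    def x(xi, eta):
        return (x1 + J[0][0] * xi + J[0][1] * eta,
                y1 + J[1][0] * xi + J[1][1] * eta)

    return x, abs(detJ)

x, adetJ = affine_map((0, 0), (2, 0), (0, 3))
print(x(0, 0), x(1, 0), x(0, 1))   # the three physical vertices
print(adetJ / 2)                   # physical area = |det J| * (reference area)
```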

125.13 Shape Functions

Shape functions are local basis functions on an element.

For a linear triangle, the shape functions are the barycentric coordinate functions.

They satisfy:

| Property | Meaning |
| --- | --- |
| ϕ_a(x_b) = δ_ab | Nodal interpolation |
| ϕ_1 + ϕ_2 + ϕ_3 = 1 | Partition of unity |
| Each ϕ_a is linear | First-order approximation |
| Gradients are constant | Simple stiffness integrals |

Higher-order elements use higher-degree polynomials.

For example, quadratic triangular elements have six nodes: three at vertices and three at edge midpoints.

Higher-order elements can improve accuracy when the solution is smooth.

125.14 Mass Matrix

In time-dependent problems, another important matrix appears: the mass matrix.

It is defined by

M_{ij} = \int_{\Omega} \phi_j \phi_i \, dx.

For example, the heat equation

u_t - \Delta u = f

leads to a system of ordinary differential equations:

M\dot{U} + KU = F.

Here M comes from the time derivative term, and K comes from the diffusion term.

The mass matrix is sparse for the same reason as the stiffness matrix: basis functions have local support.
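A small sketch assembles the 1D mass matrix and checks one consequence of the partition of unity: since the basis functions sum to 1, all entries of M sum to the measure of the domain (assuming linear elements on a uniform mesh of [0, 1]):

```python
# Sketch: assemble the 1D mass matrix for linear elements on [0, 1].
# Each element of length h contributes the classic local mass matrix
# (h/6) * [[2, 1], [1, 2]].

n = 4
h = 1.0 / n
M = [[0.0] * (n + 1) for _ in range(n + 1)]
for e in range(n):
    Me = [[2 * h / 6, h / 6], [h / 6, 2 * h / 6]]
    for a in range(2):
        for b in range(2):
            M[e + a][e + b] += Me[a][b]

# sum_ij M_ij = integral over Omega of (sum_i phi_i)(sum_j phi_j) dx
#             = integral of 1 dx = |Omega| = 1
print(sum(sum(row) for row in M))  # 1.0 up to rounding
```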

125.15 Boundary Conditions

Boundary conditions must be incorporated into the algebraic system.

Common types are:

| Type | Mathematical form | FEM treatment |
| --- | --- | --- |
| Dirichlet | u = g on the boundary | Fix degrees of freedom |
| Neumann | ∂u/∂n = h | Add boundary integral |
| Robin | αu + β ∂u/∂n = r | Add matrix and vector terms |

Dirichlet conditions are called essential boundary conditions because they constrain the trial space.

Neumann conditions are called natural boundary conditions because they appear naturally in the weak form after integration by parts.

125.16 Error and Mesh Size

Let h denote the mesh size, often the largest element diameter.

The finite element solution u_h should approach the exact solution u as

h \to 0.

This process is called mesh refinement.

For smooth solutions and suitable polynomial spaces, smaller elements give better approximations.

The error depends on:

| Factor | Effect |
| --- | --- |
| Mesh size | Smaller elements improve resolution |
| Polynomial degree | Higher degree improves local approximation |
| Solution smoothness | Smooth solutions converge faster |
| Element quality | Distorted elements may hurt accuracy |
| Solver accuracy | Algebraic error affects final result |

Finite element analysis studies precise error bounds.

125.17 Energy Interpretation

For many elliptic problems, the weak solution minimizes an energy functional.

For example, the Poisson problem with zero boundary condition corresponds to minimizing

J(v) = \frac12 \int_{\Omega} |\nabla v|^2 \, dx - \int_{\Omega} f v \, dx.

The finite element solution minimizes this energy over the finite element space.

Thus the method is a constrained approximation problem:

u_h = \arg\min_{v_h \in V_h} J(v_h).

This connects finite element methods with projection, least squares, optimization, and variational calculus.
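The discrete minimization can be checked in one line. Writing the energy in matrix form as J(V) = ½ VᵀKV − FᵀV and perturbing the Galerkin solution U by any vector D,

```latex
J(U + D) - J(U)
  = \tfrac12 D^{\mathsf{T}} K D + D^{\mathsf{T}}(KU - F)
  = \tfrac12 D^{\mathsf{T}} K D \ \ge\ 0,
```

since KU = F and K is symmetric positive definite. So U is the minimizer over the finite element space.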

125.18 Linear Elasticity

Finite element methods are widely used in structural mechanics.

In linear elasticity, the unknown is a displacement field

u(x) \in \mathbb{R}^d.

The strain is derived from gradients of u, and the stress is related to strain by material laws.

The finite element system again has the form

KU = F.

Here:

| Object | Meaning |
| --- | --- |
| K | Structural stiffness matrix |
| U | Nodal displacement vector |
| F | Applied force vector |

The entries of K encode element geometry, material stiffness, and basis-function gradients.

125.19 Heat Equation

The heat equation is

u_t - \nabla \cdot (k \nabla u) = f.

After finite element discretization in space, it becomes

M\dot{U} + KU = F.

A time-stepping method then converts this into algebraic systems.

For backward Euler with time step Δt,

(M + \Delta t\, K)\, U^{n+1} = M U^n + \Delta t\, F^{n+1}.

Thus each time step requires solving a sparse linear system.
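One such step can be carried out explicitly for the 1D heat equation (a sketch, assuming linear elements on a uniform mesh of [0, 1], zero Dirichlet boundary values, F = 0, and a small dense solver purely for clarity):

```python
# Sketch: one backward Euler step (M + dt*K) U_new = M U_old + dt*F
# for the 1D heat equation on (0, 1) with u = 0 at both ends,
# linear elements on a uniform mesh, and F = 0 (pure decay).

def gauss_solve(A, b):
    """Solve A x = b by Gaussian elimination (small dense systems only)."""
    n = len(b)
    A = [row[:] for row in A]
    b = b[:]
    for k in range(n):
        for i in range(k + 1, n):
            w = A[i][k] / A[k][k]
            for j in range(k, n):
                A[i][j] -= w * A[k][j]
            b[i] -= w * b[k]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(A[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (b[i] - s) / A[i][i]
    return x

n, dt = 8, 0.01
h = 1.0 / n
m = n - 1                                   # interior nodes
K = [[0.0] * m for _ in range(m)]
M = [[0.0] * m for _ in range(m)]
for i in range(m):
    K[i][i], M[i][i] = 2 / h, 4 * h / 6     # assembled tridiagonal entries
    if i + 1 < m:
        K[i][i + 1] = K[i + 1][i] = -1 / h
        M[i][i + 1] = M[i + 1][i] = h / 6

U = [1.0] * m                               # initial data at interior nodes
A = [[M[i][j] + dt * K[i][j] for j in range(m)] for i in range(m)]
rhs = [sum(M[i][j] * U[j] for j in range(m)) for i in range(m)]
U_new = gauss_solve(A, rhs)
print(max(U_new) < max(U))                  # True: the heat decays
```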

125.20 Eigenvalue Problems

Finite element methods also approximate eigenvalue problems.

For example, vibration modes may satisfy

Ku = \lambda M u.

Here K is a stiffness matrix and M is a mass matrix.

The eigenvectors represent mode shapes. The eigenvalues relate to natural frequencies.

This generalized eigenvalue problem appears in structural dynamics, acoustics, quantum mechanics, and wave propagation.

It is a finite-dimensional approximation of an operator eigenvalue problem.

125.21 Nonlinear Problems

Many physical models are nonlinear.

A nonlinear finite element problem has the form

R(U) = 0,

where R is a residual vector.

Newton’s method linearizes the problem at each iteration:

J(U_k)\,\Delta U = -R(U_k), \quad U_{k+1} = U_k + \Delta U.

Here J(U_k) is the Jacobian matrix of the residual.

Thus nonlinear finite element methods produce a sequence of sparse linear systems.

Linear algebra remains central even when the original model is nonlinear.
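The iteration can be sketched on a toy two-unknown residual standing in for a nonlinear FEM residual (the specific R below is invented for illustration):

```python
# Sketch: Newton's method on a small nonlinear residual R(U) = 0.
# A toy 2-unknown system stands in for a nonlinear FEM residual.

def R(U):
    u, v = U
    return [u**3 + v - 1.0, u + v**3 - 1.0]

def J(U):
    """Analytic Jacobian of R."""
    u, v = U
    return [[3 * u**2, 1.0], [1.0, 3 * v**2]]

def newton(U, tol=1e-12, maxit=50):
    for _ in range(maxit):
        r = R(U)
        if max(abs(c) for c in r) < tol:
            break
        (a, b), (c, d) = J(U)
        det = a * d - b * c
        # Solve J dU = -R by Cramer's rule (2x2 system)
        dU = [(-r[0] * d + r[1] * b) / det,
              (-r[1] * a + r[0] * c) / det]
        U = [U[0] + dU[0], U[1] + dU[1]]
    return U

U = newton([1.0, 1.0])
print(R(U))   # both components below the 1e-12 tolerance
```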

125.22 Conditioning

Finite element matrices can be ill-conditioned.

The condition number often grows as the mesh is refined.

For elliptic problems, a typical behavior is

\kappa(K) \sim h^{-2},

for standard discretizations.
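For the 1D stiffness matrix the eigenvalues are known in closed form, so this growth can be checked directly (a sketch using the standard eigenvalue formula for the tridiag(−1, 2, −1) pattern):

```python
import math

# Sketch: condition number growth for the 1D stiffness matrix
# K = (1/h) * tridiag(-1, 2, -1) with h = 1/n, using its known
# eigenvalues lambda_k = (4/h) * sin^2(k*pi/(2n)), k = 1..n-1.

def cond_1d_stiffness(n):
    lam = [(4 * n) * math.sin(k * math.pi / (2 * n)) ** 2
           for k in range(1, n)]
    return max(lam) / min(lam)

for n in (16, 32, 64):
    print(n, round(cond_1d_stiffness(n)))
# Halving h roughly quadruples kappa(K): the h^-2 growth stated above.
```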

Poor conditioning slows iterative solvers and amplifies numerical error.

Preconditioning is therefore essential in large-scale finite element computation.

Common preconditioners include incomplete factorizations, multigrid methods, domain decomposition, and algebraic multigrid.

125.23 Iterative Solvers

Large finite element systems are usually solved by iterative methods.

For symmetric positive definite systems, the conjugate gradient method is common.

For nonsymmetric systems, GMRES or BiCGSTAB may be used.

The choice depends on the matrix structure.

| Matrix property | Common solver |
| --- | --- |
| Symmetric positive definite | Conjugate gradient |
| Symmetric indefinite | MINRES |
| Nonsymmetric | GMRES |
| Saddle point | Block preconditioned Krylov methods |
| Very large elliptic | Multigrid |

Sparse matrix-vector multiplication is the main operation in many of these methods.
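A sketch of conjugate gradient on the 1D stiffness matrix makes this concrete: the matrix is never stored, only its matrix-vector product (illustrative code, assuming the 1D Poisson model problem with f = 1):

```python
# Sketch: conjugate gradient for K U = F with the 1D stiffness matrix,
# applied matrix-free -- only the sparse matrix-vector product is needed.

def matvec(u, h):
    """y = K u for K = (1/h) * tridiag(-1, 2, -1)."""
    m = len(u)
    y = [0.0] * m
    for i in range(m):
        y[i] = 2 * u[i] / h
        if i > 0:
            y[i] -= u[i - 1] / h
        if i + 1 < m:
            y[i] -= u[i + 1] / h
    return y

def cg(F, h, tol=1e-12, maxit=1000):
    m = len(F)
    U = [0.0] * m
    r = F[:]                         # residual F - K U with U = 0
    p = r[:]
    rs = sum(c * c for c in r)
    for _ in range(maxit):
        Kp = matvec(p, h)
        alpha = rs / sum(pi * ki for pi, ki in zip(p, Kp))
        U = [ui + alpha * pi for ui, pi in zip(U, p)]
        r = [ri - alpha * ki for ri, ki in zip(r, Kp)]
        rs_new = sum(c * c for c in r)
        if rs_new < tol ** 2:
            break
        p = [ri + (rs_new / rs) * pi for ri, pi in zip(r, p)]
        rs = rs_new
    return U

n = 16
h = 1.0 / n
U = cg([h] * (n - 1), h)             # load vector for f = 1
# Compare with the exact solution u(x) = x(1 - x)/2 at the nodes
print(max(abs(U[i] - (i + 1) * h * (1 - (i + 1) * h) / 2)
          for i in range(n - 1)))    # tiny: exact up to solver tolerance
```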

125.24 Adaptivity

Adaptive finite element methods refine the mesh where more accuracy is needed.

The usual loop is:

| Step | Operation |
| --- | --- |
| Solve | Compute u_h |
| Estimate | Estimate local error |
| Mark | Choose elements to refine |
| Refine | Improve the mesh |
| Repeat | Solve again |

Adaptivity is useful when the solution has local features such as singularities, boundary layers, shocks, cracks, or sharp gradients.

It puts computational effort where it matters most.

125.25 Finite Element Methods and Linear Algebra

The dictionary is direct.

| FEM concept | Linear algebra object |
| --- | --- |
| Mesh node | Degree of freedom |
| Basis function | Coordinate basis vector in function space |
| Approximate solution | Coefficient vector |
| Weak form | Bilinear form equation |
| Stiffness matrix | Matrix of bilinear form |
| Load vector | Vector of linear functional values |
| Assembly | Sparse matrix accumulation |
| Boundary condition | Constraint on unknowns |
| Time-dependent problem | Matrix ODE |
| Vibration problem | Generalized eigenvalue problem |
| Nonlinear FEM | Newton linear systems |
| Error reduction | Projection and refinement |

Finite element methods are finite-dimensional linear algebra applied to function spaces and geometry.

125.26 Summary

Finite element methods approximate differential equations by replacing an infinite-dimensional problem with a finite-dimensional one.

The process begins with a weak form:

a(u,v) = \ell(v).

A finite element space V_h is chosen using a mesh and local polynomial basis functions. The approximate solution is written as

u_h = \sum_{j=1}^N U_j \phi_j.

Testing against each basis function produces the linear system

KU = F.

The stiffness matrix and load vector are computed element by element and assembled into global sparse arrays.

The central principle is localization. A complicated global problem is built from simple local polynomial approximations on small elements. Linear algebra then combines these local contributions into a sparse system that can be solved computationally.