
Chapter 125. Finite Element Methods

Finite element methods are numerical methods for approximating solutions of differential equations.

They are used when an exact solution is unavailable, when the domain has a complicated shape, or when the coefficients and boundary conditions make analytic methods impractical. Finite element methods appear in structural mechanics, heat transfer, electromagnetics, fluid flow, acoustics, geophysics, biomechanics, and scientific computing.

Linear algebra enters because the method turns a differential equation into a finite system of algebraic equations. The final computational problem usually has the form

KU = F.

Here K is a sparse matrix, U is a vector of unknown coefficients, and F is a vector of known data. The matrix is assembled from many small local element matrices. This assembly process is one of the defining features of the finite element method. Stanford notes describe the weak formulation as a variational boundary-value problem, and ETH notes emphasize elementwise integration and assembly into sparse linear systems.

125.1 Boundary Value Problems

A boundary value problem asks for an unknown function satisfying a differential equation inside a domain and boundary conditions on the boundary.

Let

\Omega \subset \mathbb{R}^d

be a domain.

A simple model problem is the Poisson equation:

-\Delta u = f \quad \text{in } \Omega,

with boundary condition

u = 0 \quad \text{on } \partial\Omega.

Here:

| Symbol | Meaning |
| --- | --- |
| Ω | Physical domain |
| ∂Ω | Boundary of the domain |
| u | Unknown function |
| f | Given source term |
| Δ | Laplace operator |

The goal is to approximate u.

125.2 Classical and Weak Forms

A classical solution must have enough derivatives for the differential equation to hold pointwise.

For the Poisson equation,

-\Delta u = f,

a classical solution requires second derivatives.

Finite element methods usually use a weak form instead. The weak form lowers differentiability requirements and expresses the equation through integrals.

Multiply the equation by a test function v, integrate over Ω, and use integration by parts:

\int_{\Omega} \nabla u \cdot \nabla v \, dx = \int_{\Omega} f v \, dx.

This is the weak form.

The unknown u and the test function v belong to a suitable function space. ETH notes point out that a weak solution may lack the smoothness needed to be classical, while every classical solution also satisfies the weak form.
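Spelled out, the integration-by-parts step reads:

```latex
\int_{\Omega} (-\Delta u)\, v \, dx
  = \int_{\Omega} \nabla u \cdot \nabla v \, dx
  - \int_{\partial\Omega} \frac{\partial u}{\partial n}\, v \, ds,
```

and the boundary integral vanishes because the test function v is zero on ∂Ω, which leaves exactly the weak form.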

125.3 Trial and Test Spaces

The weak form is written as:

Find u ∈ V such that

a(u,v) = \ell(v) \quad \text{for all } v \in V.

Here

a(u,v) = \int_{\Omega} \nabla u \cdot \nabla v \, dx

is a bilinear form, and

\ell(v) = \int_{\Omega} f v \, dx

is a linear functional.

The space V contains functions satisfying the essential boundary conditions. For the zero boundary condition u = 0 on ∂Ω, a common choice is

V = H_0^1(\Omega).

The finite element method replaces V by a finite-dimensional subspace.

125.4 Discretization

The domain Ω is divided into small pieces called elements.

In one dimension, elements are intervals.

In two dimensions, common elements are triangles and quadrilaterals.

In three dimensions, common elements are tetrahedra and hexahedra.

This division is called a mesh.

A mesh has:

| Object | Meaning |
| --- | --- |
| Nodes | Points where degrees of freedom are placed |
| Elements | Small subdomains |
| Edges | One-dimensional element boundaries |
| Faces | Two-dimensional element boundaries |
| Connectivity | Which nodes belong to which elements |

The mesh converts a continuous geometric problem into a finite combinatorial structure.

125.5 Finite Element Space

A finite element space is a finite-dimensional space of functions built from local polynomial pieces.

For example, on a triangular mesh, one may use continuous piecewise linear functions.

Each basis function ϕ_i is associated with a node i.

It satisfies

\phi_i(x_j) = \delta_{ij}.

Thus ϕ_i equals 1 at node i and 0 at every other node.

The finite element approximation has the form

u_h(x) = \sum_{j=1}^N U_j \phi_j(x).

The unknowns are the coefficients

U_1, \ldots, U_N.

These coefficients often represent the approximate values of the solution at mesh nodes.
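A minimal Python sketch makes the nodal property concrete for the 1D case, assuming a uniform mesh of [0, 1] (the helper name `hat` is illustrative, not a standard API):

```python
# Piecewise linear "hat" basis functions on a uniform 1D mesh of [0, 1].
# Node i sits at x_i = i*h; phi_i is 1 at x_i, 0 at every other node,
# and linear on each element.

def hat(i, x, n):
    """Value of basis function phi_i at point x, with n+1 nodes on [0, 1]."""
    h = 1.0 / n
    xi = i * h
    return max(0.0, 1.0 - abs(x - xi) / h)

n = 4                                    # 4 elements, 5 nodes
nodes = [j / n for j in range(n + 1)]

# Nodal interpolation property: phi_i(x_j) = delta_ij
print([hat(2, xj, n) for xj in nodes])   # [0.0, 0.0, 1.0, 0.0, 0.0]

# Partition of unity: the basis functions sum to 1 everywhere
print(sum(hat(i, 0.3, n) for i in range(n + 1)))  # 1.0 up to rounding
```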

125.6 Galerkin Method

The Galerkin method uses the same finite-dimensional space for trial and test functions.

Substitute

u_h = \sum_{j=1}^N U_j \phi_j

into the weak form and require

a(u_h, \phi_i) = \ell(\phi_i)

for every basis function ϕ_i.

Then

a\left(\sum_{j=1}^N U_j \phi_j, \phi_i\right) = \ell(\phi_i).

By linearity,

\sum_{j=1}^N U_j \, a(\phi_j, \phi_i) = \ell(\phi_i).

This is a linear system.

125.7 Stiffness Matrix

Define the stiffness matrix K by

K_{ij} = a(\phi_j, \phi_i).

For the Poisson problem,

K_{ij} = \int_{\Omega} \nabla\phi_j \cdot \nabla\phi_i \, dx.

Define the load vector F by

F_i = \ell(\phi_i) = \int_{\Omega} f \phi_i \, dx.

Then the finite element equations are

KU = F.

This is the main linear algebraic form of the finite element method.
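The whole pipeline can be sketched for the 1D model problem −u″ = 1 on (0, 1) with zero boundary values, where the stiffness matrix is tridiagonal and the system can be solved directly (an illustrative sketch, assuming exact integration of the load; in 1D with exact integration the nodal values come out exact):

```python
# Sketch: assemble and solve KU = F for the 1D Poisson problem
# -u'' = 1 on (0, 1) with u(0) = u(1) = 0, using linear elements
# on a uniform mesh. Exact solution: u(x) = x(1 - x)/2.

def solve_poisson_1d(n):
    """Return interior nodal values U_1..U_{n-1} on a mesh with n elements."""
    h = 1.0 / n
    m = n - 1                      # number of interior nodes
    # Stiffness matrix: K_ii = 2/h, K_{i,i+1} = K_{i+1,i} = -1/h
    main = [2.0 / h] * m
    off = [-1.0 / h] * (m - 1)
    # Load vector for f = 1: F_i = integral of phi_i = h
    F = [h] * m

    # Thomas algorithm for the symmetric tridiagonal system
    for i in range(1, m):
        w = off[i - 1] / main[i - 1]
        main[i] -= w * off[i - 1]
        F[i] -= w * F[i - 1]
    U = [0.0] * m
    U[-1] = F[-1] / main[-1]
    for i in range(m - 2, -1, -1):
        U[i] = (F[i] - off[i] * U[i + 1]) / main[i]
    return U

U = solve_poisson_1d(8)
exact = [x * (1 - x) / 2 for x in (i / 8 for i in range(1, 8))]
print(max(abs(a - b) for a, b in zip(U, exact)))  # ~0: nodally exact in 1D
```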

125.8 Sparsity

Finite element matrices are sparse.

Each basis function ϕ_i is nonzero only on elements touching node i. Therefore

K_{ij} = 0

unless the supports of ϕ_i and ϕ_j overlap.

ETH notes highlight this point: the integrals defining the system matrix are nonzero only when basis-function supports overlap, so most entries of the matrix are zero.

Sparsity is essential for large-scale computation.

A three-dimensional finite element model may have millions or billions of degrees of freedom. Dense matrix storage would be impossible. Sparse storage and iterative solvers make the computation feasible.

125.9 Element Matrices

The global stiffness matrix is assembled from local element matrices.

For each element e, define a local matrix

K^{(e)}_{ab} = \int_{\Omega_e} \nabla\phi_b^{(e)} \cdot \nabla\phi_a^{(e)} \, dx.

Here Ω_e is the element domain, and a, b are local node indices.

Each local matrix is small. For a linear triangular element, it is 3 × 3. For a linear tetrahedral element, it is 4 × 4.

The local matrix contributes to entries of the global matrix according to the element connectivity.

This assembly process is a structured sum of small matrices into one large sparse matrix.
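For a linear triangle the local matrix can be computed in closed form, since the barycentric gradients are constant (an illustrative sketch; `triangle_stiffness` is a hypothetical helper name):

```python
# Sketch: local stiffness matrix of a linear (P1) triangle.
# The barycentric shape functions have constant gradients, so
# K^(e)_ab = area * (grad(phi_a) . grad(phi_b)).

def triangle_stiffness(p1, p2, p3):
    (x1, y1), (x2, y2), (x3, y3) = p1, p2, p3
    det = (x2 - x1) * (y3 - y1) - (x3 - x1) * (y2 - y1)  # 2 * signed area
    area = abs(det) / 2.0
    # Gradients of the three barycentric coordinate functions
    grads = [((y2 - y3) / det, (x3 - x2) / det),
             ((y3 - y1) / det, (x1 - x3) / det),
             ((y1 - y2) / det, (x2 - x1) / det)]
    return [[area * (ga[0] * gb[0] + ga[1] * gb[1]) for gb in grads]
            for ga in grads]

Ke = triangle_stiffness((0, 0), (1, 0), (0, 1))
for row in Ke:
    print(row)
# Each row sums to zero: constant functions lie in the kernel of the gradient.
```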

125.10 Assembly

Let an element e have local nodes

a = 1, \ldots, m.

Let

I_e(a)

be the global node number corresponding to local node a.

Then assembly adds

K^{(e)}_{ab}

to the global entry

K_{I_e(a),\, I_e(b)}.

Likewise, the element load vector contributes to the global load vector.

In pseudocode:

for each element e:
    compute local matrix Ke
    compute local vector Fe
    for local nodes a, b:
        K[global(a), global(b)] += Ke[a,b]
    for local node a:
        F[global(a)] += Fe[a]

This pattern is central in finite element implementation.
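The loop above can be written out as runnable code for 1D linear elements, with the global matrix stored dense purely for clarity (a sketch, assuming a uniform mesh of [0, 1] and f = 1):

```python
# Sketch of the assembly loop for 1D linear elements on [0, 1].
# Each element of length h contributes the local stiffness matrix
# (1/h) * [[1, -1], [-1, 1]] into the global matrix.

n = 4                              # number of elements
h = 1.0 / n
K = [[0.0] * (n + 1) for _ in range(n + 1)]
F = [0.0] * (n + 1)

for e in range(n):
    conn = [e, e + 1]              # global node numbers of this element
    Ke = [[1 / h, -1 / h], [-1 / h, 1 / h]]
    Fe = [h / 2, h / 2]            # load for f = 1: each node gets half
    for a in range(2):
        for b in range(2):
            K[conn[a]][conn[b]] += Ke[a][b]
        F[conn[a]] += Fe[a]

print(K[1])   # interior row: [-4.0, 8.0, -4.0, 0.0, 0.0]
```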

125.11 Reference Elements

Element integrals are usually computed by mapping from a reference element.

For a triangular element, the reference triangle may be

\hat{T} = \{(\xi,\eta) : \xi \geq 0,\ \eta \geq 0,\ \xi + \eta \leq 1\}.

The reference basis functions for a linear triangle are

\hat{\phi}_1 = 1 - \xi - \eta, \quad \hat{\phi}_2 = \xi, \quad \hat{\phi}_3 = \eta.

Each physical triangle is obtained from the reference triangle by an affine map.

ETH notes give this standard mapping from the reference triangle to a physical triangle using the three reference shape functions.

125.12 Element Mapping

Let the physical triangle have vertices

x_1, \quad x_2, \quad x_3.

The map from reference coordinates to physical coordinates is

x(\xi,\eta) = x_1 \hat{\phi}_1(\xi,\eta) + x_2 \hat{\phi}_2(\xi,\eta) + x_3 \hat{\phi}_3(\xi,\eta).

For an affine triangular element, the Jacobian matrix is constant.

It records how the reference element is stretched, rotated, and sheared to form the physical element.

Integrals transform according to

\int_{\Omega_e} g(x)\,dx = \int_{\hat{\Omega}} g(x(\xi))\,|\det J|\,d\xi.

This change of variables is how one formula on the reference element can be reused for every physical element.
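For an affine triangle the map and the Jacobian follow directly from the vertex coordinates (an illustrative sketch; `affine_map` is a hypothetical helper name):

```python
# Sketch: affine map from the reference triangle to a physical triangle,
# and the change-of-variables factor |det J|.

def affine_map(p1, p2, p3):
    """Return the map x(xi, eta) and |det J| for a physical triangle."""
    (x1, y1), (x2, y2), (x3, y3) = p1, p2, p3
    # Jacobian columns are the edge vectors p2 - p1 and p3 - p1
    J = [[x2 - x1, x3 - x1],
         [y2 - y1, y3 - y1]]
    detJ = J[0][0] * J[1][1] - J[0][1] * J[1][0]

    def x(xi, eta):
        return (x1 + J[0][0] * xi + J[0][1] * eta,
                y1 + J[1][0] * xi + J[1][1] * eta)

    return x, abs(detJ)

x, adetJ = affine_map((0, 0), (2, 0), (0, 3))
print(x(0, 0), x(1, 0), x(0, 1))   # the three physical vertices
print(adetJ / 2)                   # physical area = |det J| * (reference area)
```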

125.13 Shape Functions

Shape functions are local basis functions on an element.

For a linear triangle, the shape functions are the barycentric coordinate functions.

They satisfy:

| Property | Meaning |
| --- | --- |
| ϕ_a(x_b) = δ_ab | Nodal interpolation |
| ϕ_1 + ϕ_2 + ϕ_3 = 1 | Partition of unity |
| Each ϕ_a is linear | First-order approximation |
| Gradients are constant | Simple stiffness integrals |

Higher-order elements use higher-degree polynomials.

For example, quadratic triangular elements have six nodes: three at vertices and three at edge midpoints.

Higher-order elements can improve accuracy when the solution is smooth.

125.14 Mass Matrix

In time-dependent problems, another important matrix appears: the mass matrix.

It is defined by

M_{ij} = \int_{\Omega} \phi_j \phi_i \, dx.

For example, the heat equation

u_t - \Delta u = f

leads to a system of ordinary differential equations:

M\dot{U} + KU = F.

Here M comes from the time derivative term, and K comes from the diffusion term.

The mass matrix is sparse for the same reason as the stiffness matrix: basis functions have local support.
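A small sketch assembles the 1D mass matrix and checks one consequence of the partition of unity: since the basis functions sum to 1, all entries of M sum to the measure of the domain (assuming linear elements on a uniform mesh of [0, 1]):

```python
# Sketch: assemble the 1D mass matrix for linear elements on [0, 1].
# Each element of length h contributes the classic local mass matrix
# (h/6) * [[2, 1], [1, 2]].

n = 4
h = 1.0 / n
M = [[0.0] * (n + 1) for _ in range(n + 1)]
for e in range(n):
    Me = [[2 * h / 6, h / 6], [h / 6, 2 * h / 6]]
    for a in range(2):
        for b in range(2):
            M[e + a][e + b] += Me[a][b]

# sum_ij M_ij = integral over Omega of (sum_i phi_i)(sum_j phi_j) dx
#             = integral of 1 dx = |Omega| = 1
print(sum(sum(row) for row in M))  # 1.0 up to rounding
```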

125.15 Boundary Conditions

Boundary conditions must be incorporated into the algebraic system.

Common types are:

| Type | Mathematical form | FEM treatment |
| --- | --- | --- |
| Dirichlet | u = g on the boundary | Fix degrees of freedom |
| Neumann | ∂u/∂n = h | Add boundary integral |
| Robin | αu + β ∂u/∂n = r | Add matrix and vector terms |

Dirichlet conditions are called essential boundary conditions because they constrain the trial space.

Neumann conditions are called natural boundary conditions because they appear naturally in the weak form after integration by parts.

125.16 Error and Mesh Size

Let h denote the mesh size, often the largest element diameter.

The finite element solution u_h should approach the exact solution u as

h \to 0.

This process is called mesh refinement.

For smooth solutions and suitable polynomial spaces, smaller elements give better approximations.

The error depends on:

| Factor | Effect |
| --- | --- |
| Mesh size | Smaller elements improve resolution |
| Polynomial degree | Higher degree improves local approximation |
| Solution smoothness | Smooth solutions converge faster |
| Element quality | Distorted elements may hurt accuracy |
| Solver accuracy | Algebraic error affects final result |

Finite element analysis studies precise error bounds.

125.17 Energy Interpretation

For many elliptic problems, the weak solution minimizes an energy functional.

For example, the Poisson problem with zero boundary condition corresponds to minimizing

J(v) = \frac12 \int_{\Omega} |\nabla v|^2 \, dx - \int_{\Omega} f v \, dx.

The finite element solution minimizes this energy over the finite element space.

Thus the method is a constrained approximation problem:

u_h = \arg\min_{v_h \in V_h} J(v_h).

This connects finite element methods with projection, least squares, optimization, and variational calculus.
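The discrete minimization can be checked in one line. Writing the energy in matrix form as J(V) = ½ VᵀKV − FᵀV and perturbing the Galerkin solution U by any vector D,

```latex
J(U + D) - J(U)
  = \tfrac12 D^{\mathsf{T}} K D + D^{\mathsf{T}}(KU - F)
  = \tfrac12 D^{\mathsf{T}} K D \ \ge\ 0,
```

since KU = F and K is symmetric positive definite. So U is the minimizer over the finite element space.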

125.18 Linear Elasticity

Finite element methods are widely used in structural mechanics.

In linear elasticity, the unknown is a displacement field

u(x) \in \mathbb{R}^d.

The strain is derived from gradients of u, and the stress is related to strain by material laws.

The finite element system again has the form

KU = F.

Here:

| Object | Meaning |
| --- | --- |
| K | Structural stiffness matrix |
| U | Nodal displacement vector |
| F | Applied force vector |

The entries of K encode element geometry, material stiffness, and basis-function gradients.

125.19 Heat Equation

The heat equation is

u_t - \nabla \cdot (k \nabla u) = f.

After finite element discretization in space, it becomes

M\dot{U} + KU = F.

A time-stepping method then converts this into algebraic systems.

For backward Euler with time step Δt,

(M + \Delta t\, K)\, U^{n+1} = M U^n + \Delta t\, F^{n+1}.

Thus each time step requires solving a sparse linear system.
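One such step can be carried out explicitly for the 1D heat equation (a sketch, assuming linear elements on a uniform mesh of [0, 1], zero Dirichlet boundary values, F = 0, and a small dense solver purely for clarity):

```python
# Sketch: one backward Euler step (M + dt*K) U_new = M U_old + dt*F
# for the 1D heat equation on (0, 1) with u = 0 at both ends,
# linear elements on a uniform mesh, and F = 0 (pure decay).

def gauss_solve(A, b):
    """Solve A x = b by Gaussian elimination (small dense systems only)."""
    n = len(b)
    A = [row[:] for row in A]
    b = b[:]
    for k in range(n):
        for i in range(k + 1, n):
            w = A[i][k] / A[k][k]
            for j in range(k, n):
                A[i][j] -= w * A[k][j]
            b[i] -= w * b[k]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(A[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (b[i] - s) / A[i][i]
    return x

n, dt = 8, 0.01
h = 1.0 / n
m = n - 1                                   # interior nodes
K = [[0.0] * m for _ in range(m)]
M = [[0.0] * m for _ in range(m)]
for i in range(m):
    K[i][i], M[i][i] = 2 / h, 4 * h / 6     # assembled tridiagonal entries
    if i + 1 < m:
        K[i][i + 1] = K[i + 1][i] = -1 / h
        M[i][i + 1] = M[i + 1][i] = h / 6

U = [1.0] * m                               # initial data at interior nodes
A = [[M[i][j] + dt * K[i][j] for j in range(m)] for i in range(m)]
rhs = [sum(M[i][j] * U[j] for j in range(m)) for i in range(m)]
U_new = gauss_solve(A, rhs)
print(max(U_new) < max(U))                  # True: the heat decays
```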

125.20 Eigenvalue Problems

Finite element methods also approximate eigenvalue problems.

For example, vibration modes may satisfy

Ku = \lambda M u.

Here K is a stiffness matrix and M is a mass matrix.

The eigenvectors represent mode shapes. The eigenvalues relate to natural frequencies.

This generalized eigenvalue problem appears in structural dynamics, acoustics, quantum mechanics, and wave propagation.

It is a finite-dimensional approximation of an operator eigenvalue problem.

125.21 Nonlinear Problems

Many physical models are nonlinear.

A nonlinear finite element problem has the form

R(U) = 0,

where R is a residual vector.

Newton’s method linearizes the problem at each iteration:

J(U_k)\,\Delta U = -R(U_k), \quad U_{k+1} = U_k + \Delta U.

Here J(U_k) is the Jacobian matrix of the residual.

Thus nonlinear finite element methods produce a sequence of sparse linear systems.

Linear algebra remains central even when the original model is nonlinear.
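The iteration can be sketched on a toy two-unknown residual standing in for a nonlinear FEM residual (the specific R below is invented for illustration):

```python
# Sketch: Newton's method on a small nonlinear residual R(U) = 0.
# A toy 2-unknown system stands in for a nonlinear FEM residual.

def R(U):
    u, v = U
    return [u**3 + v - 1.0, u + v**3 - 1.0]

def J(U):
    """Analytic Jacobian of R."""
    u, v = U
    return [[3 * u**2, 1.0], [1.0, 3 * v**2]]

def newton(U, tol=1e-12, maxit=50):
    for _ in range(maxit):
        r = R(U)
        if max(abs(c) for c in r) < tol:
            break
        (a, b), (c, d) = J(U)
        det = a * d - b * c
        # Solve J dU = -R by Cramer's rule (2x2 system)
        dU = [(-r[0] * d + r[1] * b) / det,
              (-r[1] * a + r[0] * c) / det]
        U = [U[0] + dU[0], U[1] + dU[1]]
    return U

U = newton([1.0, 1.0])
print(R(U))   # both components below the 1e-12 tolerance
```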

125.22 Conditioning

Finite element matrices can be ill-conditioned.

The condition number often grows as the mesh is refined.

For elliptic problems, a typical behavior is

\kappa(K) \sim h^{-2},

for standard discretizations.
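For the 1D stiffness matrix the eigenvalues are known in closed form, so this growth can be checked directly (a sketch using the standard eigenvalue formula for the tridiag(−1, 2, −1) pattern):

```python
import math

# Sketch: condition number growth for the 1D stiffness matrix
# K = (1/h) * tridiag(-1, 2, -1) with h = 1/n, using its known
# eigenvalues lambda_k = (4/h) * sin^2(k*pi/(2n)), k = 1..n-1.

def cond_1d_stiffness(n):
    lam = [(4 * n) * math.sin(k * math.pi / (2 * n)) ** 2
           for k in range(1, n)]
    return max(lam) / min(lam)

for n in (16, 32, 64):
    print(n, round(cond_1d_stiffness(n)))
# Halving h roughly quadruples kappa(K): the h^-2 growth stated above.
```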

Poor conditioning slows iterative solvers and amplifies numerical error.

Preconditioning is therefore essential in large-scale finite element computation.

Common preconditioners include incomplete factorizations, multigrid methods, domain decomposition, and algebraic multigrid.

125.23 Iterative Solvers

Large finite element systems are usually solved by iterative methods.

For symmetric positive definite systems, the conjugate gradient method is common.

For nonsymmetric systems, GMRES or BiCGSTAB may be used.

The choice depends on the matrix structure.

| Matrix property | Common solver |
| --- | --- |
| Symmetric positive definite | Conjugate gradient |
| Symmetric indefinite | MINRES |
| Nonsymmetric | GMRES |
| Saddle point | Block preconditioned Krylov methods |
| Very large elliptic | Multigrid |

Sparse matrix-vector multiplication is the main operation in many of these methods.
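A sketch of conjugate gradient on the 1D stiffness matrix makes this concrete: the matrix is never stored, only its matrix-vector product (illustrative code, assuming the 1D Poisson model problem with f = 1):

```python
# Sketch: conjugate gradient for K U = F with the 1D stiffness matrix,
# applied matrix-free -- only the sparse matrix-vector product is needed.

def matvec(u, h):
    """y = K u for K = (1/h) * tridiag(-1, 2, -1)."""
    m = len(u)
    y = [0.0] * m
    for i in range(m):
        y[i] = 2 * u[i] / h
        if i > 0:
            y[i] -= u[i - 1] / h
        if i + 1 < m:
            y[i] -= u[i + 1] / h
    return y

def cg(F, h, tol=1e-12, maxit=1000):
    m = len(F)
    U = [0.0] * m
    r = F[:]                         # residual F - K U with U = 0
    p = r[:]
    rs = sum(c * c for c in r)
    for _ in range(maxit):
        Kp = matvec(p, h)
        alpha = rs / sum(pi * ki for pi, ki in zip(p, Kp))
        U = [ui + alpha * pi for ui, pi in zip(U, p)]
        r = [ri - alpha * ki for ri, ki in zip(r, Kp)]
        rs_new = sum(c * c for c in r)
        if rs_new < tol ** 2:
            break
        p = [ri + (rs_new / rs) * pi for ri, pi in zip(r, p)]
        rs = rs_new
    return U

n = 16
h = 1.0 / n
U = cg([h] * (n - 1), h)             # load vector for f = 1
# Compare with the exact solution u(x) = x(1 - x)/2 at the nodes
print(max(abs(U[i] - (i + 1) * h * (1 - (i + 1) * h) / 2)
          for i in range(n - 1)))    # tiny: exact up to solver tolerance
```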

125.24 Adaptivity

Adaptive finite element methods refine the mesh where more accuracy is needed.

The usual loop is:

| Step | Operation |
| --- | --- |
| Solve | Compute u_h |
| Estimate | Estimate local error |
| Mark | Choose elements to refine |
| Refine | Improve the mesh |
| Repeat | Solve again |

Adaptivity is useful when the solution has local features such as singularities, boundary layers, shocks, cracks, or sharp gradients.

It puts computational effort where it matters most.

125.25 Finite Element Methods and Linear Algebra

The dictionary is direct.

| FEM concept | Linear algebra object |
| --- | --- |
| Mesh node | Degree of freedom |
| Basis function | Coordinate basis vector in function space |
| Approximate solution | Coefficient vector |
| Weak form | Bilinear form equation |
| Stiffness matrix | Matrix of bilinear form |
| Load vector | Vector of linear functional values |
| Assembly | Sparse matrix accumulation |
| Boundary condition | Constraint on unknowns |
| Time-dependent problem | Matrix ODE |
| Vibration problem | Generalized eigenvalue problem |
| Nonlinear FEM | Newton linear systems |
| Error reduction | Projection and refinement |

Finite element methods are finite-dimensional linear algebra applied to function spaces and geometry.

125.26 Summary

Finite element methods approximate differential equations by replacing an infinite-dimensional problem with a finite-dimensional one.

The process begins with a weak form:

a(u,v) = \ell(v).

A finite element space V_h is chosen using a mesh and local polynomial basis functions. The approximate solution is written as

u_h = \sum_{j=1}^N U_j \phi_j.

Testing against each basis function produces the linear system

KU = F.

The stiffness matrix and load vector are computed element by element and assembled into global sparse arrays.

The central principle is localization. A complicated global problem is built from simple local polynomial approximations on small elements. Linear algebra then combines these local contributions into a sparse system that can be solved computationally.