Computational Fluid Dynamics

Computational fluid dynamics studies fluid motion by solving discretized forms of the governing equations. Automatic differentiation enters CFD when we want gradients of simulation outputs with respect to geometry, boundary conditions, material parameters, control inputs, or model parameters.

A CFD solver is usually a large numerical program:

$$ \text{geometry} \to \text{mesh} \to \text{discrete equations} \to \text{linear and nonlinear solves} \to \text{flow field} \to \text{objective}. $$

The objective may be drag, lift, pressure loss, heat transfer, acoustic noise, mixing efficiency, or deviation from measured data. AD turns this pipeline into a differentiable map.

Governing Equations

For many engineering flows, the starting point is the Navier-Stokes equations. In incompressible form,

$$ \nabla \cdot u = 0, \qquad \rho \left( \frac{\partial u}{\partial t} + u \cdot \nabla u \right) = -\nabla p + \mu \Delta u + f. $$

Here $u$ is velocity, $p$ is pressure, $\rho$ is density, $\mu$ is dynamic viscosity, and $f$ is an external force.

A numerical solver discretizes these equations in space and time. After discretization, the continuous PDE becomes a finite-dimensional system such as

$$ R(U,\theta) = 0, $$

where $U$ is the discrete flow state and $\theta$ contains parameters such as geometry, boundary values, viscosity, or turbulence model coefficients.

Differentiating a CFD Solver

A steady-state CFD objective can be written as

$$ L = \ell(U,\theta), \qquad R(U,\theta) = 0. $$

Direct differentiation gives

$$ R_U \frac{dU}{d\theta} + R_\theta = 0. $$

Therefore,

$$ \frac{dU}{d\theta} = -R_U^{-1} R_\theta. $$

Substituting into the derivative of $L$,

$$ \frac{dL}{d\theta} = L_\theta + L_U \frac{dU}{d\theta}. $$

For many parameters, computing $dU/d\theta$ directly is too expensive. The adjoint method avoids this by solving

$$ R_U^\top \lambda = L_U^\top, $$

then computing

$$ \frac{dL}{d\theta} = L_\theta - \lambda^\top R_\theta. $$

This is the core reason adjoint methods dominate gradient-based CFD design. One adjoint solve gives the gradient of one scalar objective with respect to many design variables.
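As a concrete check, the two routes agree on a small toy problem. The sketch below invents a linear residual $R(U,\theta) = AU - B\theta$ and objective $L = c^\top U$ (all names hypothetical, not from any real solver), then compares the direct sensitivity route against the adjoint route:

```python
import numpy as np

rng = np.random.default_rng(0)
n, m = 5, 3                                   # state size, number of design variables

# Hypothetical linear residual R(U, theta) = A @ U - B @ theta
A = rng.normal(size=(n, n)) + n * np.eye(n)   # R_U, kept well conditioned
B = rng.normal(size=(n, m))                   # so R_theta = -B
c = rng.normal(size=n)                        # objective L = c @ U, hence L_U = c

# Direct route: m linear solves for dU/dtheta = -R_U^{-1} R_theta
dU_dtheta = np.linalg.solve(A, B)
grad_direct = c @ dU_dtheta

# Adjoint route: one transpose solve, R_U^T lambda = L_U^T
lam = np.linalg.solve(A.T, c)
grad_adjoint = lam @ B                        # dL/dtheta = -lambda^T R_theta

assert np.allclose(grad_direct, grad_adjoint)
```

The direct route costs one linear solve per design variable; the adjoint route costs one transpose solve total, which is why it wins when $m$ is large.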

What AD Provides

AD can compute the derivatives needed by the adjoint method:

| Quantity | Meaning |
| --- | --- |
| $R_U v$ | Linearized residual action |
| $R_U^\top v$ | Transposed linearized residual action |
| $R_\theta v$ | Sensitivity to design variables |
| $L_U$ | Objective derivative with respect to flow state |
| $L_\theta$ | Direct objective derivative |

In large CFD codes, these matrices are rarely formed densely. Instead, AD is used to generate matrix-free products or sparse derivative kernels.

This fits CFD well because residuals are local. Each cell or element depends mostly on neighboring cells. The Jacobian has sparse structure.
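A minimal sketch of this locality, using a toy 1-D diffusion residual (hypothetical, not a real flux routine). The complex-step trick stands in for forward-mode AD to form matrix-free Jacobian-vector products, and the nearest-neighbour stencil shows up directly as tridiagonal Jacobian structure:

```python
import numpy as np

def residual(U):
    """Toy 1-D diffusion residual with a nearest-neighbour stencil (hypothetical)."""
    R = np.zeros_like(U)
    R[1:-1] = U[:-2] - 2 * U[1:-1] + U[2:]
    R[0] = U[0]                       # Dirichlet rows at both ends
    R[-1] = U[-1]
    return R

def jvp(f, U, v, eps=1e-30):
    """Matrix-free Jacobian-vector product via the complex-step trick."""
    return np.imag(f(U + 1j * eps * v)) / eps

n = 8
U = np.random.default_rng(1).normal(size=n)

# Build the Jacobian column by column from matrix-free products
J = np.column_stack([jvp(residual, U, e) for e in np.eye(n)])

# Locality of the stencil appears as tridiagonal sparsity
assert np.allclose(J, np.triu(np.tril(J, 1), -1))
```

In a real code one would never materialize `J`; the point is that each product costs one residual evaluation and only touches neighbouring cells.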

Shape Optimization

A common use case is aerodynamic shape optimization.

The parameter $\theta$ defines geometry:

$$ \theta \to \Omega(\theta), $$

where $\Omega$ is the flow domain. The solver computes a flow field $U$, and the objective might be drag:

$$ L(\theta) = D(U(\theta), \Omega(\theta)). $$

The gradient

$$ \nabla_\theta L $$

tells how to modify the shape to reduce drag or increase lift.

A practical shape optimization loop is:

  1. deform geometry,
  2. update or regenerate mesh,
  3. solve flow equations,
  4. compute objective,
  5. solve adjoint equations,
  6. compute shape gradient,
  7. update design variables.

The difficult part is not only differentiating the flow equations. The mesh motion and geometry representation must also be differentiable.

Mesh Dependence

CFD solvers depend heavily on meshes. The mesh affects discretization error, stability, and derivative quality.

If mesh coordinates are $X$, the residual is more accurately written as

$$ R(U, X, \theta) = 0. $$

When design parameters change geometry, they also change mesh coordinates:

$$ X = X(\theta). $$

Then the total derivative includes mesh terms:

$$ \frac{dR}{d\theta} = R_U \frac{dU}{d\theta} + R_X \frac{dX}{d\theta} + R_\theta. $$

Ignoring the mesh derivative gives inconsistent gradients.

This is a frequent source of errors in differentiable CFD systems. A solver may differentiate the fluid residual correctly while omitting the geometry-to-mesh path.
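A one-variable toy makes the inconsistency visible. Everything here is an invented stand-in: `X(theta)` plays the role of the mesh map, and the "frozen mesh" derivative drops the $R_X \, dX/d\theta$ path:

```python
import numpy as np

def X(theta):
    return 1.0 + 0.5 * theta            # toy mesh coordinate X(theta)

def L(theta):
    x = X(theta)                        # mesh moves with the design variable
    return (theta * x) ** 2             # objective through state U = theta * X

theta, eps = 0.3, 1e-6
full = (L(theta + eps) - L(theta - eps)) / (2 * eps)   # total derivative

# "Frozen mesh" derivative: differentiate theta only, holding X fixed
x = X(theta)
frozen = 2 * theta * x * x              # d/dtheta of (theta*x)^2 with x frozen

assert not np.isclose(full, frozen)     # the gradients disagree
```

The discrepancy is exactly the omitted mesh term, which is why gradient verification against finite differences catches this class of bug.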

Time-Dependent Flows

For unsteady CFD, the state evolves over time:

$$ U_{k+1} = \Phi_k(U_k, \theta). $$

The loss may depend on the entire trajectory:

$$ L = \sum_{k=0}^{N} \ell_k(U_k, \theta). $$

Reverse-mode AD propagates adjoints backward through time:

$$ \bar{U}_k = \bar{U}_{k+1} \frac{\partial \Phi_k}{\partial U_k} + \frac{\partial \ell_k}{\partial U_k}. $$

The memory problem is severe. A high-resolution simulation may have millions of state variables and thousands of time steps. Storing all states is often impossible.

Checkpointing is therefore central. The solver stores selected time states and recomputes intermediate states during the adjoint pass.
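A minimal checkpointing sketch, assuming a hypothetical nonlinear step $U_{k+1} = U_k - \Delta t\, U_k^3$ (elementwise) whose step adjoint needs the primal state: only every `stride`-th state is stored on the forward pass, and the reverse sweep recomputes each segment before running its step adjoints.

```python
import numpy as np

dt, N, stride = 0.05, 12, 4

def step(U):
    return U - dt * U**3                 # hypothetical time step

def step_adjoint(U, Ubar):               # VJP of `step`, needs the state U
    return Ubar * (1.0 - 3.0 * dt * U**2)

U0 = np.array([0.5, -1.0, 2.0])

# Forward pass: keep only every `stride`-th state
checkpoints = {0: U0}
U = U0
for k in range(N):
    U = step(U)
    if (k + 1) % stride == 0:
        checkpoints[k + 1] = U

Ubar = U.copy()                          # seed for L = 0.5 * ||U_N||^2

# Reverse sweep: recompute each segment, then run its adjoints in reverse
for start in range(N - stride, -1, -stride):
    seg = [checkpoints[start]]
    for _ in range(stride - 1):
        seg.append(step(seg[-1]))
    for Uk in reversed(seg):
        Ubar = step_adjoint(Uk, Ubar)

# Check dL/dU0 against central finite differences
def loss(U0):
    U = U0
    for _ in range(N):
        U = step(U)
    return 0.5 * np.sum(U**2)

eps = 1e-6
fd = np.array([(loss(U0 + eps * e) - loss(U0 - eps * e)) / (2 * eps)
               for e in np.eye(3)])
assert np.allclose(Ubar, fd, atol=1e-6)
```

Peak storage here is `N / stride` states plus one segment of recomputed states, trading extra forward work for memory; production schemes such as binomial checkpointing optimize this trade-off.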

Turbulence Models

Most engineering CFD uses turbulence models rather than resolving all turbulent scales. These models add closure equations, empirical coefficients, wall functions, limiters, and nonlinear switches.

For AD, turbulence models are challenging because they often contain:

| Feature | AD issue |
| --- | --- |
| Limiters | Piecewise derivatives |
| Wall functions | Non-smooth formulas near boundaries |
| Clipping | Zero gradients outside active regions |
| Empirical switches | Discontinuous control flow |
| Iterative closures | Nested solver differentiation |

A formally differentiated turbulence model may produce poor gradients if the model contains hard thresholds. Smoothing, custom derivative rules, or derivative-aware model design may be needed.
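For instance, a hard clip has exactly zero derivative outside its active band, while a smooth surrogate keeps a usable (if approximate) signal. The softplus-based `soft_clip` below is one possible modeling choice, not a standard turbulence-model ingredient:

```python
import numpy as np

def hard_clip(x, lo=0.0, hi=1.0):
    return np.clip(x, lo, hi)

def soft_clip(x, lo=0.0, hi=1.0, beta=10.0):
    # Smooth approximation of clip built from a numerically stable softplus;
    # it converges to hard_clip as beta -> infinity.
    sp = lambda z: np.log1p(np.exp(-np.abs(beta * z))) / beta + np.maximum(z, 0.0)
    return lo + sp(x - lo) - sp(x - hi)

def fd_grad(f, x, eps=1e-6):
    return (f(x + eps) - f(x - eps)) / (2 * eps)

x = 1.5                                      # outside the clip band
assert fd_grad(hard_clip, x) == 0.0          # hard clip: gradient exactly zero
assert fd_grad(soft_clip, x) > 0.0           # smooth surrogate keeps signal
```

The smoothing parameter `beta` trades gradient fidelity against faithfulness to the original model, and should itself be validated against the physics.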

Discrete vs Continuous Adjoints

CFD has a long tradition of adjoint methods. Two approaches are common.

A continuous adjoint derives adjoint PDEs analytically from the continuous governing equations, then discretizes them.

A discrete adjoint differentiates the discretized residual or solver.

| Approach | Advantage | Risk |
| --- | --- | --- |
| Continuous adjoint | Mathematically compact; close to PDE theory | May disagree with discrete objective |
| Discrete adjoint | Gradient matches numerical solver | More tied to implementation details |

AD naturally supports discrete adjoints. This is valuable because gradient-based optimization needs the derivative of the computed objective, not only the idealized continuous one.

Linear Solves and Transpose Solves

Implicit CFD solvers repeatedly solve systems such as

$$ A \Delta U = b. $$

Reverse differentiation of a linear solve requires solving the transpose system:

$$ A^\top \lambda = \overline{\Delta U}. $$

Then adjoints propagate to $b$ and $A$. In production solvers, this rule should be implemented directly.

Differentiating through every Krylov iteration is possible, but often inferior. It creates long tapes, depends on convergence details, and may produce gradients tied to arbitrary stopping criteria.

A solver-aware AD implementation treats the converged linear solve as a primitive with a custom adjoint.
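A sketch of that custom rule, assuming a dense, well-conditioned system (in a real solver `A` would be sparse and the solves iterative): for $x = A^{-1}b$ with incoming adjoint $\bar{x}$, solve $A^\top \lambda = \bar{x}$, then $\bar{b} = \lambda$ and $\bar{A} = -\lambda x^\top$.

```python
import numpy as np

rng = np.random.default_rng(3)
n = 4
A = rng.normal(size=(n, n)) + 5 * np.eye(n)   # well-conditioned toy system
b = rng.normal(size=n)
c = rng.normal(size=n)                        # scalar loss L = c @ x

x = np.linalg.solve(A, b)
xbar = c                                      # incoming adjoint dL/dx

lam = np.linalg.solve(A.T, xbar)              # one transpose solve
bbar = lam                                    # dL/db
Abar = -np.outer(lam, x)                      # dL/dA

# Finite-difference checks of both adjoints
eps = 1e-6
fd_b = np.array([(c @ np.linalg.solve(A, b + eps * e)
                  - c @ np.linalg.solve(A, b - eps * e)) / (2 * eps)
                 for e in np.eye(n)])
assert np.allclose(bbar, fd_b, atol=1e-6)

E = np.zeros((n, n)); E[1, 2] = 1.0           # perturb one entry of A
fd_A = (c @ np.linalg.solve(A + eps * E, b)
        - c @ np.linalg.solve(A - eps * E, b)) / (2 * eps)
assert np.isclose(Abar[1, 2], fd_A, atol=1e-6)
```

Registering this rule means the tape records one solve and one transpose solve, regardless of how many Krylov iterations the primal solve actually took.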

Conservation and Gradient Consistency

CFD discretizations are often designed to preserve conservation laws. A derivative system should respect the same structure.

For example, finite volume methods compute fluxes across cell faces. The same face flux contributes with opposite signs to neighboring cells. If the primal residual is conservative, the linearized residual should preserve that cancellation structure.

AD can help because it differentiates the implemented flux code directly. But careless transformations can break structure through:

| Cause | Effect |
| --- | --- |
| Inconsistent boundary treatment | Wrong shape gradients |
| Missing mesh terms | Biased sensitivities |
| Non-differentiated limiters | Gradient mismatch |
| Dense Jacobian materialization | Infeasible memory use |
| Solver tolerance noise | Noisy gradients |

Gradient checks using finite differences remain important, especially on reduced meshes.

CFD and Machine Learning

AD also appears in CFD when machine learning models are embedded inside solvers.

Examples include:

  • learned turbulence closures,
  • differentiable surrogate models,
  • neural constitutive laws,
  • flow control policies,
  • data assimilation systems.

If a neural model appears inside the residual,

$$ R(U, \theta, w) = 0, $$

where $w$ are learned weights, then AD can compute gradients with respect to both physical design variables and neural parameters.

The main challenge is coupling. Neural network AD frameworks are optimized for dense tensor programs, while CFD solvers rely on sparse meshes, irregular memory access, and iterative linear algebra. Efficient integration requires clear derivative interfaces rather than treating the entire solver as a generic neural network layer.

Practical Architecture

A differentiable CFD system should expose derivative rules at the same level as the numerical method.

| Component | Preferred derivative treatment |
| --- | --- |
| Flux function | Local JVP/VJP |
| Boundary condition | Explicit derivative rule |
| Mesh deformation | Differentiable geometry map |
| Linear solve | Custom transpose-solve rule |
| Nonlinear solve | Implicit differentiation or discrete adjoint |
| Time integration | Checkpointed reverse pass |
| Objective functional | Direct AD or analytic derivative |
| Turbulence closure | Smoothed or custom derivative |

This design gives AD enough structure to be efficient while preserving the exact semantics of the implemented solver.

Example: Drag Minimization

A simplified drag minimization problem can be written as

$$ \min_\theta D(U, \theta) $$

subject to

$$ R(U, \theta) = 0. $$

The Lagrangian is

$$ \mathcal{L}(U, \theta, \lambda) = D(U, \theta) - \lambda^\top R(U, \theta). $$

The adjoint equation is

$$ R_U^\top \lambda = D_U^\top. $$

The gradient is

$$ \nabla_\theta D = D_\theta - \lambda^\top R_\theta. $$

This gives the derivative of drag with respect to many shape parameters after one flow solve and one adjoint solve.

The optimization loop then updates the design:

$$ \theta_{k+1} = \theta_k - \eta \nabla_\theta D. $$

In real aerodynamic design, this update is constrained by geometry validity, mesh quality, lift requirements, structural limits, and manufacturing constraints.
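The recipe can be exercised end to end on a toy problem. Everything below is a hypothetical stand-in (a linear "residual" and a least-squares "drag", not a flow solver), but each iteration performs exactly one flow solve and one adjoint solve, as in the equations above:

```python
import numpy as np

rng = np.random.default_rng(4)
n, m = 6, 2
A = rng.normal(size=(n, n)) + 10 * np.eye(n)     # toy R_U, well conditioned
B = rng.normal(size=(n, m))                      # R(U, theta) = A U - B theta
U_target = rng.normal(size=n)

def drag_and_grad(theta):
    U = np.linalg.solve(A, B @ theta)            # "flow" solve: R(U, theta) = 0
    D = 0.5 * np.sum((U - U_target) ** 2)        # toy "drag" objective
    lam = np.linalg.solve(A.T, U - U_target)     # adjoint: R_U^T lam = D_U^T
    grad = lam @ B                               # dD/dtheta = -lam^T R_theta
    return D, grad

theta = np.zeros(m)
D0, _ = drag_and_grad(theta)
for _ in range(200):
    D, g = drag_and_grad(theta)
    theta -= 0.2 * g                             # design update

assert drag_and_grad(theta)[0] < D0              # drag was reduced
```

A real design loop would replace the plain gradient step with a constrained optimizer enforcing the geometry, lift, and mesh-quality requirements noted above.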

Failure Modes

Differentiable CFD systems fail in recognizable ways.

| Failure mode | Cause |
| --- | --- |
| Gradient disagrees with finite differences | Missing derivative path or inconsistent adjoint |
| Gradient is noisy | Solver tolerance or chaotic unsteady flow |
| Optimization destroys mesh | Geometry update lacks constraints |
| Adjoint solve diverges | Ill-conditioned linearized operator |
| Gradient vanishes locally | Hard clipping or inactive limiter |
| Huge memory use | Naive reverse mode through time integration |
| Poor design update | Objective poorly scaled or constrained |

These failures are usually systems issues, not AD theory issues.

Summary

In CFD, automatic differentiation provides the derivative infrastructure for design optimization, inverse modeling, data assimilation, and differentiable simulation. The central object is the derivative of a large sparse solver-defined map.

Naive AD through an entire CFD code rarely works well at production scale. Effective differentiable CFD combines AD with adjoint methods, sparse linear algebra, checkpointing, custom rules for solvers, and careful treatment of geometry and mesh dependence.