
Case Studies

Forward mode automatic differentiation appears in many numerical systems where directional derivatives, local sensitivities, or small parameter sets are important. This chapter examines concrete applications and analyzes why forward mode fits their computational structure.

The emphasis is not merely on using derivatives, but on how tangent propagation integrates with the execution model of each system.

Sensitivity Analysis in Scientific Computing

Many scientific simulations depend on a small set of physical parameters:

\theta = (\theta_1,\ldots,\theta_k).

The simulation output may be extremely large:

f : \mathbb{R}^k \to \mathbb{R}^m, \qquad m \gg k.

Examples:

| Simulation | Parameters |
| --- | --- |
| climate models | diffusion coefficients |
| fluid simulation | viscosity |
| orbital dynamics | gravitational constants |
| chemical systems | reaction rates |
| epidemiological models | infection parameters |

The key property is:

k \ll m.

Forward mode is efficient here because a single tangent pass computes the sensitivity of the entire output with respect to one parameter direction, so only k passes cover all parameters.

Example: ODE simulation

Consider the ODE:

\frac{dy}{dt} = \theta y, \qquad y(0)=1.

The solution is:

y(t)=e^{\theta t}.

We want the sensitivity with respect to \theta:

\frac{\partial y}{\partial \theta}.

Instead of symbolic differentiation, propagate tangents directly through the numerical integrator.

Seed:

\dot{\theta}=1.

Euler update:

y_{n+1}=y_n+\Delta t \,\theta y_n.

Tangent update:

\dot{y}_{n+1} = \dot{y}_n + \Delta t ( \dot{\theta}y_n + \theta\dot{y}_n ).

The tangent recurrence evolves alongside the primal recurrence.

This approach generalizes to very large ODE and PDE systems.
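
A minimal Python sketch of this primal-tangent recurrence (the parameter value, step size, and horizon below are illustrative choices):

```python
# Forward-mode sensitivity of dy/dt = theta * y through an explicit Euler loop.
import math

theta, theta_dot = 0.5, 1.0   # seed the tangent direction: d(theta) = 1
y, y_dot = 1.0, 0.0           # y(0) = 1 does not depend on theta, so its tangent is 0
dt, steps = 0.01, 100

for _ in range(steps):
    # The tangent update uses the *old* primal value y_n, so it comes first.
    y_dot = y_dot + dt * (theta_dot * y + theta * y_dot)
    y = y + dt * theta * y

t = dt * steps
print(y_dot, t * math.exp(theta * t))  # Euler sensitivity vs. analytic d/dtheta e^{theta t}
```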

Newton and Krylov Solvers

Many nonlinear solvers require Jacobian-vector products rather than explicit Jacobians.

Suppose:

F(x)=0.

Newton methods solve:

J_F(x)\Delta x = -F(x).

Krylov solvers such as GMRES often require only repeated evaluations of:

J_F(x)v.

Forward mode computes these products naturally.

Jacobian-free Newton-Krylov

Instead of forming the Jacobian explicitly:

  1. choose a direction v,
  2. seed the tangent \dot{x} = v,
  3. run forward mode,
  4. obtain \dot{F} = J_F(x)v.

This avoids explicit matrix construction entirely.
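
A sketch of the idea for a small, invented residual F: each primal statement is paired with its tangent statement, and the result of one tangent pass is exactly J_F(x)v:

```python
import numpy as np

def F_with_tangent(x, x_dot):
    # Illustrative residual F : R^2 -> R^2, evaluated together with its tangent.
    f0     = x[0]**2 + x[1] - 1.0
    f0_dot = 2.0 * x[0] * x_dot[0] + x_dot[1]
    f1     = x[0] - x[1]**2
    f1_dot = x_dot[0] - 2.0 * x[1] * x_dot[1]
    return np.array([f0, f1]), np.array([f0_dot, f1_dot])

x = np.array([0.7, 0.3])
v = np.array([1.0, -2.0])        # direction supplied by the Krylov iteration
_, Jv = F_with_tangent(x, v)     # J_F(x) v, with no Jacobian matrix ever formed
print(Jv)
```

A Krylov method such as GMRES can then be driven entirely by this matrix-free product.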

Benefits:

| Benefit | Explanation |
| --- | --- |
| lower memory | no Jacobian storage |
| matrix-free methods | operator-only access |
| easier parallelization | directional evaluations |
| sparse compatibility | local tangent propagation |

Large PDE solvers frequently use this structure.

Robotics and Kinematics

Robot kinematics naturally form chained transformations.

Suppose a robot arm has joint angles:

\theta_1,\ldots,\theta_n.

Forward kinematics computes end-effector position:

p=f(\theta).

The Jacobian:

J_f(\theta)

maps joint velocities into end-effector velocities.

Forward mode matches this physical interpretation directly.

Tangent interpretation

Seed:

\dot{\theta} = (\omega_1,\ldots,\omega_n).

Then:

\dot{p} = J_f(\theta)\dot{\theta}

is the resulting end-effector velocity.

The tangent is literally the instantaneous physical motion.
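
For a concrete sketch, take a planar two-link arm (link lengths, angles, and joint velocities below are made up); seeding the joint velocities and propagating tangents through the forward kinematics yields the end-effector velocity:

```python
import numpy as np

l1, l2 = 1.0, 0.7                      # illustrative link lengths

def fk_with_tangent(theta, theta_dot):
    # Primal: end-effector position of a two-link planar arm.
    a1, a2 = theta[0], theta[0] + theta[1]
    p = np.array([l1 * np.cos(a1) + l2 * np.cos(a2),
                  l1 * np.sin(a1) + l2 * np.sin(a2)])
    # Tangent: the same chain-rule steps applied to the seeded joint velocities.
    a1_dot, a2_dot = theta_dot[0], theta_dot[0] + theta_dot[1]
    p_dot = np.array([-l1 * np.sin(a1) * a1_dot - l2 * np.sin(a2) * a2_dot,
                       l1 * np.cos(a1) * a1_dot + l2 * np.cos(a2) * a2_dot])
    return p, p_dot

theta = np.array([0.4, 0.9])
omega = np.array([0.1, -0.2])          # joint velocities used as the tangent seed
p, p_dot = fk_with_tangent(theta, omega)
print(p, p_dot)                        # p_dot equals J_f(theta) @ omega
```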

Chain structure

Robot transforms compose sequentially:

T = T_1T_2\cdots T_n.

Forward tangent propagation becomes:

\dot{T} = \dot{T}_1T_2\cdots T_n + T_1\dot{T}_2\cdots T_n +\cdots.

This aligns naturally with forward-mode accumulation.

Computer Graphics and Rendering

Graphics systems frequently optimize low-dimensional parameters controlling large outputs.

Examples:

| Parameters | Outputs |
| --- | --- |
| camera pose | image pixels |
| lighting coefficients | rendered image |
| material properties | shading fields |
| skeletal joints | mesh deformation |

Again:

k \ll m.

Forward mode efficiently propagates parameter perturbations into rendered outputs.

Differentiable transformations

Suppose a mesh vertex undergoes an affine transformation:

p' = Rp+t.

Forward mode propagates tangents:

\dot{p}' = \dot{R}p + R\dot{p} + \dot{t}.

If only camera parameters vary, most scene data has zero tangent.

Sparse tangent propagation becomes highly effective.
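
A small sketch of this rule in which only the pose parameters carry nonzero tangents (all numerical values are placeholders):

```python
import numpy as np

# Primal values: a rotation, a translation, and one mesh vertex.
R = np.eye(3)
t = np.array([0.0, 0.0, 1.0])
p = np.array([1.0, 2.0, 3.0])

# Tangents: the camera/pose parameters vary, the scene vertex does not.
R_dot = np.array([[0.0, -0.1, 0.0],
                  [0.1,  0.0, 0.0],
                  [0.0,  0.0, 0.0]])   # derivative of R along one pose direction (illustrative)
t_dot = np.array([0.0, 0.0, 0.5])
p_dot = np.zeros(3)                    # static scene data: zero tangent

p_prime     = R @ p + t
p_prime_dot = R_dot @ p + R @ p_dot + t_dot   # product rule, term by term
print(p_prime, p_prime_dot)
```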

Optimization with Few Parameters

Some optimization problems have large outputs but few parameters.

Example:

f : \mathbb{R}^3 \to \mathbb{R}^{10^6}.

Suppose:

  • 3 calibration parameters,
  • one million measurements.

Forward mode computes the entire Jacobian in only three passes.

Reverse mode would instead require propagating adjoints from every output.

Forward mode is therefore preferable.
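
Sketched with an invented calibration model, the full Jacobian is assembled column by column from three seeded tangent passes:

```python
import numpy as np

m, k = 1_000_000, 3

def model_with_tangent(theta, theta_dot, x):
    # Hypothetical measurement model: y_i = theta0 * sin(theta1 * x_i) + theta2.
    y     = theta[0] * np.sin(theta[1] * x) + theta[2]
    y_dot = (theta_dot[0] * np.sin(theta[1] * x)
             + theta[0] * np.cos(theta[1] * x) * x * theta_dot[1]
             + theta_dot[2])
    return y, y_dot

x = np.linspace(0.0, 1.0, m)
theta = np.array([2.0, 3.0, 0.5])

J = np.empty((m, k))
for j in range(k):                    # one forward pass per parameter
    e_j = np.zeros(k)
    e_j[j] = 1.0                      # seed the j-th coordinate direction
    _, J[:, j] = model_with_tangent(theta, e_j, x)

print(J.shape)                        # (1000000, 3): the full Jacobian in k = 3 passes
```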

Applications:

| Domain | Parameters |
| --- | --- |
| camera calibration | lens coefficients |
| physical fitting | material constants |
| system identification | low-dimensional models |
| experimental tuning | control parameters |

Automatic Differentiation Inside Solvers

Many numerical algorithms are themselves iterative programs.

Example:

```python
for k in range(N):
    x = g(x)
```

Forward mode differentiates through the iteration directly.

Fixed-point iteration

Suppose:

x_{k+1}=g(x_k,\theta).

The tangent recurrence becomes:

\dot{x}_{k+1} = \frac{\partial g}{\partial x}\dot{x}_k + \frac{\partial g}{\partial \theta}\dot{\theta}.

This computes parameter sensitivity during convergence.
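
A sketch of this recurrence for a simple contraction g (chosen arbitrarily); the tangent is carried along while the iteration converges:

```python
theta, theta_dot = 2.0, 1.0        # seed: differentiate with respect to theta

def g(x, theta):
    return 0.5 * (x + theta / x)   # illustrative fixed-point map, x* = sqrt(theta)

x, x_dot = 1.0, 0.0
for _ in range(50):
    # Partial derivatives of g evaluated at the current iterate x_k.
    dg_dx     = 0.5 * (1.0 - theta / x**2)
    dg_dtheta = 0.5 / x
    x_dot = dg_dx * x_dot + dg_dtheta * theta_dot
    x     = g(x, theta)

print(x, x_dot)                    # x -> sqrt(theta), x_dot -> 1 / (2 * sqrt(theta))
```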

Applications:

| Algorithm | Sensitivity |
| --- | --- |
| nonlinear solvers | parameter dependence |
| iterative PDE solvers | coefficient variation |
| simulation loops | control perturbations |
| optimization iterations | hyperparameter effects |

Circuit Simulation

Electronic circuits naturally produce sparse derivative structures.

Each component interacts only locally.

Example resistor equation:

I=\frac{V}{R}.

Forward tangent:

\dot{I} = \frac{\dot{V}}{R} - \frac{V}{R^2}\dot{R}.
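
As a tiny sketch (values invented), seeding only the voltage gives the local sensitivity dI/dV:

```python
V, V_dot = 5.0, 1.0        # probe sensitivity to the voltage...
R, R_dot = 100.0, 0.0      # ...while holding the resistance fixed

I     = V / R
I_dot = V_dot / R - (V / R**2) * R_dot
print(I, I_dot)            # here I_dot equals dI/dV, since only V was seeded
```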

Large circuits contain millions of local equations. Sparse forward mode propagates only locally active derivatives.

Graph coloring and compressed seeding are heavily used in circuit simulation AD systems.

Computational Fluid Dynamics

Fluid simulations involve massive sparse systems.

Discretized Navier-Stokes equations often have stencil structure:

u_i^{t+1} = F(u_{i-1}^t,u_i^t,u_{i+1}^t).

Each update depends only on neighboring cells.

The Jacobian is sparse and structured.

Forward mode efficiently propagates sensitivities such as:

| Parameter | Meaning |
| --- | --- |
| viscosity | turbulence sensitivity |
| boundary conditions | flow response |
| forcing terms | pressure response |

Sparse seeding dramatically reduces tangent dimension.
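
A sketch of tangent propagation through a one-dimensional stencil update (a toy diffusion-like rule, not an actual Navier-Stokes discretization), seeding the viscosity-like coefficient:

```python
import numpy as np

n, steps = 64, 100
nu, nu_dot = 0.1, 1.0                  # seed the coefficient of interest

u     = np.sin(np.linspace(0.0, 2.0 * np.pi, n))   # illustrative initial field
u_dot = np.zeros(n)                                # initial field does not depend on nu

for _ in range(steps):
    lap     = np.roll(u, 1) - 2.0 * u + np.roll(u, -1)            # neighbor-only stencil
    lap_dot = np.roll(u_dot, 1) - 2.0 * u_dot + np.roll(u_dot, -1)
    u_dot = u_dot + nu_dot * lap + nu * lap_dot    # tangent of u_{t+1} = u_t + nu * lap(u_t)
    u     = u + nu * lap

print(u_dot[:4])                       # sensitivity of the field to the coefficient nu
```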

Neural Ordinary Differential Equations

Neural ODEs define dynamics:

\frac{dh}{dt}=f(h,t,\theta).

Forward mode propagates tangent dynamics:

\frac{d\dot{h}}{dt} = \frac{\partial f}{\partial h}\dot{h} + \frac{\partial f}{\partial \theta}\dot{\theta}.

This is called the variational equation.
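
A minimal sketch of integrating the variational equation alongside the state, for a scalar dynamics function invented for illustration:

```python
import math

# dh/dt = f(h, t, theta) = -theta * h   (illustrative scalar dynamics)
theta, theta_dot = 1.5, 1.0        # seed the parameter direction
h, h_dot = 1.0, 0.0                # h(0) is fixed, so its tangent starts at zero
dt, steps = 0.01, 100

for _ in range(steps):
    f     = -theta * h
    f_dot = -theta * h_dot - h * theta_dot   # (df/dh) h_dot + (df/dtheta) theta_dot
    h_dot += dt * f_dot
    h     += dt * f

t = dt * steps
print(h_dot, -t * math.exp(-theta * t))   # compare with analytic d/dtheta e^{-theta t}
```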

Forward mode is efficient when:

  • parameter dimension is small,
  • only selected sensitivities are needed,
  • directional perturbations matter more than full gradients.

Implicit Differentiation

Consider an implicitly defined system:

F(x,\theta)=0.

Differentiate:

\frac{\partial F}{\partial x}\dot{x} + \frac{\partial F}{\partial \theta}\dot{\theta} = 0.

Rearrange:

\dot{x} = - \left( \frac{\partial F}{\partial x} \right)^{-1} \frac{\partial F}{\partial \theta}\dot{\theta}.

Forward mode computes:

\frac{\partial F}{\partial \theta}\dot{\theta}

directly as a JVP.
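
A sketch for a small invented system: one tangent pass supplies (∂F/∂θ)θ̇, and a single linear solve with ∂F/∂x yields ẋ:

```python
import numpy as np

# Illustrative implicit system F(x, theta) = 0 with x, theta in R^2:
#   F1 = x0 + theta0 * x1 - 1
#   F2 = x1 + theta1 * x0**2

def dF_dx(x, theta):
    return np.array([[1.0,                   theta[0]],
                     [2.0 * theta[1] * x[0], 1.0     ]])

def dF_dtheta_times(x, theta, theta_dot):
    # Forward-mode JVP in the theta direction: (dF/dtheta) @ theta_dot.
    return np.array([x[1] * theta_dot[0],
                     x[0]**2 * theta_dot[1]])

theta = np.array([0.4, 0.2])
x     = np.array([1.0963, -0.2404])    # solves F(x, theta) = 0 (computed beforehand)
theta_dot = np.array([1.0, 0.0])       # probe the first parameter

rhs   = -dF_dtheta_times(x, theta, theta_dot)
x_dot = np.linalg.solve(dF_dx(x, theta), rhs)
print(x_dot)                           # directional sensitivity of the implicit solution
```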

This is fundamental in:

  • constrained optimization,
  • equilibrium models,
  • differentiable physics,
  • differentiable optimization layers.

Probabilistic Programming

Probabilistic systems often involve small parameter perturbations.

Suppose:

\log p(x|\theta)

depends on a moderate number of parameters.

Forward mode efficiently computes:

J_{\log p}(\theta)v.
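
A sketch with an i.i.d. Gaussian log-likelihood (data and parameter values are synthetic), where one tangent pass returns the directional derivative of log p:

```python
import numpy as np

rng = np.random.default_rng(1)
x = rng.normal(loc=2.0, scale=1.5, size=1000)   # synthetic observations

def loglik_with_tangent(mu, sigma, mu_dot, sigma_dot):
    # log p(x | mu, sigma) up to an additive constant, with its forward tangent.
    r    = (x - mu) / sigma
    logp = -0.5 * np.sum(r**2) - x.size * np.log(sigma)
    logp_dot = (np.sum(r / sigma) * mu_dot
                + (np.sum(r**2) / sigma - x.size / sigma) * sigma_dot)
    return logp, logp_dot

logp, dlogp = loglik_with_tangent(2.0, 1.5, 1.0, 0.0)   # directional derivative along mu
print(logp, dlogp)
```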

Applications:

| Task | Use |
| --- | --- |
| Fisher information | directional curvature |
| sensitivity analysis | posterior perturbation |
| variational inference | local updates |
| uncertainty propagation | parameter effects |

Forward mode integrates well with sampling-based systems because tangent propagation follows the primal execution path.

Real-Time Systems

Forward mode has low memory overhead because it does not require a backward pass.

This is important in:

  • embedded systems,
  • robotics controllers,
  • streaming simulation,
  • online optimization,
  • real-time estimation.

Reverse mode often requires storing intermediate states. Forward mode can propagate tangents online during execution.

Streaming example

Suppose sensor updates arrive continuously:

```python
while True:
    state = update(state, sensor)
```

Forward mode updates tangents incrementally:

```python
state, tangent = update(state, tangent, sensor)
```

No global tape is required.
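
A runnable sketch of this pattern with a toy first-order filter (the filter, gain, and sensor stream are invented); the tangent tracks sensitivity to the gain with constant memory:

```python
import math

alpha, alpha_dot = 0.1, 1.0            # seed: sensitivity to the filter gain

def update(state, tangent, sensor):
    # Primal: exponential smoothing, state' = state + alpha * (sensor - state).
    new_state = state + alpha * (sensor - state)
    # Tangent: the same expression differentiated, using the old state and tangent.
    new_tangent = tangent + alpha_dot * (sensor - state) - alpha * tangent
    return new_state, new_tangent

state, tangent = 0.0, 0.0
for k in range(1000):                  # stands in for an endless sensor loop
    sensor = math.sin(0.01 * k)        # synthetic measurement
    state, tangent = update(state, tangent, sensor)

print(state, tangent)                  # no tape, no stored history
```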

Differentiable Databases

Some differentiable query systems propagate sensitivities through relational operations.

Suppose:

Q_\theta(D)

depends on tunable parameters.

Forward mode propagates tangent information through:

  • joins,
  • aggregations,
  • ranking functions,
  • retrieval scores.

Example:

s_i = \theta^\top x_i.

Tangent:

\dot{s}_i = \dot{\theta}^\top x_i.
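
A short sketch of this rule over a batch of items (feature vectors and parameters are random placeholders):

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(5, 4))                  # five items, four features
theta     = rng.normal(size=4)               # ranking parameters
theta_dot = np.array([1.0, 0.0, 0.0, 0.0])   # probe one parameter direction

scores     = X @ theta                       # s_i = theta^T x_i
scores_dot = X @ theta_dot                   # tangent of every score in one pass
print(scores, scores_dot)
```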

This enables:

  • sensitivity-aware ranking,
  • differentiable retrieval,
  • query optimization,
  • gradient-guided search.

Hyperparameter Sensitivity

Training systems often study sensitivity to hyperparameters:

| Hyperparameter | Example |
| --- | --- |
| learning rate | optimizer stability |
| regularization | generalization |
| scheduler constants | convergence speed |
| physical coefficients | simulation behavior |

Forward mode efficiently computes directional effects of small hyperparameter perturbations without recomputing the entire optimization process separately.

When Forward Mode Fails

Forward mode becomes inefficient when the input dimension is extremely large.

Example:

f : \mathbb{R}^{10^9} \to \mathbb{R}.

Computing the full gradient by forward mode requires one tangent pass per input direction, i.e. on the order of 10^9 forward passes.

This is why deep neural network training uses reverse mode.

Forward mode also struggles when:

  • tangent dimensions become dense,
  • memory bandwidth dominates,
  • tangent vectors exceed cache capacity,
  • derivative structure lacks locality.

Hybrid Systems

Modern AD systems often combine modes.

Examples:

| Combination | Use |
| --- | --- |
| forward-over-reverse | Hessian-vector products |
| reverse-over-forward | Jacobian rows |
| sparse-forward-over-reverse | structured second derivatives |
| block-forward + reverse | mixed tensor systems |

Forward mode is therefore rarely isolated. It acts as a building block inside larger differentiation systems.

Summary

Forward mode automatic differentiation is especially effective when:

  • the number of input directions is small,
  • directional sensitivities are sufficient,
  • Jacobians are sparse,
  • memory efficiency matters,
  • online propagation is needed.

Its natural computation is the Jacobian-vector product:

J_f(x)v.

This operator form appears throughout scientific computing, robotics, simulation, optimization, graphics, probabilistic systems, and differentiable infrastructure.