Forward mode automatic differentiation computes derivatives by propagating tangent values alongside ordinary values. The ordinary value is called the primal. The derivative...
Forward mode automatic differentiation computes derivatives by propagating tangent values alongside ordinary values. The ordinary value is called the primal. The derivative value is called the tangent.
For every program variable , forward mode tracks:
The dot notation means “the change in induced by a chosen change in the input.”
If the input changes in direction , then every later value changes according to the chain rule.
Directional Derivatives
Let:
At input , choose a direction:
Forward mode computes:
where is the Jacobian.
Thus forward mode computes a directional derivative. It tells us how the output changes when the input moves infinitesimally in direction .
Primal and Tangent Execution
Consider:
A program evaluates:
v1 = x * x
v2 = 3 * x
v3 = v1 + v2
return v3Forward mode evaluates a paired program:
v1 = x * x
dot_v1 = x * dot_x + x * dot_x
v2 = 3 * x
dot_v2 = 3 * dot_x
v3 = v1 + v2
dot_v3 = dot_v1 + dot_v2
return v3, dot_v3If , then:
which is the ordinary derivative.
Local Tangent Rules
Each primitive operation has a tangent rule.
| Operation | Primal | Tangent |
|---|---|---|
| addition | ||
| subtraction | ||
| multiplication | ||
| division | ||
| sine | ||
| exponential | ||
| logarithm |
These rules are applied in the same order as the program.
Tangent Seeding
The initial tangent determines the derivative being computed.
For one scalar input:
computes:
For vector input , choosing:
computes the -th column of the Jacobian.
Example:
Seed:
Then forward mode computes the first Jacobian column:
Seed:
Then it computes the second column:
Tangents Through Control Flow
Forward mode follows the same control flow as the primal program.
Example:
if x > 0:
y = x * x
else:
y = -xIf :
If :
At , the function has a corner. Forward mode returns the tangent of the executed branch. It does not infer a symbolic piecewise derivative.
This is important for programs with:
- thresholds
- comparisons
- clipping
- indexing
- data-dependent branches
Tangents Through Loops
Forward mode handles loops directly because tangent propagation follows primal execution.
Example:
y = x
for i in range(n):
y = y * yThe tangent program is:
y = x
dot_y = dot_x
for i in range(n):
old_y = y
old_dot_y = dot_y
y = old_y * old_y
dot_y = old_y * old_dot_y + old_y * old_dot_yEach iteration propagates the tangent through the loop body.
For fixed , this computes the derivative of the function represented by the loop.
Tangents for Structured Values
Real programs use arrays, tuples, structs, and nested containers.
Forward mode assigns tangents only to differentiable components.
Example:
State {
position: Float
velocity: Float
step: Int
}A tangent state may contain:
StateTangent {
position: Float
velocity: Float
}The integer step has no tangent because discrete values do not vary infinitesimally.
This distinction becomes important in differentiable programming languages and compiler IRs.
Multiple Tangent Directions
Forward mode can carry several tangent directions at once.
Instead of:
use:
This computes:
where is a matrix of seed directions.
Multi-direction forward mode is useful for:
- computing several Jacobian columns
- exploiting SIMD and vector hardware
- reducing interpreter overhead
- sparse Jacobian recovery
The cost grows roughly linearly with the number of tangent directions, but batching can improve constants.
Memory Behavior
Forward mode has simple memory behavior.
At each point, it needs only:
- current primal values
- current tangent values
It does not need to store a full tape for a later backward pass.
Thus forward mode is attractive for:
- streaming computations
- long simulations
- online sensitivity analysis
- low-memory systems
Its weakness is dimensional scaling. A full gradient of a scalar function with inputs requires forward passes.
Tangent Propagation as Pushforward
In differential geometry, tangent propagation is a pushforward.
A function:
maps points in to points in .
Its derivative maps tangent vectors at to tangent vectors at :
Forward mode computes this map operationally.
In coordinates, this is exactly:
This geometric view clarifies why forward mode moves in the same direction as computation.
Summary
Tangent propagation is the core mechanism of forward mode.
It evaluates each variable as:
and applies local tangent rules in program order.
Forward mode computes Jacobian-vector products, supports ordinary control flow naturally, and uses little memory. Its cost is proportional to the number of tangent directions being propagated.