Algebra of Dual Numbers
Dual numbers give the cleanest algebraic model of forward mode automatic differentiation. They extend ordinary real numbers with a formal infinitesimal part. Instead of carrying only a value, a dual number carries a value and its first-order variation.
A dual number has the form
$$a + b\varepsilon,$$
where $a, b \in \mathbb{R}$, and $\varepsilon$ is a formal element satisfying
$$\varepsilon^2 = 0$$
but
$$\varepsilon \neq 0.$$
The element $\varepsilon$ behaves like an infinitesimal direction. It is not a small real number. It is an algebraic marker that records first-order change and automatically deletes all second-order terms.
The Basic Algebra
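Let
$$x = a + b\varepsilon$$
and
$$y = c + d\varepsilon.$$
Addition is componentwise:
$$x + y = (a + c) + (b + d)\varepsilon.$$
Multiplication follows ordinary distributivity, together with the rule $\varepsilon^2 = 0$:
$$xy = (a + b\varepsilon)(c + d\varepsilon) = ac + ad\varepsilon + bc\varepsilon + bd\varepsilon^2 = ac + (ad + bc)\varepsilon.$$
So the product rule is built into the multiplication law:
$$(u + u'\varepsilon)(v + v'\varepsilon) = uv + (u'v + uv')\varepsilon.$$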
This is already the core of automatic differentiation. The first component stores the primal value. The second component stores the derivative information.
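As a quick numeric check of the multiplication law, take the illustrative values $2 + 3\varepsilon$ and $4 + 5\varepsilon$ (chosen here only as an example, not taken from the text):

```latex
(2 + 3\varepsilon)(4 + 5\varepsilon)
  = 8 + 10\varepsilon + 12\varepsilon + 15\varepsilon^2
  = 8 + 22\varepsilon
```

The real part is the ordinary product $2 \cdot 4 = 8$. The $\varepsilon$ coefficient is $3 \cdot 4 + 2 \cdot 5 = 22$, which is what the product rule gives for values 2 and 4 with derivatives 3 and 5.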
Dual Numbers as Value-Derivative Pairs
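In forward mode AD, we evaluate a function on a dual input
$$x + \varepsilon.$$
More generally, if the input is seeded with tangent $\dot{x}$, we write
$$x + \dot{x}\varepsilon.$$
For a smooth scalar function $f$, Taylor expansion gives
$$f(x + \dot{x}\varepsilon) = f(x) + f'(x)\dot{x}\varepsilon + \frac{1}{2} f''(x)\dot{x}^2\varepsilon^2 + \cdots.$$
Since $\varepsilon^2 = 0$, all higher-order terms vanish:
$$f(x + \dot{x}\varepsilon) = f(x) + f'(x)\dot{x}\varepsilon.$$
Thus a single evaluation over dual numbers computes both the value and the directional derivative.
The rule is:
$$(x, \dot{x}) \mapsto (f(x),\ f'(x)\dot{x}).$$
For one input, choosing $\dot{x} = 1$ gives the ordinary derivative:
$$f(x + \varepsilon) = f(x) + f'(x)\varepsilon.$$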
Example
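Let
$$f(x) = x^3 + 2x.$$
Evaluate it at $5 + \varepsilon$:
$$f(5 + \varepsilon) = (5 + \varepsilon)^3 + 2(5 + \varepsilon) = (125 + 75\varepsilon) + (10 + 2\varepsilon) = 135 + 77\varepsilon.$$
So
$$f(5) = 135$$
and
$$f'(5) = 77.$$
Checking directly:
$$f'(x) = 3x^2 + 2, \qquad f'(5) = 77.$$
The derivative appears as the coefficient of $\varepsilon$.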
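Why $\varepsilon^2 = 0$ Matters
The rule $\varepsilon^2 = 0$ is what makes dual numbers represent first-order calculus. When multiplying perturbations, any second-order term disappears.
For example:
$$(x + \dot{x}\varepsilon)^2 = x^2 + 2x\dot{x}\varepsilon + \dot{x}^2\varepsilon^2 = x^2 + 2x\dot{x}\varepsilon.$$
The coefficient of $\varepsilon$ is exactly the derivative of $x^2$ at $x$, applied to direction $\dot{x}$:
$$\frac{d}{dx}\left(x^2\right)\dot{x} = 2x\dot{x}.$$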
This pattern holds for every smooth elementary operation used in a program. Dual arithmetic forces each operation to carry both its value and its local linearization.
Division and Inverses
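A dual number $a + b\varepsilon$ has a multiplicative inverse when $a \neq 0$.
We seek
$$(a + b\varepsilon)^{-1} = c + d\varepsilon.$$
The product must equal $1$:
$$(a + b\varepsilon)(c + d\varepsilon) = ac + (ad + bc)\varepsilon = 1.$$
So
$$ac = 1$$
and
$$ad + bc = 0.$$
Hence
$$c = \frac{1}{a}$$
and
$$d = -\frac{b}{a^2}.$$
Therefore
$$(a + b\varepsilon)^{-1} = \frac{1}{a} - \frac{b}{a^2}\varepsilon.$$
This corresponds to the derivative rule
$$\frac{d}{dx}\left(\frac{1}{x}\right) = -\frac{1}{x^2}.$$
Division follows from multiplication by the inverse:
$$\frac{a + b\varepsilon}{c + d\varepsilon} = (a + b\varepsilon)\left(\frac{1}{c} - \frac{d}{c^2}\varepsilon\right) = \frac{a}{c} + \frac{bc - ad}{c^2}\varepsilon, \qquad c \neq 0.$$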
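The inverse and quotient rules can be sketched in Go as follows. This is a minimal illustration under stated assumptions, not a definitive implementation: the `Dual` struct is the value/tangent pair used in this chapter, while the names `Inv` and `Div` and the boolean result that reports a zero real part are choices made only for this example.

```go
package main

// Dual is a value/tangent pair representing Val + Dot*ε.
type Dual struct {
	Val float64
	Dot float64
}

// Inv returns 1/x using (a + bε)^-1 = 1/a - (b/a²)ε.
// The second result is false when the real part is zero,
// in which case no inverse exists. (The bool is an assumed
// convention for this sketch.)
func Inv(x Dual) (Dual, bool) {
	if x.Val == 0 {
		return Dual{}, false
	}
	return Dual{
		Val: 1 / x.Val,
		Dot: -x.Dot / (x.Val * x.Val),
	}, true
}

// Div returns x/y by multiplying x with the inverse of y.
// Expanding the product gives the usual quotient rule.
func Div(x, y Dual) (Dual, bool) {
	inv, ok := Inv(y)
	if !ok {
		return Dual{}, false
	}
	return Dual{
		Val: x.Val * inv.Val,
		Dot: x.Dot*inv.Val + x.Val*inv.Dot,
	}, true
}
```

For instance, `Inv(Dual{Val: 2, Dot: 1})` yields a value of 0.5 and a tangent of -0.25, matching $-1/x^2$ at $x = 2$.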
Elementary Functions
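Elementary functions extend naturally to dual numbers. For a smooth function $f$,
$$f(a + b\varepsilon) = f(a) + f'(a)\,b\,\varepsilon.$$
This gives direct evaluation rules.
For sine:
$$\sin(a + b\varepsilon) = \sin a + b\cos a\,\varepsilon.$$
For cosine:
$$\cos(a + b\varepsilon) = \cos a - b\sin a\,\varepsilon.$$
For exponential:
$$\exp(a + b\varepsilon) = e^a + b\,e^a\,\varepsilon.$$
For logarithm, assuming $a > 0$:
$$\log(a + b\varepsilon) = \log a + \frac{b}{a}\varepsilon.$$
For powers:
$$(a + b\varepsilon)^n = a^n + n\,a^{n-1}\,b\,\varepsilon.$$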
Each rule has the same shape: compute the primal value, then multiply the local derivative by the incoming tangent.
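Because every rule has this shape, one way to express them in code is a single helper that pairs a real function with its derivative. The sketch below assumes a `Dual` struct like the one used in this chapter. `Lift` and the fixed-exponent `Pow` are hypothetical names introduced for illustration, and only functions from Go's standard `math` package are called.

```go
package main

import "math"

// Dual is a value/tangent pair representing Val + Dot*ε.
type Dual struct {
	Val float64
	Dot float64
}

// Lift turns a real function f and its derivative df into a function
// on dual numbers, using f(a + bε) = f(a) + f'(a)*b*ε.
func Lift(f, df func(float64) float64) func(Dual) Dual {
	return func(x Dual) Dual {
		return Dual{Val: f(x.Val), Dot: df(x.Val) * x.Dot}
	}
}

// Elementary functions obtained from Lift.
var (
	Cos = Lift(math.Cos, func(a float64) float64 { return -math.Sin(a) })
	Exp = Lift(math.Exp, math.Exp)
	// Log is only meaningful when the real part is positive.
	Log = Lift(math.Log, func(a float64) float64 { return 1 / a })
)

// Pow raises x to a fixed real exponent n: a^n + n*a^(n-1)*b*ε.
func Pow(x Dual, n float64) Dual {
	return Dual{
		Val: math.Pow(x.Val, n),
		Dot: n * math.Pow(x.Val, n-1) * x.Dot,
	}
}
```

Keeping the derivative next to the function it belongs to makes each rule easy to check against a calculus table, at the cost of one closure call per operation.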
Dual Numbers and the Chain Rule
The chain rule is not added as an external algorithm. It follows from function composition over dual numbers.
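Let
$$h(x) = f(g(x)).$$
Evaluate at
$$x + \varepsilon.$$
First apply $g$:
$$g(x + \varepsilon) = g(x) + g'(x)\varepsilon.$$
Then apply $f$:
$$f(g(x) + g'(x)\varepsilon) = f(g(x)) + f'(g(x))\,g'(x)\,\varepsilon.$$
Therefore
$$h'(x) = f'(g(x))\,g'(x).$$
This is exactly the chain rule.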
Dual numbers turn the chain rule into ordinary evaluation. A program written over real numbers can often be lifted to dual numbers by replacing each primitive operation with its dual-number version.
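A small Go sketch can make this concrete. It takes $g(x) = x^2$ and $f(u) = \sin u$ as illustrative choices (they are not from the text) and composes them with ordinary function calls. The names `G`, `F`, `H`, and the hand-derived reference `HPrime` are hypothetical. No chain-rule logic appears in the code.

```go
package main

import "math"

// Dual is a value/tangent pair representing Val + Dot*ε.
type Dual struct {
	Val float64
	Dot float64
}

// Mul multiplies two dual numbers (product rule in the Dot field).
func Mul(x, y Dual) Dual {
	return Dual{Val: x.Val * y.Val, Dot: x.Dot*y.Val + x.Val*y.Dot}
}

// Sin applies sine to a dual number.
func Sin(x Dual) Dual {
	return Dual{Val: math.Sin(x.Val), Dot: math.Cos(x.Val) * x.Dot}
}

// G is the inner function g(x) = x².
func G(x Dual) Dual { return Mul(x, x) }

// F is the outer function f(u) = sin(u).
func F(u Dual) Dual { return Sin(u) }

// H is the composition h(x) = sin(x²), written as plain nesting.
func H(x Dual) Dual { return F(G(x)) }

// HPrime is the chain rule worked out by hand, f'(g(x))*g'(x) = cos(x²)*2x,
// kept only as a reference to compare against.
func HPrime(x float64) float64 { return math.Cos(x*x) * 2 * x }
```

Evaluating `H(Dual{Val: x, Dot: 1}).Dot` should agree with `HPrime(x)` up to floating-point rounding.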
Computational Interpretation
In an implementation, a dual number is usually represented as a pair:
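    type Dual struct {
        Val float64 // primal value
        Dot float64 // tangent (first-order coefficient)
    }

Here Val is the primal value, and Dot is the tangent.
Addition:

    // Add sums the values and the tangents componentwise.
    func Add(x, y Dual) Dual {
        return Dual{
            Val: x.Val + y.Val,
            Dot: x.Dot + y.Dot,
        }
    }

Multiplication:

    // Mul applies the product rule in the Dot field.
    func Mul(x, y Dual) Dual {
        return Dual{
            Val: x.Val * y.Val,
            Dot: x.Dot*y.Val + x.Val*y.Dot,
        }
    }

Sine, using math.Sin and math.Cos from Go's standard math package:

    // Sin pairs sin(a) with cos(a) times the incoming tangent.
    func Sin(x Dual) Dual {
        return Dual{
            Val: math.Sin(x.Val),
            Dot: math.Cos(x.Val) * x.Dot,
        }
    }

A function written against these operations computes derivatives automatically.
For example, the function below computes x^3 + 2x. It relies on a helper Const, assumed here to lift a real constant to a dual number with zero tangent:

    // Const lifts a real constant; constants have zero tangent.
    func Const(c float64) Dual {
        return Dual{Val: c}
    }

    // F computes x^3 + 2x using only dual-number operations.
    func F(x Dual) Dual {
        return Add(Mul(Mul(x, x), x), Mul(Const(2), x))
    }

With input

    x := Dual{Val: 5, Dot: 1}

the result is

    Dual{Val: 135, Dot: 77}

The same execution computes the primal value and the derivative.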
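The pattern of seeding the tangent with 1 and reading both components back can be wrapped in a small helper. The sketch below is one possible shape, not a prescribed API: `Derivative` and `Cubic` are hypothetical names, and `Const` is assumed to lift a constant with zero tangent.

```go
package main

// Dual is a value/tangent pair representing Val + Dot*ε.
type Dual struct {
	Val float64
	Dot float64
}

// Const lifts a real constant; constants have zero tangent.
func Const(c float64) Dual { return Dual{Val: c} }

// Add sums two dual numbers componentwise.
func Add(x, y Dual) Dual {
	return Dual{Val: x.Val + y.Val, Dot: x.Dot + y.Dot}
}

// Mul multiplies two dual numbers (product rule in the Dot field).
func Mul(x, y Dual) Dual {
	return Dual{Val: x.Val * y.Val, Dot: x.Dot*y.Val + x.Val*y.Dot}
}

// Derivative evaluates f once at x + 1*ε and returns f(x) and f'(x).
func Derivative(f func(Dual) Dual, x float64) (value, deriv float64) {
	y := f(Dual{Val: x, Dot: 1})
	return y.Val, y.Dot
}

// Cubic is f(x) = x³ + 2x written against the dual operations.
func Cubic(x Dual) Dual {
	return Add(Mul(Mul(x, x), x), Mul(Const(2), x))
}
```

Calling `Derivative(Cubic, 5)` returns 135 and 77, the same pair as in the example above, without the caller constructing a `Dual` by hand.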
Multiple Inputs
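For a function
$$f : \mathbb{R}^n \to \mathbb{R}^m,$$
a dual number can propagate one directional derivative at a time. Each input receives a primal value and a tangent seed.
For example, let
$$f(x, y) = xy + \sin x.$$
To compute the derivative in direction
$$(\dot{x}, \dot{y}),$$
evaluate
$$x + \dot{x}\varepsilon$$
and
$$y + \dot{y}\varepsilon.$$
Then
$$(x + \dot{x}\varepsilon)(y + \dot{y}\varepsilon) = xy + (\dot{x}y + x\dot{y})\varepsilon$$
and
$$\sin(x + \dot{x}\varepsilon) = \sin x + \dot{x}\cos x\,\varepsilon.$$
So
$$f(x + \dot{x}\varepsilon,\ y + \dot{y}\varepsilon) = xy + \sin x + (\dot{x}y + x\dot{y} + \dot{x}\cos x)\varepsilon.$$
The coefficient of $\varepsilon$ is
$$(y + \cos x)\dot{x} + x\dot{y}.$$
Equivalently,
$$\nabla f(x, y) \cdot (\dot{x}, \dot{y}) = \frac{\partial f}{\partial x}\dot{x} + \frac{\partial f}{\partial y}\dot{y}.$$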
Forward mode naturally computes Jacobian-vector products.
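The following Go sketch shows one directional derivative per evaluation, and a full gradient assembled from one pass per input. It uses $f(x, y) = xy + \sin x$ as an illustrative function. The names `Directional` and `Gradient` are hypothetical and chosen only for this example.

```go
package main

import "math"

// Dual is a value/tangent pair representing Val + Dot*ε.
type Dual struct {
	Val float64
	Dot float64
}

// Add sums two dual numbers componentwise.
func Add(x, y Dual) Dual {
	return Dual{Val: x.Val + y.Val, Dot: x.Dot + y.Dot}
}

// Mul multiplies two dual numbers (product rule in the Dot field).
func Mul(x, y Dual) Dual {
	return Dual{Val: x.Val * y.Val, Dot: x.Dot*y.Val + x.Val*y.Dot}
}

// Sin applies sine to a dual number.
func Sin(x Dual) Dual {
	return Dual{Val: math.Sin(x.Val), Dot: math.Cos(x.Val) * x.Dot}
}

// F is the illustrative function f(x, y) = x*y + sin(x).
func F(x, y Dual) Dual { return Add(Mul(x, y), Sin(x)) }

// Directional returns the derivative of F at (x, y) in direction (dx, dy).
// Each input carries its own tangent seed.
func Directional(x, y, dx, dy float64) float64 {
	return F(Dual{Val: x, Dot: dx}, Dual{Val: y, Dot: dy}).Dot
}

// Gradient recovers both partial derivatives with one pass per input,
// seeding the unit directions (1, 0) and (0, 1).
func Gradient(x, y float64) (dfdx, dfdy float64) {
	return Directional(x, y, 1, 0), Directional(x, y, 0, 1)
}
```

At the point (0, 3), `Gradient` returns (4, 0), and `Directional(0, 3, 2, 5)` returns 8, which is the gradient dotted with (2, 5). A function with n inputs needs n such passes to fill in a full gradient, which is why forward mode suits problems with few inputs.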
Relation to Forward Mode AD
Forward mode AD is dual-number evaluation generalized to programs.
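Each program variable carries two components:
$$v \mapsto (v, \dot{v}).$$
Each primitive instruction updates both components.
For a program statement
$$z = x \cdot y$$
the lifted statement is
$$(z, \dot{z}) = (xy,\ \dot{x}y + x\dot{y}).$$
For
$$z = \sin(x)$$
the lifted statement is
$$(z, \dot{z}) = (\sin x,\ \cos(x)\,\dot{x}).$$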
This local transformation is enough. The global derivative emerges from executing the transformed program.
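To illustrate that the transformation is purely local, here is a sketch of a loop lifted statement by statement. It evaluates a polynomial by Horner's rule, an example chosen for illustration rather than taken from the text. The function name `Horner` and the highest-degree-first coefficient order are assumptions. The control flow is unchanged. Only the loop body gains a tangent update.

```go
package main

// Dual is a value/tangent pair representing Val + Dot*ε.
type Dual struct {
	Val float64
	Dot float64
}

// Horner evaluates a polynomial at the dual number x.
// coeffs lists the coefficients from the highest degree down.
// The real-valued statement "acc = acc*x + c" is lifted to update
// both the value and the tangent on every iteration.
func Horner(coeffs []float64, x Dual) Dual {
	acc := Dual{}
	for _, c := range coeffs {
		acc = Dual{
			// Primal statement: acc*x + c.
			Val: acc.Val*x.Val + c,
			// Tangent statement: product rule for acc*x; c is constant.
			Dot: acc.Dot*x.Val + acc.Val*x.Dot,
		}
	}
	return acc
}
```

With coefficients `[]float64{1, 0, 2, 0}` (that is, $x^3 + 2x$) and input `Dual{Val: 5, Dot: 1}`, the loop ends with a value of 135 and a tangent of 77.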
Algebraic Summary
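The dual numbers form a commutative algebra over the real numbers:
$$\mathbb{D} = \mathbb{R}[\varepsilon]/(\varepsilon^2).$$
This notation means: take polynomials in $\varepsilon$, but identify every term containing $\varepsilon^2$ or higher powers with zero.
Every dual number has a unique form:
$$a + b\varepsilon, \qquad a, b \in \mathbb{R}.$$
The real part $a$ is the value. The dual part $b$ is the first-order coefficient.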
This small algebra is powerful because it encodes first-order differential calculus directly into arithmetic. In forward mode AD, differentiation becomes evaluation in the algebra of dual numbers.