Truncated Polynomial Algebras

Dual numbers capture first-order derivatives because the infinitesimal element satisfies

\varepsilon^2 = 0.

To compute higher-order derivatives, we extend this idea by allowing higher powers to survive temporarily before truncation.

The resulting structures are called truncated polynomial algebras.

They generalize dual numbers from first-order infinitesimals to finite-order infinitesimal expansions.

From Dual Numbers to Higher Orders

Recall the dual-number algebra:

\mathbb{R}[\varepsilon]/(\varepsilon^2).

Every element has the form

a_0 + a_1\varepsilon.

The condition

\varepsilon^2 = 0

removes all second-order and higher terms.

Now instead consider:

\mathbb{R}[\varepsilon]/(\varepsilon^{k+1}).

Here powers up to $\varepsilon^k$ survive:

a_0 + a_1\varepsilon + a_2\varepsilon^2 + \cdots + a_k\varepsilon^k.

Only terms of degree $k+1$ and higher vanish.

This algebra stores derivatives up to order $k$ .

Example: Second-Order Algebra

The simplest nontrivial extension is:

\mathbb{R}[\varepsilon]/(\varepsilon^3).

Elements have the form:

a + b\varepsilon + c\varepsilon^2.

Now:

\varepsilon^3 = 0,

but:

\varepsilon^2 \neq 0.

Multiplication works by ordinary polynomial multiplication followed by truncation.

For example:

(a+b\varepsilon+c\varepsilon^2) (d+e\varepsilon+f\varepsilon^2)

expands to:

ad + (ae+bd)\varepsilon + (af+be+cd)\varepsilon^2 + \text{higher-order terms}.

All powers containing $\varepsilon^3$ or above vanish.

Taylor Expansion Interpretation

The key connection comes from Taylor expansion.

For a smooth function:

f(x+h) = f(x) + f'(x)h + \frac12 f''(x)h^2 + \cdots.

Substitute a truncated infinitesimal:

h = \varepsilon, \quad \varepsilon^{k+1}=0.

Then:

f(x+\varepsilon) = f(x) + f'(x)\varepsilon + \frac12 f''(x)\varepsilon^2 + \cdots + \frac1{k!}f^{(k)}(x)\varepsilon^k.

The series terminates exactly.

Higher-order derivatives become coefficients in the algebra.

Second Derivative Example

Let:

f(x)=x^3.

Use the algebra:

\varepsilon^3=0.

Evaluate:

(x+\varepsilon)^3.

Expanding:

= x^3 + 3x^2\varepsilon + 3x\varepsilon^2 + \varepsilon^3.

Since:

\varepsilon^3=0,

we obtain:

x^3 + 3x^2\varepsilon + 3x\varepsilon^2.

Compare with Taylor expansion:

f(x+\varepsilon) = f(x) + f'(x)\varepsilon + \frac12 f''(x)\varepsilon^2.

Since:

f'(x)=3x^2,

and

f''(x)=6x,

we get:

\frac12 f''(x)=3x.

The coefficients match exactly.

Structure of the Algebra

The truncated polynomial algebra

\mathbb{R}[\varepsilon]/(\varepsilon^{k+1})

is:

commutative
finite-dimensional
associative
local
nilpotent

Its basis is:

1,\varepsilon,\varepsilon^2,\ldots,\varepsilon^k.

Dimension is:

k+1.

Every element is represented uniquely as:

\sum_{i=0}^{k} a_i\varepsilon^i.

Arithmetic remains finite because powers terminate.

First-Order Versus Higher-Order AD

Dual numbers compute first derivatives:

f(x+\varepsilon) = f(x)+f'(x)\varepsilon.

Truncated polynomial algebras compute finite Taylor expansions:

f(x+\varepsilon) = \sum_{i=0}^{k} \frac{f^{(i)}(x)}{i!}\varepsilon^i.

Thus:

Algebra	Information Stored
$\varepsilon^2=0$	First derivative
$\varepsilon^3=0$	Second derivative
$\varepsilon^4=0$	Third derivative
$\varepsilon^{k+1}=0$	$k$ -th derivatives

Forward-mode higher-order AD is therefore polynomial propagation over truncated algebras.

Polynomial Propagation Through Programs

Suppose a program computes:

f(x)=\sin(x^2).

Represent input as:

x = a+\varepsilon.

Each intermediate variable becomes a truncated polynomial.

First compute:

x^2 = a^2 + 2a\varepsilon + \varepsilon^2.

Then evaluate sine:

\sin(a^2 + 2a\varepsilon + \varepsilon^2).

Taylor expansion automatically propagates derivative information through every intermediate computation.

Programs become finite Taylor-series transformers.

Jet Spaces

In differential geometry, truncated polynomial algebras correspond to jet spaces.

A $k$ -jet stores derivatives up to order $k$ .

Two smooth functions are equivalent at a point if their derivatives agree through order $k$ .

The truncated algebra captures exactly this finite local behavior.

For example:

dual numbers correspond to first jets
cubic truncations correspond to second jets
higher truncations correspond to higher jets

Automatic differentiation over truncated algebras therefore computes finite jets of programs.

Multiple Variables

For multivariate functions, introduce several infinitesimal generators:

\varepsilon_1,\varepsilon_2,\ldots,\varepsilon_n.

A second-order algebra may satisfy:

\varepsilon_i^3=0

or:

\varepsilon_i^2=0

while mixed products survive.

A general element becomes:

a + \sum_i b_i\varepsilon_i + \sum_{i,j} c_{ij}\varepsilon_i\varepsilon_j.

The coefficients encode:

first derivatives
mixed partial derivatives
Hessian structure

This forms the basis of multivariate higher-order AD.

Example: Mixed Partial Derivatives

Let:

f(x,y)=xy+\sin(xy).

Represent:

x = a+\varepsilon_1

y = b+\varepsilon_2.

Then:

xy = ab + b\varepsilon_1 + a\varepsilon_2 + \varepsilon_1\varepsilon_2.

The mixed product term survives.

Expanding the full function reveals coefficients involving:

\varepsilon_1\varepsilon_2.

Those coefficients encode:

\frac{\partial^2 f}{\partial x \partial y}.

Higher-order structure emerges directly from algebraic multiplication.

Computational Complexity

Higher-order propagation becomes expensive because coefficient counts grow rapidly.

For order $k$ in $n$ variables, the number of monomials is:

\binom{n+k}{k}.

Growth is combinatorial.

For example:

Variables	Order	Number of Terms
1	2	3
2	2	6
5	3	56
10	4	1001

This combinatorial explosion is one of the main challenges in higher-order automatic differentiation.

Sparse and Structured Representations

Practical systems rarely store full dense polynomial expansions.

Instead they exploit:

sparsity
symmetry
directional derivatives
compressed Hessians
low-rank structure

Methods include:

truncated Taylor mode
hyper-dual numbers
Hessian-vector products
checkpointed higher-order reverse mode

The algebraic viewpoint remains the same even when representations become optimized.

Relation to Formal Power Series

Truncated polynomial algebras are finite approximations to formal power series.

A formal power series has infinite expansion:

\sum_{i=0}^{\infty} a_i\varepsilon^i.

Truncated algebras cut this off at finite order:

\sum_{i=0}^{k} a_i\varepsilon^i.

Automatic differentiation computes finite local expansions rather than full analytic series.

This finite truncation keeps computation practical.

Program Semantics View

Ordinary execution interprets variables as real numbers:

x \in \mathbb{R}.

First-order forward AD interprets them as dual numbers:

x \in \mathbb{R}[\varepsilon]/(\varepsilon^2).

Higher-order AD interprets them as truncated polynomials:

x \in \mathbb{R}[\varepsilon]/(\varepsilon^{k+1}).

The program structure remains unchanged.

Only the underlying algebra changes.

This is one of the deepest ideas in automatic differentiation: differentiation can be implemented by changing the algebra of evaluation.

Algebraic Summary

Truncated polynomial algebras generalize dual numbers by preserving higher-order infinitesimal terms.

The algebra:

\mathbb{R}[\varepsilon]/(\varepsilon^{k+1})

stores finite Taylor expansions up to order $k$ .

Evaluating a function in this algebra produces:

f(x+\varepsilon) = \sum_{i=0}^{k} \frac{f^{(i)}(x)}{i!}\varepsilon^i.

Thus:

coefficients become derivatives
multiplication propagates chain rules
higher-order interactions emerge naturally
differentiation becomes algebraic evaluation

Higher-order automatic differentiation is therefore computation over truncated polynomial algebras.