# Hyper-Dual Numbers

## Hyper-Dual Numbers

Dual numbers compute first derivatives exactly. Truncated polynomial algebras extend this to higher-order derivatives, but practical higher-order differentiation introduces an important problem: extracting second derivatives accurately without symbolic expansion or numerical cancellation.

Hyper-dual numbers solve this problem by introducing multiple nilpotent infinitesimal directions whose mixed products survive.

They provide an exact algebraic mechanism for computing:

- second derivatives
- mixed partial derivatives
- Hessians

without finite differences and without truncation error.

### Motivation

Ordinary dual numbers satisfy:

$$
\varepsilon^2 = 0.
$$

Evaluating

$$
f(x+\varepsilon)
$$

produces:

$$
f(x)+f'(x)\varepsilon.
$$

Only first-order information survives.

To recover second derivatives, one can use nested dual numbers or truncated polynomial algebras. However, those approaches may:

- increase implementation complexity
- require managing higher polynomial coefficients
- introduce perturbation confusion in nested systems

Hyper-dual numbers provide a cleaner construction for exact second-order differentiation.

### The Hyper-Dual Algebra

Introduce two independent infinitesimal generators:

$$
\varepsilon_1,\varepsilon_2.
$$

Require:

$$
\varepsilon_1^2 = 0
$$

$$
\varepsilon_2^2 = 0.
$$

But preserve the mixed product:

$$
\varepsilon_1\varepsilon_2 \neq 0.
$$

Because the generators commute and each squares to zero, it follows that:

$$
(\varepsilon_1\varepsilon_2)^2 = 0.
$$

A hyper-dual number has the form:

$$
a
+
b\varepsilon_1
+
c\varepsilon_2
+
d\varepsilon_1\varepsilon_2.
$$

This algebra stores:

| Component | Meaning |
|---|---|
| $a$ | primal value |
| $b$ | first derivative in direction 1 |
| $c$ | first derivative in direction 2 |
| $d$ | mixed second derivative |

### Why Mixed Products Matter

The key idea is that:

$$
(\varepsilon_1+\varepsilon_2)^2 =
2\varepsilon_1\varepsilon_2.
$$

The square does not vanish completely because cross terms survive.

This allows second-order information to appear algebraically.
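As a quick numerical check of this identity, the following Go sketch squares $\varepsilon_1+\varepsilon_2$. It assumes the four-component layout and product rule used in this chapter; the `main` function and variable names are illustrative only.

```go
package main

import "fmt"

// HyperDual holds Val + D1*e1 + D2*e2 + D12*e1*e2.
type HyperDual struct {
    Val, D1, D2, D12 float64
}

// Mul multiplies two hyper-dual numbers using e1^2 = e2^2 = 0.
func Mul(x, y HyperDual) HyperDual {
    return HyperDual{
        Val: x.Val * y.Val,
        D1:  x.D1*y.Val + x.Val*y.D1,
        D2:  x.D2*y.Val + x.Val*y.D2,
        D12: x.D12*y.Val + x.D1*y.D2 + x.D2*y.D1 + x.Val*y.D12,
    }
}

func main() {
    e1 := HyperDual{D1: 1}         // e1 alone
    sum := HyperDual{D1: 1, D2: 1} // e1 + e2

    fmt.Printf("%+v\n", Mul(e1, e1))   // {Val:0 D1:0 D2:0 D12:0}
    fmt.Printf("%+v\n", Mul(sum, sum)) // {Val:0 D1:0 D2:0 D12:2}
}
```

A single generator squares to zero, while the sum of the two generators leaves the cross term with coefficient 2.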

### Taylor Expansion

For a smooth scalar function:

$$
f(x+h),
$$

the Taylor expansion about $x$ is:

$$
f(x+h) =
f(x)
+
f'(x)h
+
\frac12 f''(x)h^2
+
\frac16 f'''(x)h^3
+
\cdots
$$

Now substitute:

$$
h = a\varepsilon_1 + b\varepsilon_2.
$$

Since:

$$
\varepsilon_1^2 = \varepsilon_2^2 = 0,
$$

the square becomes:

$$
h^2 =
2ab\varepsilon_1\varepsilon_2,
$$

and $h^3$ and all higher powers vanish, so the series terminates after the quadratic term.

Thus:

$$
f(x+h) =
f(x)
+
f'(x)(a\varepsilon_1+b\varepsilon_2)
+
f''(x)ab\varepsilon_1\varepsilon_2.
$$

The coefficient of:

$$
\varepsilon_1\varepsilon_2
$$

is $ab\,f''(x)$. Seeding $a=b=1$ makes it exactly the second derivative.
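The same substitution applied to a general argument $a+b\varepsilon_1+c\varepsilon_2+d\varepsilon_1\varepsilon_2$ gives the mixed coefficient $f'(a)\,d + f''(a)\,bc$. The sketch below turns that into a reusable helper for elementary functions. The names `Lift` and `Sin` are assumptions made for illustration, not part of any particular library.

```go
package main

import (
    "fmt"
    "math"
)

// HyperDual holds Val + D1*e1 + D2*e2 + D12*e1*e2.
type HyperDual struct {
    Val, D1, D2, D12 float64
}

// Lift applies a scalar function to x, given the function value f,
// first derivative df and second derivative d2f, all evaluated at x.Val.
// It is the terminated Taylor expansion written out per component.
func Lift(x HyperDual, f, df, d2f float64) HyperDual {
    return HyperDual{
        Val: f,
        D1:  df * x.D1,
        D2:  df * x.D2,
        D12: df*x.D12 + d2f*x.D1*x.D2,
    }
}

// Sin is one instance of the pattern: f = sin, f' = cos, f'' = -sin.
func Sin(x HyperDual) HyperDual {
    s, c := math.Sincos(x.Val)
    return Lift(x, s, c, -s)
}

func main() {
    // Seed h = a*e1 + b*e2 with a = 2, b = 3 at x = 0.5.
    x := HyperDual{Val: 0.5, D1: 2, D2: 3}
    y := Sin(x)

    // The mixed coefficient should agree with f''(x)*a*b up to rounding.
    fmt.Println(y.D12, -math.Sin(0.5)*2*3)
}
```

Other elementary functions (exponential, logarithm, reciprocal) would follow the same shape by supplying their own value and first two derivatives.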

### Example

Let:

$$
f(x)=x^3.
$$

Use the hyper-dual input:

$$
x+\varepsilon_1+\varepsilon_2.
$$

Expand:

$$
(x+\varepsilon_1+\varepsilon_2)^3.
$$

First compute:

$$
(x+h)^3 =
x^3 + 3x^2h + 3xh^2 + h^3.
$$

Since:

$$
h=\varepsilon_1+\varepsilon_2,
$$

and:

$$
h^2 = 2\varepsilon_1\varepsilon_2,
$$

while:

$$
h^3=0,
$$

we obtain:

$$
x^3
+
3x^2(\varepsilon_1+\varepsilon_2)
+
6x\varepsilon_1\varepsilon_2.
$$

Thus:

| Coefficient | Value |
|---|---|
| $1$ | $x^3$ |
| $\varepsilon_1$ | $3x^2$ |
| $\varepsilon_2$ | $3x^2$ |
| $\varepsilon_1\varepsilon_2$ | $6x$ |

Since:

$$
f''(x)=6x,
$$

the mixed coefficient gives the exact second derivative.
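The expansion above can be reproduced numerically. The following sketch evaluates $x^3$ at $x=2$ with the seed $2+\varepsilon_1+\varepsilon_2$, using only the hyper-dual product; the helper name `Cube` is an illustrative assumption.

```go
package main

import "fmt"

// HyperDual holds Val + D1*e1 + D2*e2 + D12*e1*e2.
type HyperDual struct {
    Val, D1, D2, D12 float64
}

// Mul multiplies two hyper-dual numbers using e1^2 = e2^2 = 0.
func Mul(x, y HyperDual) HyperDual {
    return HyperDual{
        Val: x.Val * y.Val,
        D1:  x.D1*y.Val + x.Val*y.D1,
        D2:  x.D2*y.Val + x.Val*y.D2,
        D12: x.D12*y.Val + x.D1*y.D2 + x.D2*y.D1 + x.Val*y.D12,
    }
}

// Cube computes x^3 by repeated multiplication.
func Cube(x HyperDual) HyperDual {
    return Mul(x, Mul(x, x))
}

func main() {
    x := HyperDual{Val: 2, D1: 1, D2: 1} // 2 + e1 + e2
    fmt.Printf("%+v\n", Cube(x))
    // prints {Val:8 D1:12 D2:12 D12:12}
}
```

At $x=2$ the components match the table: $x^3=8$, $3x^2=12$ in both first-order slots, and $6x=12$ in the mixed slot.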

### Multivariable Functions

Hyper-dual numbers naturally extend to multivariate functions.

Suppose:

$$
f : \mathbb{R}^n \to \mathbb{R}.
$$

Choose two perturbation directions:

$$
u,v \in \mathbb{R}^n.
$$

Evaluate:

$$
x + u\varepsilon_1 + v\varepsilon_2.
$$

Then:

$$
f(x+u\varepsilon_1+v\varepsilon_2)
$$

expands to:

$$
f(x)
+
Df_x(u)\varepsilon_1
+
Df_x(v)\varepsilon_2
+
u^T H_x v \,
\varepsilon_1\varepsilon_2.
$$

The mixed coefficient gives the Hessian bilinear form:

$$
u^T H_x v.
$$

This computes exact second-order directional derivatives.
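A minimal sketch of this evaluation follows, assuming the test function $f(x_0,x_1)=x_0^2x_1$ and arbitrarily chosen directions. The helper `Seed` and the function `f` are hypothetical names introduced for the example.

```go
package main

import "fmt"

// HyperDual holds Val + D1*e1 + D2*e2 + D12*e1*e2.
type HyperDual struct {
    Val, D1, D2, D12 float64
}

// Mul multiplies two hyper-dual numbers using e1^2 = e2^2 = 0.
func Mul(x, y HyperDual) HyperDual {
    return HyperDual{
        Val: x.Val * y.Val,
        D1:  x.D1*y.Val + x.Val*y.D1,
        D2:  x.D2*y.Val + x.Val*y.D2,
        D12: x.D12*y.Val + x.D1*y.D2 + x.D2*y.D1 + x.Val*y.D12,
    }
}

// Seed builds x + u*e1 + v*e2 component by component.
func Seed(x, u, v []float64) []HyperDual {
    out := make([]HyperDual, len(x))
    for i := range x {
        out[i] = HyperDual{Val: x[i], D1: u[i], D2: v[i]}
    }
    return out
}

// f is the example function f(x0, x1) = x0^2 * x1.
func f(x []HyperDual) HyperDual {
    return Mul(Mul(x[0], x[0]), x[1])
}

func main() {
    x := []float64{3, 5}
    u := []float64{1, 2}
    v := []float64{4, -1}

    r := f(Seed(x, u, v))
    fmt.Printf("%+v\n", r)
    // prints {Val:45 D1:48 D2:111 D12:82}
}
```

For this function the Hessian at $(3,5)$ is $\begin{pmatrix}10 & 6\\ 6 & 0\end{pmatrix}$, so $u^THv=82$, and the two first-order slots hold $\nabla f\cdot u=48$ and $\nabla f\cdot v=111$.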

### Hessian Extraction

To compute a Hessian entry:

$$
\frac{\partial^2 f}{\partial x_i \partial x_j},
$$

seed:

$$
u=e_i,
\quad
v=e_j.
$$

Then the coefficient of:

$$
\varepsilon_1\varepsilon_2
$$

is exactly:

$$
H_{ij}.
$$

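Repeating this for each pair $(i,j)$ recovers the full Hessian matrix. Because the Hessian of a twice continuously differentiable function is symmetric, $n(n+1)/2$ evaluations suffice.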
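The seeding procedure can be sketched as a loop over index pairs. This is one possible arrangement, not a tuned implementation: the function name `Hessian`, the test function, and the choice to fill only the upper triangle and mirror it (which assumes a symmetric Hessian) are all assumptions of the example.

```go
package main

import "fmt"

// HyperDual holds Val + D1*e1 + D2*e2 + D12*e1*e2.
type HyperDual struct {
    Val, D1, D2, D12 float64
}

// Add is componentwise addition.
func Add(x, y HyperDual) HyperDual {
    return HyperDual{x.Val + y.Val, x.D1 + y.D1, x.D2 + y.D2, x.D12 + y.D12}
}

// Mul multiplies two hyper-dual numbers using e1^2 = e2^2 = 0.
func Mul(x, y HyperDual) HyperDual {
    return HyperDual{
        Val: x.Val * y.Val,
        D1:  x.D1*y.Val + x.Val*y.D1,
        D2:  x.D2*y.Val + x.Val*y.D2,
        D12: x.D12*y.Val + x.D1*y.D2 + x.D2*y.D1 + x.Val*y.D12,
    }
}

// Hessian seeds u = e_i and v = e_j for every pair i <= j and reads D12.
func Hessian(f func([]HyperDual) HyperDual, x []float64) [][]float64 {
    n := len(x)
    h := make([][]float64, n)
    for i := range h {
        h[i] = make([]float64, n)
    }
    for i := 0; i < n; i++ {
        for j := i; j < n; j++ {
            in := make([]HyperDual, n)
            for k := range x {
                in[k] = HyperDual{Val: x[k]}
            }
            in[i].D1 = 1 // direction 1 is e_i
            in[j].D2 = 1 // direction 2 is e_j
            h[i][j] = f(in).D12
            h[j][i] = h[i][j] // mirror, assuming symmetry
        }
    }
    return h
}

// g is the example function g(x0, x1) = x0^2*x1 + x1^3.
func g(x []HyperDual) HyperDual {
    return Add(Mul(Mul(x[0], x[0]), x[1]), Mul(x[1], Mul(x[1], x[1])))
}

func main() {
    fmt.Println(Hessian(g, []float64{3, 5}))
    // prints [[10 6] [6 30]]
}
```

The result agrees with the analytic Hessian $\begin{pmatrix}2x_1 & 2x_0\\ 2x_0 & 6x_1\end{pmatrix}$ at $(3,5)$.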

### Example: Two Variables

Let:

$$
f(x,y)=x^2y+\sin(xy).
$$

Choose perturbations:

$$
x \mapsto x+\varepsilon_1
$$

$$
y \mapsto y+\varepsilon_2.
$$

Then the product $xy$ becomes:

$$
(x+\varepsilon_1)(y+\varepsilon_2) =
xy + y\varepsilon_1 + x\varepsilon_2 + \varepsilon_1\varepsilon_2.
$$

Mixed terms appear automatically.

Expanding the entire function and collecting the coefficient of:

$$
\varepsilon_1\varepsilon_2
$$

yields:

$$
\frac{\partial^2 f}{\partial x\partial y}
= 2x + \cos(xy) - xy\sin(xy).
$$

The evaluation produces this value numerically; no symbolic differentiation is needed.
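To make the example concrete, the sketch below evaluates $f(x,y)=x^2y+\sin(xy)$ on hyper-dual inputs and compares the mixed slot with a hand-derived formula for $\partial^2 f/\partial x\,\partial y$. The helpers `Add`, `Sin`, `F`, and `MixedAnalytic`, and the evaluation point, are assumptions chosen for illustration.

```go
package main

import (
    "fmt"
    "math"
)

// HyperDual holds Val + D1*e1 + D2*e2 + D12*e1*e2.
type HyperDual struct {
    Val, D1, D2, D12 float64
}

// Add is componentwise addition.
func Add(x, y HyperDual) HyperDual {
    return HyperDual{x.Val + y.Val, x.D1 + y.D1, x.D2 + y.D2, x.D12 + y.D12}
}

// Mul multiplies two hyper-dual numbers using e1^2 = e2^2 = 0.
func Mul(x, y HyperDual) HyperDual {
    return HyperDual{
        Val: x.Val * y.Val,
        D1:  x.D1*y.Val + x.Val*y.D1,
        D2:  x.D2*y.Val + x.Val*y.D2,
        D12: x.D12*y.Val + x.D1*y.D2 + x.D2*y.D1 + x.Val*y.D12,
    }
}

// Sin applies the chain rule with f' = cos and f'' = -sin.
func Sin(x HyperDual) HyperDual {
    s, c := math.Sincos(x.Val)
    return HyperDual{
        Val: s,
        D1:  c * x.D1,
        D2:  c * x.D2,
        D12: c*x.D12 - s*x.D1*x.D2,
    }
}

// F is f(x, y) = x^2*y + sin(x*y) on hyper-dual inputs.
func F(x, y HyperDual) HyperDual {
    return Add(Mul(Mul(x, x), y), Sin(Mul(x, y)))
}

// MixedAnalytic is the hand-derived mixed partial:
// 2x + cos(xy) - xy*sin(xy).
func MixedAnalytic(x, y float64) float64 {
    return 2*x + math.Cos(x*y) - x*y*math.Sin(x*y)
}

func main() {
    x, y := 1.5, 0.7
    r := F(HyperDual{Val: x, D1: 1}, HyperDual{Val: y, D2: 1})

    // The two numbers should agree up to floating-point rounding.
    fmt.Println(r.D12, MixedAnalytic(x, y))
}
```

Seeding `D1` on `x` and `D2` on `y` selects the pair $(u,v)=(e_x,e_y)$, so `D12` holds the mixed partial.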

### Exactness

Hyper-dual differentiation is exact up to floating-point arithmetic.

The common alternatives have different error sources:

| Method | Main error source or limitation |
|---|---|
| Finite differences | truncation + cancellation |
| Symbolic differentiation | expression explosion (a cost rather than an error) |
| Hyper-dual numbers | floating-point rounding only |

No step size is required.

The differentiation itself introduces no subtractive cancellation, because no difference quotient is formed.

The derivative structure emerges algebraically.
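The contrast with finite differences can be observed directly. The sketch below compares the hyper-dual second derivative of $x^3$ with a central second difference at several step sizes. The function names are illustrative assumptions. For a cubic the central second difference has no truncation error in exact arithmetic, so any deviation printed for small steps comes from rounding and cancellation alone.

```go
package main

import "fmt"

// HyperDual holds Val + D1*e1 + D2*e2 + D12*e1*e2.
type HyperDual struct {
    Val, D1, D2, D12 float64
}

// Mul multiplies two hyper-dual numbers using e1^2 = e2^2 = 0.
func Mul(x, y HyperDual) HyperDual {
    return HyperDual{
        Val: x.Val * y.Val,
        D1:  x.D1*y.Val + x.Val*y.D1,
        D2:  x.D2*y.Val + x.Val*y.D2,
        D12: x.D12*y.Val + x.D1*y.D2 + x.D2*y.D1 + x.Val*y.D12,
    }
}

// cubeHD and cube are the same function on hyper-dual and float64 inputs.
func cubeHD(x HyperDual) HyperDual { return Mul(x, Mul(x, x)) }
func cube(x float64) float64       { return x * x * x }

// SecondDiff is the central second difference
// (f(x+h) - 2f(x) + f(x-h)) / h^2.
func SecondDiff(f func(float64) float64, x, h float64) float64 {
    return (f(x+h) - 2*f(x) + f(x-h)) / (h * h)
}

func main() {
    x := 2.0
    hd := cubeHD(HyperDual{Val: x, D1: 1, D2: 1}).D12
    fmt.Println("hyper-dual:", hd) // f''(2) = 12

    for _, h := range []float64{1e-2, 1e-4, 1e-6, 1e-8} {
        fmt.Printf("h=%g finite difference: %v\n", h, SecondDiff(cube, x, h))
    }
}
```

The hyper-dual value does not depend on any step parameter, whereas the finite-difference estimate degrades as `h` shrinks and the numerator loses significant digits.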

### Algebraic Structure

The hyper-dual algebra can be written:

$$
\mathbb{R}[\varepsilon_1,\varepsilon_2]
/
(\varepsilon_1^2,\varepsilon_2^2).
$$

Basis elements are:

$$
1,
\varepsilon_1,
\varepsilon_2,
\varepsilon_1\varepsilon_2.
$$

Dimension is four.

Multiplication rules:

| Product | Result |
|---|---|
| $\varepsilon_1^2$ | $0$ |
| $\varepsilon_2^2$ | $0$ |
| $\varepsilon_1\varepsilon_2 = \varepsilon_2\varepsilon_1$ | nonzero (the fourth basis element) |
| $(\varepsilon_1\varepsilon_2)^2$ | $0$ |

This carefully chosen nilpotent structure isolates second-order interactions.

### Computational Interpretation

A hyper-dual number may be represented as:

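```go
// HyperDual represents Val + D1*e1 + D2*e2 + D12*e1*e2.
type HyperDual struct {
    Val float64 // primal value
    D1  float64 // coefficient of e1
    D2  float64 // coefficient of e2
    D12 float64 // coefficient of e1*e2
}
```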

Components represent:

| Field | Meaning |
|---|---|
| `Val` | primal value |
| `D1` | first derivative along direction 1 |
| `D2` | first derivative along direction 2 |
| `D12` | mixed second derivative |

### Multiplication Rule

Suppose:

$$
x=(a,b,c,d)
$$

and

$$
y=(p,q,r,s).
$$

Then multiplication becomes:

$$
xy=
(
ap,
aq+bp,
ar+cp,
as+br+cq+dp
).
$$

The mixed term obeys the second-order product rule automatically.

### Example Implementation

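```go
// Mul returns the product x*y in the hyper-dual algebra,
// where e1^2 = e2^2 = 0 and the mixed product e1*e2 is kept.
func Mul(x, y HyperDual) HyperDual {
    return HyperDual{
        Val: x.Val * y.Val,
        // First-order parts follow the ordinary product rule.
        D1: x.D1*y.Val + x.Val*y.D1,
        D2: x.D2*y.Val + x.Val*y.D2,
        // Mixed part: second-order product rule with both cross terms.
        D12: x.D12*y.Val + x.D1*y.D2 + x.D2*y.D1 + x.Val*y.D12,
    }
}
```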

The `D12` component contains all mixed second-order interactions.
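Multiplication alone is not enough to evaluate most programs. The sketch below adds componentwise addition and a reciprocal obtained from the terminated Taylor expansion of $1/x$ (with $f'=-1/a^2$ and $f''=2/a^3$). The names `Add`, `Recip`, and `Div` are assumptions for illustration.

```go
package main

import "fmt"

// HyperDual holds Val + D1*e1 + D2*e2 + D12*e1*e2.
type HyperDual struct {
    Val, D1, D2, D12 float64
}

// Add is componentwise because the algebra is a vector space.
func Add(x, y HyperDual) HyperDual {
    return HyperDual{x.Val + y.Val, x.D1 + y.D1, x.D2 + y.D2, x.D12 + y.D12}
}

// Mul multiplies two hyper-dual numbers using e1^2 = e2^2 = 0.
func Mul(x, y HyperDual) HyperDual {
    return HyperDual{
        Val: x.Val * y.Val,
        D1:  x.D1*y.Val + x.Val*y.D1,
        D2:  x.D2*y.Val + x.Val*y.D2,
        D12: x.D12*y.Val + x.D1*y.D2 + x.D2*y.D1 + x.Val*y.D12,
    }
}

// Recip returns 1/x. It is only defined when x.Val is nonzero.
func Recip(x HyperDual) HyperDual {
    inv := 1 / x.Val
    return HyperDual{
        Val: inv,
        D1:  -x.D1 * inv * inv,
        D2:  -x.D2 * inv * inv,
        D12: -x.D12*inv*inv + 2*x.D1*x.D2*inv*inv*inv,
    }
}

// Div computes x/y as x * (1/y).
func Div(x, y HyperDual) HyperDual {
    return Mul(x, Recip(y))
}

func main() {
    x := HyperDual{Val: 2, D1: 1, D2: 1}
    fmt.Printf("%+v\n", Add(x, x)) // {Val:4 D1:2 D2:2 D12:0}
    fmt.Printf("%+v\n", Div(x, x)) // {Val:1 D1:0 D2:0 D12:0}
}
```

Dividing a value by itself returns the constant 1 with all derivative slots zero, which is a convenient sanity check for the reciprocal rule.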

### Relation to Hessian-Vector Products

Hyper-dual numbers compute second-order directional derivatives naturally.

To obtain the scalar:

$$
u^T H v,
$$

evaluate $f$ at:

$$
x + u\varepsilon_1 + v\varepsilon_2.
$$

The coefficient of:

$$
\varepsilon_1\varepsilon_2
$$

is the result.

This avoids explicit Hessian construction. A single pass yields one scalar; the full vector $Hv$ follows from $n$ passes with $u=e_i$ for $i=1,\dots,n$ and $v$ held fixed.

For large systems, Hessian-vector products are often preferable to dense Hessians.
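One way to assemble a Hessian-vector product from hyper-dual passes is to fix $v$ in the second direction and sweep the first direction over the coordinate vectors. The sketch below assumes that arrangement; `HessVec` and the test function `f` are hypothetical names.

```go
package main

import "fmt"

// HyperDual holds Val + D1*e1 + D2*e2 + D12*e1*e2.
type HyperDual struct {
    Val, D1, D2, D12 float64
}

// Mul multiplies two hyper-dual numbers using e1^2 = e2^2 = 0.
func Mul(x, y HyperDual) HyperDual {
    return HyperDual{
        Val: x.Val * y.Val,
        D1:  x.D1*y.Val + x.Val*y.D1,
        D2:  x.D2*y.Val + x.Val*y.D2,
        D12: x.D12*y.Val + x.D1*y.D2 + x.D2*y.D1 + x.Val*y.D12,
    }
}

// HessVec returns H*v using one hyper-dual pass per component:
// pass i seeds u = e_i and keeps v fixed, so D12 = (H*v)_i.
func HessVec(f func([]HyperDual) HyperDual, x, v []float64) []float64 {
    n := len(x)
    out := make([]float64, n)
    for i := 0; i < n; i++ {
        in := make([]HyperDual, n)
        for k := range x {
            in[k] = HyperDual{Val: x[k], D2: v[k]}
        }
        in[i].D1 = 1
        out[i] = f(in).D12
    }
    return out
}

// f is the example function f(x0, x1) = x0^2 * x1.
func f(x []HyperDual) HyperDual {
    return Mul(Mul(x[0], x[0]), x[1])
}

func main() {
    fmt.Println(HessVec(f, []float64{3, 5}, []float64{4, -1}))
    // prints [34 24]
}
```

This needs $n$ evaluations of $f$ and never stores the $n\times n$ matrix.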

### Perturbation Confusion

Nested dual-number systems may accidentally mix perturbation symbols.

Hyper-dual numbers avoid this by explicitly separating infinitesimal generators:

$$
\varepsilon_1,
\varepsilon_2.
$$

Each perturbation direction remains algebraically distinct.

This improves correctness in higher-order implementations.

### Relation to Truncated Polynomial Algebras

Hyper-dual numbers differ from ordinary truncated polynomial algebras.

Truncated polynomial algebra:

$$
\mathbb{R}[\varepsilon]/(\varepsilon^3)
$$

keeps powers:

$$
1,\varepsilon,\varepsilon^2.
$$

Hyper-dual algebra instead keeps:

$$
1,
\varepsilon_1,
\varepsilon_2,
\varepsilon_1\varepsilon_2.
$$

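This distinction matters:

| Structure | Stores |
|---|---|
| Truncated polynomial | repeated derivatives along one direction ($\tfrac12 f''$ as the $\varepsilon^2$ coefficient) |
| Hyper-dual | mixed derivatives along two directions ($u^T H v$ as the $\varepsilon_1\varepsilon_2$ coefficient) |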

Hyper-dual systems are particularly effective for Hessian computation.
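For comparison, here is a minimal sketch of arithmetic in $\mathbb{R}[\varepsilon]/(\varepsilon^3)$. The type name `Jet2` and the function `MulJet` are assumptions for illustration. Note that the $\varepsilon^2$ coefficient carries the Taylor factor $\tfrac12$, so it must be doubled to recover $f''$.

```go
package main

import "fmt"

// Jet2 holds C0 + C1*e + C2*e^2 in R[e]/(e^3).
type Jet2 struct {
    C0, C1, C2 float64
}

// MulJet multiplies two truncated polynomials, dropping e^3 and higher.
func MulJet(x, y Jet2) Jet2 {
    return Jet2{
        C0: x.C0 * y.C0,
        C1: x.C0*y.C1 + x.C1*y.C0,
        C2: x.C0*y.C2 + x.C1*y.C1 + x.C2*y.C0,
    }
}

func main() {
    x := Jet2{C0: 2, C1: 1} // 2 + e
    r := MulJet(x, MulJet(x, x))

    fmt.Printf("%+v\n", r)       // {C0:8 C1:12 C2:6}
    fmt.Println("f'' =", 2*r.C2) // f'' = 12
}
```

A single generator gives the second derivative along one direction only; obtaining a mixed partial from this algebra would take additional evaluations along combined directions, which the hyper-dual construction avoids.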

### Complexity

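For $n$ variables:

- one forward dual pass computes one directional derivative
- one hyper-dual pass computes one second-order directional interaction $u^T H v$, at a small constant multiple of the cost of evaluating $f$

Dense Hessian construction still requires multiple evaluations: $n^2$ passes in general, or $n(n+1)/2$ when symmetry is exploited.

However, the method remains exact and compositional.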

### Geometric Interpretation

Dual numbers represent tangent vectors.

Hyper-dual numbers represent interacting tangent directions.

The mixed product:

$$
\varepsilon_1\varepsilon_2
$$

captures curvature.

First-order infinitesimals describe local linear geometry.

Second-order mixed infinitesimals describe local quadratic geometry.

Hyper-dual numbers therefore encode second-order local structure.

### Summary

Hyper-dual numbers extend dual numbers by introducing multiple independent nilpotent directions whose mixed products survive.

The algebra:

$$
\mathbb{R}[\varepsilon_1,\varepsilon_2]
/
(\varepsilon_1^2,\varepsilon_2^2)
$$

produces exact second derivatives through ordinary program evaluation.

Key properties:

| Feature | Result |
|---|---|
| Independent infinitesimals | separate derivative directions |
| Mixed products survive | second-order information |
| No finite differences | exact differentiation |
| Local algebraic propagation | automatic Hessian computation |
| Structured nilpotency | expansion terminates at second order |

Hyper-dual numbers provide one of the cleanest exact formulations of second-order automatic differentiation.

