| Section | Title |
|---|---|
| 1 | Chapter 20. Building an AD Engine |
| 2 | Minimal Reverse Mode Engine |
| 3 | Graph Representation |
| 4 | Tape Design |
| 5 | Memory Management |
| 6 | Operator Libraries |
| 7 | Custom Gradients |
| 8 | Performance Benchmarking |
| 9 | Testing Derivatives |
| 10 | Production Deployment |

Chapter 20. Building an AD Engine: A minimal forward mode automatic differentiation engine has one job: evaluate a program while carrying both a value and its derivative. The engine does not build a graph. It...

Minimal Reverse Mode Engine: Reverse mode automatic differentiation computes derivatives by traversing the program backward after evaluation. Unlike forward mode, which propagates tangents alongside...

Graph Representation: A graph representation makes the structure of a differentiated computation explicit. In reverse mode, this structure is required because the backward pass must know which...

Tape Design: A tape is an append-only record of the operations executed during the forward pass. Reverse mode uses the tape to replay derivative rules backward.

Memory Management: Memory management is the main systems problem in reverse mode automatic differentiation. The derivative rules are usually small. The hard part is deciding which primal values,...

Operator Libraries: An automatic differentiation engine becomes useful only after it supports a sufficiently rich set of primitive operations. The collection of these primitives is the operator...

Custom Gradients: A custom gradient gives the user direct control over the backward rule of an operation. The forward computation still produces an ordinary value, but the derivative no longer...

Performance Benchmarking: Performance benchmarking measures whether an automatic differentiation engine is fast, memory-efficient, and scalable under realistic workloads. It also protects the engine...

Testing Derivatives: An automatic differentiation engine is only useful if its derivatives are correct. A small mistake in a backward rule can silently corrupt optimization, training, or...

Production Deployment: A minimal automatic differentiation engine can compute correct gradients on small programs. A production system must survive long-running workloads, large tensors, distributed...
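
As a concrete illustration of the forward mode description in the first entry above (carrying a value and its derivative together, with no graph), here is a minimal sketch using dual numbers. The names `Dual`, `_lift`, `sin`, and `derivative` are hypothetical and chosen for this example only; they are not taken from the chapter, and the sketch covers just addition, multiplication, and sine.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Dual:
    """A value paired with its derivative (tangent) with respect to one input."""
    value: float
    tangent: float = 0.0

    def __add__(self, other):
        other = _lift(other)
        # Sum rule: (u + v)' = u' + v'
        return Dual(self.value + other.value, self.tangent + other.tangent)

    __radd__ = __add__

    def __mul__(self, other):
        other = _lift(other)
        # Product rule: (u * v)' = u' * v + u * v'
        return Dual(
            self.value * other.value,
            self.tangent * other.value + self.value * other.tangent,
        )

    __rmul__ = __mul__


def _lift(x):
    """Treat plain numbers as constants, i.e. duals with zero tangent."""
    return x if isinstance(x, Dual) else Dual(float(x))


def sin(x):
    """Sine with the chain rule applied to the incoming tangent."""
    x = _lift(x)
    return Dual(math.sin(x.value), math.cos(x.value) * x.tangent)


def derivative(f, x):
    """Evaluate f once with the input seeded with tangent 1 and read off df/dx."""
    return f(Dual(float(x), 1.0)).tangent
```

With these definitions, `derivative(lambda x: x * x + 3 * x, 2.0)` evaluates to `7.0`. Nothing is recorded during evaluation: each operation computes its value and its tangent immediately, which is why this style needs no graph or tape.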