Robotics and control systems interact with the physical world through sensing, estimation, planning, and actuation. Automatic differentiation is important because modern control pipelines increasingly depend on optimization, simulation, system identification, and differentiable models.
A robotic system is often represented as a dynamical system:

$$
x_{t+1} = f(x_t, u_t, \theta)
$$

where:

| Symbol | Meaning |
|---|---|
| $x_t$ | system state |
| $u_t$ | control input |
| $\theta$ | physical or model parameters |

A controller chooses actions

$$
u_t = \pi(x_t)
$$

to optimize some objective.
The objective may involve trajectory tracking, energy consumption, stability, collision avoidance, or task completion. AD computes derivatives needed for optimization, planning, estimation, and learning.
Dynamics as Differentiable Programs
A robot simulator is a computational graph:
state
-> dynamics
-> integration
-> contact resolution
-> sensor model
-> cost function

Differentiating this graph gives sensitivities of trajectories and objectives with respect to controls, parameters, or initial conditions.
This enables:
- trajectory optimization,
- model predictive control,
- differentiable simulation,
- policy learning,
- system identification,
- calibration,
- inverse dynamics.
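The idea of differentiating the whole simulation graph can be sketched with a tiny forward-mode AD implementation. This is a minimal illustration, not a production design: the `Dual` class, the damped point-mass model, and all numeric values are assumptions chosen for the example. Each `Dual` carries a value and its derivative with respect to one chosen input, so running the unmodified rollout on `Dual` inputs yields the cost and its sensitivity together.

```python
from dataclasses import dataclass


@dataclass
class Dual:
    """A value together with its derivative with respect to one chosen input."""
    val: float
    dot: float = 0.0

    def __add__(self, other):
        other = _lift(other)
        return Dual(self.val + other.val, self.dot + other.dot)

    __radd__ = __add__

    def __sub__(self, other):
        other = _lift(other)
        return Dual(self.val - other.val, self.dot - other.dot)

    def __rsub__(self, other):
        return _lift(other) - self

    def __mul__(self, other):
        other = _lift(other)
        # Product rule.
        return Dual(self.val * other.val,
                    self.dot * other.val + self.val * other.dot)

    __rmul__ = __mul__


def _lift(x):
    """Treat plain numbers as constants (zero derivative)."""
    return x if isinstance(x, Dual) else Dual(float(x))


def rollout_cost(damping, force, steps=50, dt=0.02, target=1.0):
    """state -> dynamics -> integration -> cost, for a damped unit point mass."""
    pos, vel = 0.0, 0.0
    for _ in range(steps):
        acc = force - damping * vel      # dynamics
        vel = vel + dt * acc             # explicit Euler integration
        pos = pos + dt * vel
    err = pos - target
    return err * err                     # cost function


def d_cost_d_damping(damping, force):
    """Seed the damping input with derivative 1 and read off d(cost)/d(damping)."""
    out = rollout_cost(Dual(damping, 1.0), force)
    return out.val, out.dot
```

Calling `d_cost_d_damping(0.5, 2.0)` returns the cost and its derivative with respect to the damping coefficient. Forward mode costs one pass per input, so it suits a handful of parameters; for many inputs and one scalar cost, reverse mode is usually the better fit.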
Equations of Motion
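Rigid-body dynamics are commonly written as

$$
M(q)\,\ddot{q} + C(q, \dot{q})\,\dot{q} + g(q) = \tau
$$

where:

| Symbol | Meaning |
|---|---|
| $q$ | generalized coordinates |
| $M(q)$ | mass matrix |
| $C(q, \dot{q})\,\dot{q}$ | Coriolis and centrifugal terms |
| $g(q)$ | gravity |
| $\tau$ | generalized forces |

The state is typically

$$
x = (q, \dot{q})
$$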
A simulator numerically integrates the resulting ODE or DAE.
AD computes derivatives of trajectories or costs with respect to:
- torques,
- masses,
- inertias,
- geometry,
- friction coefficients,
- controller parameters.
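As a small worked example (an illustration added here, not tied to any particular robot), consider a single pendulum written in the standard manipulator form $M(q)\ddot{q} + C(q,\dot{q})\dot{q} + g(q) = \tau$. The point mass $m$, rod length $l$, and gravitational acceleration $g_0$ are assumed symbols, and $q$ is measured from the downward vertical. The derivatives below are the kind of per-step sensitivities that AD would return:

```latex
\begin{aligned}
M(q) &= m l^2, \qquad C(q, \dot{q}) = 0, \qquad g(q) = m g_0 l \sin q \\
\ddot{q} &= \frac{\tau - m g_0 l \sin q}{m l^2}
          = \frac{\tau}{m l^2} - \frac{g_0}{l} \sin q \\
\frac{\partial \ddot{q}}{\partial \tau} &= \frac{1}{m l^2}, \qquad
\frac{\partial \ddot{q}}{\partial m} = -\frac{\tau}{m^2 l^2}, \qquad
\frac{\partial \ddot{q}}{\partial l} = -\frac{2\tau}{m l^3} + \frac{g_0}{l^2} \sin q
\end{aligned}
```

With $\tau = 0$ the acceleration does not depend on $m$ at all, so trajectories of an unactuated pendulum carry no gradient signal for identifying its mass.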
Trajectory Optimization
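Trajectory optimization searches for a control sequence minimizing a cost:

$$
\min_{u_0, \dots, u_{T-1}} \; J = \sum_{t=0}^{T-1} \ell(x_t, u_t) + \ell_T(x_T)
$$

subject to dynamics constraints

$$
x_{t+1} = f(x_t, u_t)
$$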
The gradient of the total objective depends on derivatives propagated through the dynamics.
Reverse-mode AD naturally computes these gradients because the trajectory is a sequential computation graph.
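The backward sweep that reverse-mode AD performs over a trajectory can be written out by hand for a small case. The sketch below assumes toy scalar dynamics $\dot{x} = -\sin x + u$ integrated with explicit Euler, a control-effort penalty, and a terminal cost; the function names and constants are hypothetical. Because the cost is a single scalar and there are many controls, one forward pass plus one backward pass yields the gradient with respect to every control.

```python
import math


def step(x, u, dt):
    """One explicit-Euler step of the toy dynamics x' = -sin(x) + u."""
    return x + dt * (-math.sin(x) + u)


def total_cost(x0, controls, dt=0.1, r=0.01, target=1.0):
    """Control-effort penalty plus squared terminal error."""
    x = x0
    cost = 0.0
    for u in controls:
        cost += 0.5 * r * u * u
        x = step(x, u, dt)
    return cost + 0.5 * (x - target) ** 2


def cost_and_gradient(x0, controls, dt=0.1, r=0.01, target=1.0):
    # Forward pass: store the visited states, as a reverse-mode tape would.
    xs = [x0]
    cost = 0.0
    for u in controls:
        cost += 0.5 * r * u * u
        xs.append(step(xs[-1], u, dt))
    cost += 0.5 * (xs[-1] - target) ** 2

    # Backward pass: lam holds dJ/dx_{t+1} while step t is processed.
    lam = xs[-1] - target
    grad = [0.0] * len(controls)
    for t in reversed(range(len(controls))):
        grad[t] = r * controls[t] + lam * dt        # dJ/du_t
        lam = lam * (1.0 - dt * math.cos(xs[t]))    # dJ/dx_t
    return cost, grad
```

The stored list `xs` is the memory cost of reverse mode: it grows linearly with the horizon, which is why long rollouts often need checkpointing or truncation.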
Optimal Control and Adjoint Equations
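Continuous-time optimal control uses dynamics

$$
\dot{x}(t) = f(x(t), u(t), t)
$$

and cost

$$
J = \int_0^T \ell(x(t), u(t), t)\,dt + \Phi(x(T))
$$

Pontryagin’s maximum principle introduces the adjoint state

$$
\lambda(t)
$$

The Hamiltonian is

$$
H(x, u, \lambda, t) = \ell(x, u, t) + \lambda^\top f(x, u, t)
$$

The adjoint equation is

$$
\dot{\lambda} = -\left(\frac{\partial H}{\partial x}\right)^{\top}, \qquad \lambda(T) = \left(\frac{\partial \Phi}{\partial x}\right)^{\top}\bigg|_{x(T)}
$$

Integrating this equation backward from $t = T$ is the continuous-time counterpart of reverse-mode differentiation through the trajectory: reverse-mode AD applied to a time-discretized rollout computes the corresponding discrete adjoint.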
The connection between control theory and reverse-mode AD is deep:
| Control theory | AD interpretation |
|---|---|
| Adjoint state | Reverse gradient |
| Costate equation | Backward sensitivity propagation |
| Hamiltonian derivatives | Local Jacobian actions |
| Shooting method | Gradient-based trajectory optimization |
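The correspondence in the table can be made concrete with a short derivation. This is a sketch under assumptions that the text does not fix: the Hamiltonian convention $H = \ell + \lambda^\top f$, an explicit Euler discretization with step $h$, and $\lambda_t$ defined as the gradient of the discretized cost $J_h$ with respect to $x_t$.

```latex
\begin{aligned}
x_{t+1} &= x_t + h\, f(x_t, u_t), \qquad
J_h = \sum_{t=0}^{T-1} h\, \ell(x_t, u_t) + \Phi(x_T) \\
\lambda_T &= \nabla_x \Phi(x_T) \\
\lambda_t &= \lambda_{t+1} + h \left( \nabla_x \ell(x_t, u_t)
  + \left( \frac{\partial f}{\partial x}(x_t, u_t) \right)^{\top} \lambda_{t+1} \right) \\
\nabla_{u_t} J_h &= h \left( \nabla_u \ell(x_t, u_t)
  + \left( \frac{\partial f}{\partial u}(x_t, u_t) \right)^{\top} \lambda_{t+1} \right)
\end{aligned}
```

Rearranging the third line gives $(\lambda_{t+1} - \lambda_t)/h = -\nabla_x H(x_t, u_t, \lambda_{t+1})$, a one-step discretization of the costate equation, and the last line is $h\,\nabla_u H$. The backward sweep of reverse-mode AD over the Euler rollout performs exactly this recursion.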
Model Predictive Control
Model predictive control repeatedly solves a finite-horizon optimization problem:
- estimate current state,
- optimize future controls,
- apply first action,
- repeat.
Each optimization uses system derivatives.
AD is useful because modern MPC systems may contain:
- nonlinear dynamics,
- learned components,
- differentiable constraints,
- neural cost terms,
- differentiable collision models.
Instead of deriving gradients manually, the controller differentiates the simulation and objective directly.
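The receding-horizon loop described above can be sketched in a few lines. The example makes several simplifying assumptions: a scalar unstable linear model `x_next = a*x + b*u`, a perfectly measured state, a plant identical to the model, and plain gradient descent as the inner optimizer. All names, the horizon, and the step size are hypothetical choices; the gradient is obtained by a backward sweep through the planned rollout, as an AD tool would do.

```python
def horizon_gradient(x0, us, a, b, r):
    """Gradient of sum_k (x_{k+1}^2 + r*u_k^2) with respect to the planned controls."""
    xs = [x0]
    for u in us:
        xs.append(a * xs[-1] + b * u)    # forward rollout of the model
    lam = 0.0                            # cost sensitivity flowing back from later steps
    grad = [0.0] * len(us)
    for k in reversed(range(len(us))):
        lam = 2.0 * xs[k + 1] + lam      # total dJ/dx_{k+1}
        grad[k] = 2.0 * r * us[k] + b * lam
        lam = a * lam                    # pull back through x_{k+1} = a*x_k + b*u_k
    return grad


def mpc_step(x, a, b, r, horizon=3, iters=200, lr=0.02):
    """Optimize the planned controls from state x and return only the first one."""
    us = [0.0] * horizon
    for _ in range(iters):
        g = horizon_gradient(x, us, a, b, r)
        us = [u - lr * gi for u, gi in zip(us, g)]
    return us[0]


def run_closed_loop(x0=1.0, a=1.2, b=1.0, r=0.1, steps=15):
    x = x0
    history = [x]
    for _ in range(steps):
        u = mpc_step(x, a, b, r)   # optimize future controls from the current state
        x = a * x + b * u          # apply the first action to the plant
        history.append(x)          # the new state is "measured" and the loop repeats
    return history
```

Without control the state would grow by a factor of 1.2 per step; `run_closed_loop()` drives it toward zero instead. A real-time controller would typically warm-start each solve from the previous plan and use a second-order or structure-exploiting solver rather than fixed-step gradient descent.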
Differentiable Simulation
A differentiable simulator exposes derivatives of simulated outcomes with respect to simulation inputs.
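Suppose simulation evolves:

$$
x_{t+1} = f(x_t, u_t, \theta)
$$

A differentiable simulator computes:

$$
\frac{\partial x_T}{\partial x_0}, \qquad \frac{\partial x_T}{\partial u_t}, \qquad \frac{\partial x_T}{\partial \theta}
$$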
Applications include:
| Application | Purpose |
|---|---|
| Robot design | Optimize morphology |
| Policy learning | Differentiate reward |
| System identification | Fit physical parameters |
| Sim-to-real transfer | Adapt simulator parameters |
| Grasp optimization | Optimize contact behavior |
| Motion planning | Optimize trajectories |
Contact Dynamics
Contact is one of the hardest parts of differentiable robotics.
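Rigid contact often introduces discontinuities. In a simple restitution model, for example,

$$
v^{+} = -e\, v^{-}
$$

where $v^{-}$ and $v^{+}$ are the normal contact velocities just before and after impact and $e \in [0, 1]$ is the coefficient of restitution, so collisions instantaneously change velocity.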
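Friction introduces complementarity conditions. For Coulomb friction, a common form is

$$
0 \le \gamma \;\perp\; \mu \lambda_n - \lVert \lambda_t \rVert \ge 0
$$

where $\lambda_n$ and $\lambda_t$ are the normal and tangential contact forces, $\mu$ is the friction coefficient, and $\gamma$ is the sliding speed: either the contact sticks ($\gamma = 0$) or the friction force lies on the boundary of the friction cone.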
These systems are piecewise smooth or non-smooth.
Naive AD through contact solvers may produce:
- undefined gradients,
- unstable sensitivities,
- zero gradients,
- discontinuous optimization behavior.
Common strategies include:
| Strategy | Idea |
|---|---|
| Soft contact | Replace hard contact with smooth penalty |
| Implicit differentiation | Differentiate converged contact solve |
| Relaxed complementarity | Smooth inequality conditions |
| Hybrid methods | Analytical contact derivatives |
Contact differentiation remains an active research area.
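The zero-gradient problem and the soft-contact remedy can be seen in a one-dimensional sketch. The penalty model, the softplus smoothing, and the `stiffness` and `sharpness` values are assumptions chosen for illustration; real simulators use more elaborate contact models. Just above the surface the hard penalty has an exactly zero derivative, while the smoothed version still reports how the force would change as the body approaches.

```python
import math


def hard_contact_force(height, stiffness=1000.0):
    """Penalty force that is exactly zero until penetration (height < 0)."""
    return stiffness * max(0.0, -height)


def soft_contact_force(height, stiffness=1000.0, sharpness=50.0):
    """Softplus-smoothed penalty: small but nonzero force just above the surface."""
    z = -sharpness * height
    # Numerically stable softplus, log(1 + exp(z)).
    softplus = max(z, 0.0) + math.log1p(math.exp(-abs(z)))
    return stiffness * softplus / sharpness


def soft_contact_force_grad(height, stiffness=1000.0, sharpness=50.0):
    """d(force)/d(height) = -stiffness * sigmoid(-sharpness * height)."""
    z = -sharpness * height
    if z >= 0:
        sig = 1.0 / (1.0 + math.exp(-z))
    else:
        e = math.exp(z)
        sig = e / (1.0 + e)
    return -stiffness * sig


def central_difference(fn, x, eps=1e-6):
    """Numerical derivative used to compare the two models."""
    return (fn(x + eps) - fn(x - eps)) / (2.0 * eps)
```

At `height = 0.02` the hard model's derivative is zero, so an optimizer receives no signal that contact is nearby, whereas `soft_contact_force_grad(0.02)` is negative. Deep in penetration the two forces agree closely, so the smoothing mainly changes behavior near the contact boundary, at the price of a small force acting at a distance.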
Inverse Kinematics
Inverse kinematics solves for joint angles producing a desired end-effector pose.
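If the forward kinematics map is

$$
p = f(q)
$$

then inverse kinematics solves

$$
f(q) = p^{*}
$$

for $q$, given a desired end-effector pose $p^{*}$.
Optimization form:

$$
\min_q \; \lVert f(q) - p^{*} \rVert^2
$$

The Jacobian

$$
J(q) = \frac{\partial f}{\partial q}
$$

maps joint velocities to task-space velocities:

$$
\dot{p} = J(q)\,\dot{q}
$$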
AD computes these Jacobians automatically, especially for complex articulated systems.
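A planar two-link arm is small enough to show the full loop. In this sketch the link lengths, the damping value, and the function names are assumptions, and the Jacobian is written analytically as a stand-in for what an AD tool would generate from `forward_kinematics`. The solver uses damped least squares, one common choice among several.

```python
import math


def forward_kinematics(q, l1=1.0, l2=0.8):
    """End-effector position of a planar two-link arm."""
    x = l1 * math.cos(q[0]) + l2 * math.cos(q[0] + q[1])
    y = l1 * math.sin(q[0]) + l2 * math.sin(q[0] + q[1])
    return [x, y]


def jacobian(q, l1=1.0, l2=0.8):
    """2x2 Jacobian d(position)/d(q), derived by hand for this arm."""
    s1, c1 = math.sin(q[0]), math.cos(q[0])
    s12, c12 = math.sin(q[0] + q[1]), math.cos(q[0] + q[1])
    return [[-l1 * s1 - l2 * s12, -l2 * s12],
            [l1 * c1 + l2 * c12, l2 * c12]]


def solve_ik(target, q0, damping=1e-3, iters=100):
    """Damped least-squares iteration toward forward_kinematics(q) == target."""
    q = list(q0)
    for _ in range(iters):
        p = forward_kinematics(q)
        e = [target[0] - p[0], target[1] - p[1]]
        J = jacobian(q)
        # Normal equations (J^T J + damping*I) dq = J^T e, solved in closed form.
        a = J[0][0] ** 2 + J[1][0] ** 2 + damping
        b = J[0][0] * J[0][1] + J[1][0] * J[1][1]
        d = J[0][1] ** 2 + J[1][1] ** 2 + damping
        g0 = J[0][0] * e[0] + J[1][0] * e[1]
        g1 = J[0][1] * e[0] + J[1][1] * e[1]
        det = a * d - b * b
        q[0] += (d * g0 - b * g1) / det
        q[1] += (a * g1 - b * g0) / det
    return q
```

For example, `solve_ik(forward_kinematics([0.7, 0.9]), [0.5, 1.0])` recovers joint angles that reach the target. The damping term keeps the update bounded near singular configurations (a fully stretched arm), where the undamped Jacobian inverse blows up.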
State Estimation
Robotic systems estimate hidden states from sensor measurements.
A state estimator may combine:
- inertial sensors,
- cameras,
- lidar,
- wheel encoders,
- GPS,
- force sensors.
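An optimization-based estimator minimizes residuals:

$$
\min_x \; \sum_i \lVert r_i(x) \rVert^2
$$

where each residual $r_i$ compares a sensor measurement with its prediction from the state $x$.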
AD computes Jacobians needed for Gauss-Newton or Levenberg-Marquardt optimization.
This is especially useful in SLAM and visual-inertial odometry, where residual structures are large and sparse.
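A minimal Gauss-Newton estimator shows where the residual Jacobians enter. The setup is an assumption made for illustration: a 2D position estimated from noise-free range measurements to three known beacons, with hypothetical names throughout. The Jacobian rows are written by hand here; in a larger estimator they are what AD would supply for each residual block.

```python
import math

# Known beacon positions (hypothetical values).
BEACONS = [(0.0, 0.0), (4.0, 0.0), (0.0, 3.0)]


def residuals_and_jacobian(p, ranges):
    """Range residuals r_i = |p - b_i| - d_i and their Jacobian rows dr_i/dp."""
    res, jac = [], []
    for (bx, by), d in zip(BEACONS, ranges):
        dx, dy = p[0] - bx, p[1] - by
        dist = math.hypot(dx, dy)
        res.append(dist - d)
        jac.append((dx / dist, dy / dist))
    return res, jac


def gauss_newton(p0, ranges, iters=20):
    """Iterate p <- p - (J^T J)^{-1} J^T r for the two position unknowns."""
    p = list(p0)
    for _ in range(iters):
        res, jac = residuals_and_jacobian(p, ranges)
        a = sum(jx * jx for jx, _ in jac)
        b = sum(jx * jy for jx, jy in jac)
        d = sum(jy * jy for _, jy in jac)
        gx = sum(jx * r for (jx, _), r in zip(jac, res))
        gy = sum(jy * r for (_, jy), r in zip(jac, res))
        det = a * d - b * b
        p[0] -= (d * gx - b * gy) / det
        p[1] -= (a * gy - b * gx) / det
    return p
```

Each residual depends only on the position and one beacon, so its Jacobian row is cheap and local. In SLAM the same pattern repeats over thousands of poses and landmarks, which is where the sparsity of the normal equations comes from.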
Differentiable Perception
Modern robotic pipelines often integrate learned perception systems.
Example:
camera image
-> neural perception model
-> object pose estimate
-> planner
-> controller
-> robot action

If the pipeline is differentiable end-to-end, gradients can flow from task objectives back into perception modules.
This enables:
- task-aware perception,
- differentiable sensor calibration,
- policy gradients through perception,
- learned observation models.
System Identification
System identification estimates physical parameters from observed trajectories.
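Suppose a simulator predicts

$$
\hat{x}_{1:T}(\theta)
$$

Observed trajectories are

$$
x^{\text{obs}}_{1:T}
$$

The objective is

$$
L(\theta) = \sum_{t=1}^{T} \lVert \hat{x}_t(\theta) - x^{\text{obs}}_t \rVert^2
$$

AD computes

$$
\nabla_\theta L(\theta)
$$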
This allows fitting:
- masses,
- friction coefficients,
- motor constants,
- damping,
- actuator delays,
- aerodynamic parameters.
Differentiable simulation has become a major tool for simulator calibration.
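A one-parameter calibration makes the procedure concrete. The sketch below assumes a unit mass under linear drag, synthetic noise-free observations generated by the same simulator, and fixed-step gradient descent; names and constants are hypothetical. The gradient is accumulated with a forward sensitivity `s`, the derivative of the simulated velocity with respect to the damping coefficient, propagated alongside the state.

```python
def simulate(damping, v0=1.0, dt=0.05, steps=40):
    """Velocity of a unit mass under linear drag, explicit Euler."""
    vs = [v0]
    for _ in range(steps):
        vs.append(vs[-1] + dt * (-damping * vs[-1]))
    return vs


def loss_and_grad(damping, observed, v0=1.0, dt=0.05):
    """Squared trajectory error and its derivative with respect to damping."""
    v, s = v0, 0.0                 # s tracks d(v_t)/d(damping)
    loss, grad = 0.0, 0.0
    for obs in observed[1:]:
        # Chain rule applied to the update v <- v + dt * (-damping * v);
        # s must be updated first because it needs the previous v.
        s = s + dt * (-v - damping * s)
        v = v + dt * (-damping * v)
        err = v - obs
        loss += err * err
        grad += 2.0 * err * s
    return loss, grad


def fit_damping(observed, guess=0.2, lr=0.01, iters=300):
    """Plain gradient descent on the trajectory-matching loss."""
    c = guess
    for _ in range(iters):
        _, g = loss_and_grad(c, observed)
        c -= lr * g
    return c
```

With `observed = simulate(0.7)`, calling `fit_damping(observed)` recovers a value close to 0.7. Real data adds measurement noise and model mismatch, so in practice the recovered parameters are only as good as the simulator's structure.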
Reinforcement Learning and Control
Many reinforcement learning systems are differentiable control systems.
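A policy with parameters $\phi$

$$
u_t = \pi_\phi(x_t)
$$

interacts with dynamics:

$$
x_{t+1} = f(x_t, u_t)
$$

The objective is expected return:

$$
J(\phi) = \mathbb{E}\left[ \sum_{t=0}^{T-1} r(x_t, u_t) \right]
$$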
If the environment is differentiable, gradients can propagate directly through the dynamics. This often gives lower-variance updates than score-function estimators.
However, real environments contain discontinuities, stochasticity, and unmodeled effects. Pure differentiable control is therefore usually combined with robust or stochastic methods.
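The variance difference can be checked on a toy problem. This sketch assumes a one-step "environment" whose cost is $x^2$ with $x = \theta + \epsilon$ and $\epsilon \sim \mathcal{N}(0, 1)$; it is far simpler than a real control task and is meant only to compare the two estimators. Both are unbiased for the true gradient $2\theta$: the pathwise estimator differentiates through the sample, while the score-function estimator multiplies the cost by the gradient of the log-density.

```python
import random
import statistics


def gradient_samples(theta, n=20000, seed=0):
    """Per-sample estimates of d/dtheta E[(theta + eps)^2], eps ~ N(0, 1)."""
    rng = random.Random(seed)
    pathwise, score = [], []
    for _ in range(n):
        eps = rng.gauss(0.0, 1.0)
        x = theta + eps              # reparameterized sample, x ~ N(theta, 1)
        pathwise.append(2.0 * x)     # derivative of x^2 through the sample path
        score.append(x * x * eps)    # cost times d/dtheta log N(x; theta, 1) = eps
    return pathwise, score


def compare(theta=1.0):
    """Return (mean, variance) for each estimator."""
    pw, sf = gradient_samples(theta)
    return ((statistics.mean(pw), statistics.variance(pw)),
            (statistics.mean(sf), statistics.variance(sf)))
```

For $\theta = 1$ the per-sample variances work out analytically to 4 for the pathwise estimator and 30 for the score-function estimator, and `compare()` shows the same ordering empirically. The advantage is not universal: over long horizons with chaotic or discontinuous dynamics, pathwise gradients can themselves have very large variance.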
Sparse Structure
Robotic systems have structured Jacobians.
Examples:
| Structure | Consequence |
|---|---|
| Kinematic trees | Block-sparse derivatives |
| Local contacts | Sparse coupling |
| Sequential dynamics | Banded time structure |
| Factor graphs | Sparse estimation systems |
Efficient robotics AD systems exploit this sparsity rather than forming dense Jacobians.
Real-Time Constraints
Control systems often run under strict timing constraints.
| Application | Typical timing |
|---|---|
| Motor control | microseconds to milliseconds |
| MPC | milliseconds |
| Flight control | sub-millisecond stability loops |
| SLAM updates | real-time sensor rates |
Generic AD frameworks may be too slow or memory-heavy.
Production systems often use:
- custom derivative kernels,
- ahead-of-time code generation,
- symbolic simplification,
- sparse linear algebra,
- static computational graphs.
The derivative system must fit real-time constraints.
Numerical Stability
Robotics gradients can become unstable because of:
| Cause | Effect |
|---|---|
| Chaotic contact sequences | Sensitive trajectories |
| Long horizons | Exploding or vanishing gradients |
| Poor scaling | Ill-conditioned optimization |
| Hard constraints | Non-smooth derivatives |
| Integrator error | Gradient mismatch |
| Solver tolerances | Noisy adjoints |
Good differentiable robotics systems carefully define solver semantics and smoothing behavior.
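The long-horizon row can be illustrated with a worked example, assuming a scalar linear system $x_{t+1} = a\,x_t$ as a deliberately simple stand-in for a linearized rollout. The chain rule multiplies one local Jacobian per step:

```latex
\frac{\partial x_T}{\partial x_0}
  = \prod_{t=0}^{T-1} \frac{\partial x_{t+1}}{\partial x_t}
  = a^{T},
\qquad
1.1^{200} \approx 1.9 \times 10^{8},
\qquad
0.9^{200} \approx 7.1 \times 10^{-10}
```

A 10% deviation of the per-step gain from 1 therefore changes the gradient scale by many orders of magnitude over 200 steps, which is why horizon length, gradient truncation, and variable scaling need to be chosen together.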
Differentiable Robot Design
Robot morphology itself can become an optimization variable.
Parameters may include:
- link lengths,
- masses,
- actuator placement,
- joint limits,
- sensor locations.
A differentiable simulator computes how morphology affects task performance.
Optimization becomes:

$$
\min_\theta \; J(\theta)
$$

where $\theta$ now describes the robot structure rather than the controller.
This creates co-design systems where body and controller are optimized jointly.
Practical Architecture
A robust differentiable robotics stack typically separates:
| Layer | Responsibility |
|---|---|
| Geometry | Kinematics and transforms |
| Dynamics | Equations of motion |
| Contact | Collision and friction |
| Integration | Time stepping |
| Estimation | Sensor fusion and optimization |
| Planning | Trajectory optimization |
| Control | Policy or feedback law |
| Learning | Gradient-based adaptation |
Each layer should expose well-defined derivative rules.
Failure Modes
Differentiable robotics systems fail in characteristic ways.
| Failure mode | Cause |
|---|---|
| Exploding trajectory gradients | Long unstable horizons |
| Zero contact gradients | Hard collision thresholds |
| Simulator mismatch | Real-world physics differs |
| Memory explosion | Reverse mode through long trajectories |
| Unstable optimization | Ill-conditioned dynamics |
| Nonphysical learned behavior | Weak constraints |
| Timing failure | AD overhead violates real-time limits |
Many practical systems intentionally smooth dynamics or truncate gradients to maintain optimization stability.
Summary
Robotics and control systems lend themselves to differentiation because they evolve through structured dynamical equations. Automatic differentiation provides gradients for trajectory optimization, control, estimation, simulation, and learning.
The main challenges are contact discontinuities, long-horizon stability, sparse structure, real-time execution, and differentiating through numerical solvers. Effective systems combine AD with optimal control theory, sparse numerical methods, custom solver derivatives, and carefully designed simulation semantics.