Gradient descent is the basic optimization procedure behind much of modern machine learning. It is simple enough to state in one line, but rich enough to expose many of the...
| Section | Title |
|---|---|
| 1 | Chapter 13. Optimization and Machine Learning |
| 2 | Stochastic Optimization |
| 3 | Backpropagation |
| 4 | Neural Network Training |
| 5 | Sequence Models |
| 6 | Attention Mechanisms |
| 7 | Implicit Layers |
| 8 | Meta-Learning |
| 9 | Reinforcement Learning |
| 10 | Physics-Informed Models |
Chapter 13. Optimization and Machine LearningGradient descent is the basic optimization procedure behind much of modern machine learning. It is simple enough to state in one line, but rich enough to expose many of the...
Stochastic OptimizationStochastic optimization studies optimization when the objective is accessed through samples, noisy estimates, or partial observations. In machine learning, this is the normal...
BackpropagationBackpropagation is reverse mode automatic differentiation applied to neural networks. In most machine learning writing, the term refers to the whole training procedure: run a...
Neural Network TrainingNeural network training is the repeated application of three operations: evaluate a model, differentiate a scalar loss, and update parameters. Automatic differentiation...
Sequence ModelsSequence models process ordered data. The input is not one independent vector, but a series:
Attention MechanismsAttention is a sequence operation that lets each position read information from other positions. Instead of compressing the whole past into one recurrent hidden state,...
Implicit LayersAn implicit layer defines its output as the solution of an equation, not as a fixed sequence of explicit operations. Instead of computing
Meta-LearningMeta-learning studies systems that improve how they learn. Instead of only optimizing model parameters for one task, a meta-learning method optimizes some part of the learning...
Reinforcement LearningReinforcement learning studies learning systems that act in an environment. Unlike supervised learning, the training signal is not a target label for each input. The model...
Physics-Informed ModelsPhysics-informed models combine data fitting with equations from physics or applied mathematics. The model is trained not only to match observed samples, but also to satisfy...