Pretraining Objectives

A large language model is trained in two broad phases. The first phase is pretraining.
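The standard pretraining objective is next-token prediction: at each position, the model is penalized by the cross-entropy of its predicted distribution against the actual next token. The sketch below illustrates this with a toy vocabulary and randomly generated logits standing in for a hypothetical model's outputs; it is an illustration of the loss computation, not a training loop.

```python
import numpy as np

# Toy vocabulary and a short "document" as token ids: "the cat sat <eos>".
vocab = {"the": 0, "cat": 1, "sat": 2, "<eos>": 3}
token_ids = [0, 1, 2, 3]

rng = np.random.default_rng(0)
# Placeholder for model outputs: one logit vector over the vocabulary
# for each input position (positions 0..T-2 predict tokens 1..T-1).
logits = rng.normal(size=(len(token_ids) - 1, len(vocab)))

def next_token_loss(logits, token_ids):
    """Average cross-entropy of predicting token t+1 from position t."""
    targets = token_ids[1:]
    # Log-softmax over the vocabulary axis.
    log_probs = logits - np.log(np.exp(logits).sum(axis=-1, keepdims=True))
    return -np.mean(log_probs[np.arange(len(targets)), targets])

loss = next_token_loss(logits, token_ids)
```

Training drives this loss down by adjusting the parameters that produce the logits; here the logits are random, so the loss is simply some positive number near the entropy of a uniform guess.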
Scaling Laws for Language Models

Scaling laws describe how model performance changes as we increase compute, parameter count, dataset size, and training tokens.
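A common parametric form (popularized by the Chinchilla analysis) writes loss as a sum of an irreducible term and power-law terms in parameters N and training tokens D: L(N, D) = E + A/N^α + B/D^β. The constants below are approximately the published Chinchilla fit, but treat them as illustrative; the point of the sketch is the functional form, not the exact values.

```python
def scaling_loss(n_params, n_tokens,
                 E=1.69, A=406.4, alpha=0.34, B=410.7, beta=0.28):
    """Parametric scaling law L(N, D) = E + A/N^alpha + B/D^beta.

    Constants are roughly the Chinchilla fit; illustrative only.
    """
    return E + A / n_params**alpha + B / n_tokens**beta

# Loss falls as either parameters or tokens grow, and floors at E.
small = scaling_loss(1e9, 1e10)
large = scaling_loss(1e11, 1e12)
```

Because both terms decay as power laws, doubling only one of N or D yields diminishing returns once the other term dominates, which is what motivates balancing parameters against training tokens.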
Instruction Tuning

Pretraining teaches a language model to predict text. It does not directly teach the model to follow user instructions, answer safely, maintain dialogue structure, or format outputs in a useful way.
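Instruction tuning addresses this with supervised fine-tuning on prompt/response pairs, typically computing the loss only on the response tokens. A minimal sketch of assembling one such training example is shown below, assuming a toy tokenizer; the -100 ignore-index for masked prompt positions follows a convention common in fine-tuning libraries, not anything mandated by the technique itself.

```python
def make_sft_example(prompt_ids, response_ids, ignore_index=-100):
    """Concatenate prompt and response; mask the prompt out of the loss."""
    input_ids = list(prompt_ids) + list(response_ids)
    # Only response tokens carry a label; prompt positions are ignored
    # by the loss function.
    labels = [ignore_index] * len(prompt_ids) + list(response_ids)
    return input_ids, labels

inp, lab = make_sft_example([5, 6, 7], [8, 9])
# inp == [5, 6, 7, 8, 9]; lab == [-100, -100, -100, 8, 9]
```

Masking the prompt means the model is trained to produce the demonstrated response given the instruction, rather than to reproduce the instruction itself.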
Reinforcement Learning from Human Feedback

Instruction tuning teaches a model to imitate demonstrations.
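RLHF goes further by learning from comparisons: annotators pick which of two responses they prefer, and a reward model is trained so that preferred responses score higher. The standard objective is the Bradley-Terry negative log-likelihood of the observed preference, sketched below with scalar rewards standing in for reward-model outputs.

```python
import math

def preference_loss(r_chosen, r_rejected):
    """Bradley-Terry NLL that the chosen response is preferred:
    -log sigmoid(r_chosen - r_rejected)."""
    margin = r_chosen - r_rejected
    return -math.log(1.0 / (1.0 + math.exp(-margin)))

# When the reward model cannot distinguish the pair, the loss is log 2;
# it shrinks as the chosen response's reward pulls ahead.
tie = preference_loss(0.0, 0.0)
confident = preference_loss(2.0, 0.0)
```

The policy model is then optimized (e.g. with PPO) to produce responses that this learned reward model scores highly, subject to staying close to the instruction-tuned model.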
Constitutional Alignment

Reinforcement learning from human feedback improves model behavior using preference data. However, collecting large amounts of human feedback is expensive, slow, and difficult to scale consistently.
In-Context Learning

Large language models can often perform new tasks without updating their parameters.
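Instead, the task is specified in the prompt itself: a few input/output demonstrations followed by the query, after which the model continues the pattern. The helper below is a hypothetical illustration of assembling such a few-shot prompt; the "Input:"/"Output:" template is one arbitrary choice among many that work in practice.

```python
def few_shot_prompt(examples, query):
    """Build a few-shot prompt from (input, output) demonstrations
    followed by the unanswered query."""
    blocks = [f"Input: {x}\nOutput: {y}" for x, y in examples]
    blocks.append(f"Input: {query}\nOutput:")
    return "\n\n".join(blocks)

prompt = few_shot_prompt(
    [("cheval", "horse"), ("chien", "dog")],
    "chat",
)
```

The prompt ends mid-pattern (after "Output:"), so a model that has learned the demonstrated mapping completes it with the answer; no gradient update occurs.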
Tool Use and Agents

A language model becomes more useful when it can interact with external systems.
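A typical agent loop alternates between the model proposing an action and the runtime executing it: the model emits either a tool call or a final answer, tool results are fed back into the context, and the loop repeats. The sketch below uses a hard-coded stub in place of a real model and a single calculator tool; the "CALL tool: args" / "FINAL answer" protocol is a hypothetical convention for illustration, not any particular framework's API.

```python
import re

# One registered tool: a calculator restricted to bare expressions.
TOOLS = {"calculator": lambda expr: str(eval(expr, {"__builtins__": {}}))}

def stub_model(history):
    """Stand-in for an LLM: request a tool call first, then answer."""
    if not any("TOOL_RESULT" in msg for msg in history):
        return "CALL calculator: 6*7"
    return "FINAL 42"

def run_agent(model, max_steps=5):
    """Alternate model actions and tool executions until a final answer."""
    history = []
    for _ in range(max_steps):
        action = model(history)
        call = re.match(r"CALL (\w+): (.+)", action)
        if call:
            result = TOOLS[call.group(1)](call.group(2))
            history.append(f"TOOL_RESULT {result}")
        else:
            return action.removeprefix("FINAL ").strip()
    return None  # step budget exhausted
```

Real systems replace the stub with an LLM API call and validate tool arguments; the `max_steps` budget guards against loops that never terminate.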
Retrieval-Augmented Generation

Retrieval-augmented generation, usually abbreviated RAG, combines a language model with an external information retrieval system.
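The basic pipeline retrieves documents relevant to the query and prepends them to the prompt, so the model can ground its answer in the retrieved text. The sketch below uses naive word-overlap scoring as a stand-in for a real embedding-based retriever; the prompt template and function names are illustrative assumptions.

```python
def retrieve(query, docs, k=2):
    """Rank documents by word overlap with the query (a crude stand-in
    for an embedding-based retriever) and return the top k."""
    q_words = set(query.lower().split())
    scored = sorted(docs,
                    key=lambda d: -len(q_words & set(d.lower().split())))
    return scored[:k]

def rag_prompt(query, docs, k=2):
    """Assemble a prompt with retrieved context ahead of the question."""
    context = "\n".join(retrieve(query, docs, k))
    return f"Context:\n{context}\n\nQuestion: {query}\nAnswer:"

docs = [
    "Paris is the capital of France.",
    "The mitochondria is the powerhouse of the cell.",
]
prompt = rag_prompt("What is the capital of France?", docs, k=1)
```

Because retrieval happens at query time, the system can answer from documents the model never saw during training and can be updated without retraining the model.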