The main evaluation loop in Python/ceval.c, opcode dispatch via computed gotos, and the eval breaker mechanism.
The evaluation loop is the central execution engine of CPython. It takes a compiled code object, executes its bytecode instructions, and produces a result or an exception.
At a high level, CPython execution looks like this:
Python source
↓
tokens
↓
parser
↓
AST
↓
symbol table
↓
compiler
↓
code object
↓
frame
↓
evaluation loop
↓
Python result or exception

This chapter focuses on the last active stage: the loop that runs bytecode.
The evaluation loop lives in CPython’s interpreter implementation. Historically, the key file has been Python/ceval.c, with surrounding interpreter machinery spread across other files. Modern CPython also generates some interpreter code from bytecode definitions. The details move between releases, but the model remains stable: a frame executes a code object by repeatedly dispatching bytecode instructions. CPython’s own developer guide points to the internal documentation and source tree as the current reference because this code changes across versions.
27.1 The Job of the Evaluation Loop
The evaluation loop does not parse Python text. It does not build the AST. It does not usually decide lexical scope. Those jobs are already finished by the time execution begins.
Its job is narrower and more mechanical:
read the next bytecode instruction
decode its operand
perform the operation
update the frame
continue, jump, call, return, or raise

For example, this function:
def add(a, b):
    return a + b

This function is compiled into a code object. The code object contains bytecode. When add(2, 3) is called, CPython creates or initializes a frame for that call, stores the arguments in fast local slots, then runs the frame through the evaluation loop.
The loop eventually reaches a return instruction. That instruction pops or reads the return value, unwinds the frame, and returns the object pointer to the caller.
Conceptually:
call add(2, 3)
create frame
store a = 2
store b = 3
execute LOAD_FAST a
execute LOAD_FAST b
execute BINARY_OP +
execute RETURN_VALUE
return 5

The real implementation is more complex, but this is the core model.
27.2 The Main Runtime Objects
The evaluation loop connects several CPython runtime objects.
| Runtime object | Role |
|---|---|
| Code object | Immutable compiled bytecode and metadata |
| Frame | Mutable execution state for one call |
| Thread state | Per-thread interpreter execution state |
| Interpreter state | Per-interpreter runtime state |
| Python object | Runtime value manipulated by instructions |
| Type object | Runtime behavior table for Python objects |
A code object describes what should run.
A frame stores one active execution of that code.
The evaluation loop executes the frame.
This distinction matters. A single code object can be executed many times. Each call gets its own frame state.
def f(x):
    return x + 1

a = f(10)
b = f(20)

Both calls use the same code object, but each call has separate locals, stack state, and return value.
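The sharing can be observed directly from Python. In this sketch, two closures produced by the same def statement share a single compiled code object while keeping separate execution state (make_adder is an illustrative name, not from the text above):

```python
def make_adder(n):
    # Each call to make_adder creates a new function object and closure cell...
    def add(x):
        return x + n
    return add

add1 = make_adder(1)
add2 = make_adder(2)

# ...but the body was compiled exactly once: both functions share one code object.
assert add1.__code__ is add2.__code__

# Execution state is per call, so the shared bytecode produces different results.
assert add1(10) == 11
assert add2(10) == 12
```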
27.3 Code Objects
A code object contains the compiled representation of Python code.
You can inspect one from Python:
def f(x):
    y = x + 1
    return y
code = f.__code__
print(code.co_name)
print(code.co_varnames)
print(code.co_consts)
print(code.co_names)
print(code.co_stacksize)

Typical fields include:
| Field | Meaning |
|---|---|
| co_code | Bytecode stream, exposed in a version-dependent form |
| co_consts | Literal constants used by the code |
| co_names | Names referenced by bytecode |
| co_varnames | Local variable names |
| co_freevars | Free variables captured from outer scopes |
| co_cellvars | Variables captured by inner scopes |
| co_argcount | Positional argument count |
| co_kwonlyargcount | Keyword-only argument count |
| co_stacksize | Required value stack size |
| co_flags | Code flags |
| co_filename | Source filename |
| co_name | Function or block name |
| co_qualname | Qualified name |
| co_firstlineno | First source line |
| line table | Mapping from bytecode offsets to source lines |
| exception table | Structured exception handling metadata |
The dis module exists specifically to inspect CPython bytecode. Its documentation notes that CPython bytecode is an implementation detail and may change between Python versions.
27.4 Frames
A frame is an execution record.
When a function runs, CPython needs somewhere to store:
the code object being executed
the instruction pointer
local variables
temporary stack values
globals dictionary
builtins dictionary
closure cells
exception state
return state
tracing and profiling state

That structure is the frame.
A simplified frame model looks like this:
frame
code object
globals
builtins
locals / fast locals
instruction pointer
value stack
block and exception state
previous frame / caller relation

A Python call chain creates a chain of frames:
def a():
    return b()
def b():
    return c()
def c():
    return 42

a()

Conceptually:
frame for a
frame for b
frame for c

At any instant, the current thread state points to the currently executing frame or equivalent internal frame representation.
27.5 Fast Locals
Function local variables are not normally stored in a dictionary during execution.
CPython uses an array-like layout for fast local variables. Names are resolved at compile time to local indexes. Bytecode instructions can then access local variables by index instead of doing dictionary lookup.
Example:
def f(a, b):
    c = a + b
    return c

The compiler assigns local slots:
| Name | Slot |
|---|---|
| a | 0 |
| b | 1 |
| c | 2 |
The bytecode can then use slot-based operations:
LOAD_FAST 0 load a
LOAD_FAST 1 load b
BINARY_OP +
STORE_FAST 2 store c
LOAD_FAST 2 load c
RETURN_VALUE

This is why local variable access is generally faster than global variable access. A local access can use a direct frame slot. A global access must search dictionaries and handle builtins fallback.
27.6 The Value Stack
CPython bytecode uses a stack model.
Most instructions read from and write to a frame-local value stack. This stack is separate from the C call stack. It stores PyObject * values during bytecode execution.
For this expression:
x = (a + b) * c

The stack behavior is roughly:
LOAD_FAST a stack: [a]
LOAD_FAST b stack: [a, b]
BINARY_OP + stack: [a_plus_b]
LOAD_FAST c stack: [a_plus_b, c]
BINARY_OP * stack: [product]
STORE_FAST x stack: []

The value stack is central to bytecode design. It avoids needing every instruction to name explicit source and destination registers. Instead, instructions agree on stack effects.
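The stack discipline above can be sketched as a toy interpreter in Python. The instruction names mimic CPython's, but the machine itself is invented for illustration and is far simpler than the real one:

```python
import operator

def run(program, local_slots):
    """Execute a tiny stack-machine program against a dict of local slots."""
    stack = []
    for op, arg in program:
        if op == "LOAD_FAST":
            stack.append(local_slots[arg])      # push a local onto the stack
        elif op == "BINARY_OP":
            right = stack.pop()                 # operands come off the stack...
            left = stack.pop()
            stack.append(arg(left, right))      # ...and the result goes back on
        elif op == "STORE_FAST":
            local_slots[arg] = stack.pop()      # pop into a local slot
    return local_slots

# x = (a + b) * c, mirroring the trace above
slots = {"a": 2, "b": 3, "c": 4}
run([
    ("LOAD_FAST", "a"),
    ("LOAD_FAST", "b"),
    ("BINARY_OP", operator.add),
    ("LOAD_FAST", "c"),
    ("BINARY_OP", operator.mul),
    ("STORE_FAST", "x"),
], slots)
assert slots["x"] == 20
```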
Some instructions push values:
LOAD_CONST
LOAD_FAST
LOAD_GLOBAL
BUILD_LIST

Some instructions pop values:
STORE_FAST
POP_TOP
RETURN_VALUE

Some do both:
BINARY_OP
CALL
LOAD_ATTR
COMPARE_OP

27.7 Instruction Pointer
The frame tracks where execution is inside the bytecode stream.
For straight-line code, the instruction pointer moves forward after each instruction.
For branches, loops, exception handling, and returns, instructions change control flow.
Example:
def sign(x):
    if x < 0:
        return -1
    return 1

Conceptual bytecode flow:
load x
load 0
compare <
jump if false to positive_return
load -1
return
positive_return:
load 1
returnThe instruction pointer is what makes this possible. A branch instruction changes the next instruction to execute.
27.8 Dispatch
Dispatch is the act of choosing the C implementation for the current bytecode instruction.
A simplified interpreter loop looks like this:
for (;;) {
    opcode = read_opcode(frame);
    oparg = read_operand(frame);
    switch (opcode) {
    case LOAD_FAST:
        /* load local variable */
        break;
    case LOAD_CONST:
        /* load constant */
        break;
    case BINARY_OP:
        /* perform binary operation */
        break;
    case RETURN_VALUE:
        /* return from frame */
        break;
    }
}

This is only a teaching model. Modern CPython uses optimized dispatch techniques and generated interpreter code in places. Still, the essential shape remains:
fetch
decode
dispatch
execute
repeat

Dispatch cost matters. Every Python bytecode instruction passes through dispatch. If a loop executes millions of bytecode instructions, dispatch overhead becomes visible.
27.9 Stack Effects
Every bytecode instruction has a stack effect.
A stack effect describes how many values an instruction consumes and produces.
For example:
| Instruction | Input stack | Output stack |
|---|---|---|
| LOAD_CONST | [] | [const] |
| LOAD_FAST | [] | [local] |
| STORE_FAST | [value] | [] |
| BINARY_OP | [left, right] | [result] |
| RETURN_VALUE | [value] | returns from frame |
The compiler must know stack effects to compute the maximum stack size required by a code object. That value appears as co_stacksize.
For:
def f(a, b, c):
    return (a + b) * c

The stack never needs to hold more than two or three temporary values, depending on exact bytecode. CPython records the maximum required stack depth so the frame can reserve enough space.
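Both facts can be checked from Python on a reasonably recent CPython: dis.stack_effect reports a single instruction's net effect, and co_stacksize holds the computed maximum. The exact stack size is version-dependent, so the check below only asserts a lower bound:

```python
import dis

def f(a, b, c):
    return (a + b) * c

# Net stack effect of POP_TOP: it consumes one value and produces none.
assert dis.stack_effect(dis.opmap["POP_TOP"]) == -1

# The compiler summed per-instruction stack effects to compute this bound.
# Evaluating (a + b) * c needs at least two simultaneous temporaries.
assert f.__code__.co_stacksize >= 2
```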
27.10 Bytecode Operands
Many bytecode instructions have operands.
An operand is a small integer argument attached to the instruction. The meaning depends on the opcode.
Examples:
LOAD_CONST 0 load co_consts[0]
LOAD_FAST 1 load fast local slot 1
STORE_FAST 2 store into fast local slot 2
LOAD_GLOBAL 3 load name from name table index 3

The bytecode instruction does not usually store a full pointer to the object or string name. It stores an index into a table owned by the code object.
This keeps bytecode compact and separates immutable metadata from execution state.
27.11 Running a Simple Function
Consider:
def add(a, b):
    return a + b

Disassembly may vary by Python version, but the conceptual instruction sequence is:
load local a
load local b
binary add
return valueExecution proceeds like this:
| Step | Instruction | Stack before | Stack after |
|---|---|---|---|
| 1 | LOAD_FAST a | [] | [a] |
| 2 | LOAD_FAST b | [a] | [a, b] |
| 3 | BINARY_OP + | [a, b] | [a + b] |
| 4 | RETURN_VALUE | [a + b] | return |
At the C level, each stack element is a PyObject *.
For add(2, 3), the stack holds pointers to Python integer objects. The addition operation dispatches through Python object semantics. It does not directly emit a CPU integer addition in the general case.
27.12 Why a + b Is Not Just One CPU Instruction
In Python, a + b is dynamic.
The objects may be integers:
1 + 2

They may be strings:
"hello " + "world"

They may be lists:
[1] + [2]

They may be user-defined objects:
class X:
    def __add__(self, other):
        return "custom"

X() + X()

The bytecode instruction for addition must respect Python’s data model. It must inspect the operand types, find the correct numeric or sequence operation, call special methods when needed, handle errors, and return a Python object.
So the evaluation loop cannot treat + as plain machine addition. It is a dynamic operation over Python objects.
Modern CPython reduces this overhead when it can. The specializing adaptive interpreter can specialize operations after observing stable runtime behavior. PEP 659 describes this as specialization over small regions with rapid adaptation when behavior changes.
27.13 Function Calls
Function calls are among the most important paths in the evaluation loop.
For:
result = f(x, y)

The interpreter must:
load callable f
load arguments x and y
arrange call arguments
check callable type
enter optimized call path if possible
create or initialize callee frame if it is a Python function
execute callee frame
receive return value
continue caller frame

Conceptually:
caller frame
LOAD_FAST f
LOAD_FAST x
LOAD_FAST y
CALL 2
create callee frame
run callee frame
return object
STORE_FAST result

CPython has spent significant optimization effort on calls because calls are frequent and expensive. Important mechanisms include:
vectorcall
fast locals
specialized call bytecodes
inline caches
frame optimizations
reduced temporary tuple/dict creation

The goal is to avoid unnecessary argument packing. Historically, many calls required building tuples and dictionaries for arguments. Modern call paths try to pass arguments in array-like layouts when possible.
27.14 Returning From a Frame
A return instruction ends the current frame.
For:
def f():
    return 42

The return instruction produces a PyObject * result and unwinds the frame.
The caller receives that object as the result of the call expression:
x = f()

Conceptually:
callee frame stack: [42]
RETURN_VALUE
pop result
finish callee frame
give result to caller
caller resumes with stack: [42]
STORE_FAST x

A frame can finish in several ways:
| Exit path | Meaning |
|---|---|
| Normal return | Function returns a value |
| Exception | Function exits by raising |
| Generator yield | Frame suspends and later resumes |
| Coroutine await | Coroutine suspends |
| Fatal error | Runtime-level failure |
The evaluation loop must handle all of these paths.
27.15 Exceptions
Exceptions are part of normal interpreter control flow.
For:
def div(a, b):
    return a / b

If b is zero, the division operation raises ZeroDivisionError.
The bytecode instruction does not return a normal result. Instead, it sets exception state and transfers control to exception handling logic.
Conceptually:
execute BINARY_OP /
operation fails
set current exception
search exception table
jump to handler or unwind frame

Modern CPython uses structured exception tables associated with code objects. These tables describe protected bytecode ranges and handlers. This allows the interpreter to find the correct handler when an exception occurs.
Example:
try:
    x = 1 / y
except ZeroDivisionError:
    x = 0

The evaluation loop must know where the protected range is, where the handler starts, and what stack state is required at the handler.
27.16 Loops and Branches
Python loops compile to jumps.
Example:
def count(n):
    i = 0
    while i < n:
        i += 1
    return i

Conceptual bytecode shape:
i = 0
loop_start:
load i
load n
compare <
jump if false to loop_end
load i
load 1
add
store i
jump to loop_start
loop_end:
load i
return

The evaluation loop does not have a special C-level while-loop for each Python while. It executes bytecode instructions that implement the loop.
A Python loop is therefore an interpreter loop inside the outer interpreter loop:
C evaluation loop
executes Python loop bytecode
jumps backward many times

This is one reason tight Python loops can be expensive. Each iteration may execute many bytecode instructions, and each bytecode instruction has dispatch and dynamic object overhead.
27.17 Iteration
A for loop uses the iteration protocol.
Example:
for item in xs:
    use(item)

Conceptual execution:
iterator = iter(xs)
loop:
item = next(iterator)
if StopIteration:
exit loop
use(item)
jump loop

The evaluation loop executes instructions that call iter(), call the iterator’s next operation, handle StopIteration, and branch.
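The same protocol can be written out in Python. This is a sketch of what the loop's bytecode does, not the actual instruction sequence:

```python
def manual_for(xs, use):
    """A Python-level rendering of for-loop bytecode behavior."""
    iterator = iter(xs)            # what GET_ITER arranges
    while True:
        try:
            item = next(iterator)  # what advancing the iterator does
        except StopIteration:      # exhaustion ends the loop
            break
        use(item)                  # the loop body

seen = []
manual_for([1, 2, 3], seen.append)
assert seen == [1, 2, 3]

# Works for any iterable, including generators, because it is protocol-based.
squares = []
manual_for((n * n for n in range(3)), squares.append)
assert squares == [0, 1, 4]
```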
This means Python-level for loops are protocol-based. They work for lists, tuples, dicts, files, generators, custom iterators, and many extension types because the interpreter dispatches through object protocol slots.
27.18 Attribute Access
Attribute access is also dynamic.
For:
value = obj.name

The interpreter must implement Python’s attribute lookup rules:
look at object type
handle descriptors
look in instance dictionary if applicable
look in class dictionary and base classes
call custom __getattribute__ if present
fall back to __getattr__ if applicable
raise AttributeError if missing

A simple-looking expression can involve significant machinery.
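The later stages of that lookup chain can be demonstrated from Python:

```python
class Demo:
    class_attr = "from class"

    def __getattr__(self, name):
        # Called only when the normal lookup (instance dict, class, bases) fails.
        return f"fallback:{name}"

d = Demo()
d.inst_attr = "from instance"

assert d.inst_attr == "from instance"   # found in the instance __dict__
assert d.class_attr == "from class"     # found in the class dictionary
assert d.missing == "fallback:missing"  # __getattr__ fallback, no AttributeError
```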
Modern CPython uses inline caches and specialization to speed up common attribute access patterns. For example, repeated access to the same attribute on objects with stable shapes can avoid some repeated lookup work.
27.19 Global and Builtin Lookup
Global lookup is more expensive than local lookup.
For:
print(len(xs))

Names such as print and len are not local variables unless assigned locally. CPython looks them up through global and builtin namespaces.
Conceptually:
look in globals dictionary
if missing, look in builtins dictionary
if missing, raise NameError

This is why local binding can be faster in tight loops:
def slow(xs):
    for x in xs:
        len(x)

def faster(xs):
    local_len = len
    for x in xs:
        local_len(x)

Modern CPython can specialize global lookups, so this old micro-optimization is less universally useful than it once was. Still, the underlying distinction remains: local slots are simpler than dictionary-based name lookup.
27.20 The GIL and the Evaluation Loop
In the traditional CPython runtime, the evaluation loop runs while the current thread holds the Global Interpreter Lock.
The GIL protects interpreter state, including reference counts and many object internals. The evaluation loop periodically checks whether it should drop the GIL, handle signals, process pending calls, or allow another thread to run.
This means bytecode execution is cooperative at the interpreter level. A thread does not usually hold the GIL forever. CPython has scheduling checks that allow switching between threads.
The practical consequence:
one thread executes Python bytecode at a time per traditional interpreter
I/O operations may release the GIL
C extensions may release the GIL around long native work
CPU-bound Python threads do not normally execute bytecode in parallel

Newer CPython work includes free-threaded builds and per-interpreter changes, but the evaluation loop remains the central place where thread state, pending work, and bytecode execution meet.
27.21 Reference Counts During Execution
Every value on the stack is a Python object pointer with ownership rules.
The evaluation loop must carefully maintain reference counts. When an instruction pushes a value, stores a value, replaces a value, or discards a value, it must preserve object lifetime correctly.
Example:
x = a + b

Conceptually:
load a obtain reference to object a
load b obtain reference to object b
add produce new reference to result
store x bind result to local slot
discard temporaries

Incorrect reference management would cause either leaks or premature destruction.
At the C level, this means carefully placed operations equivalent to:
Py_INCREF(obj);
Py_DECREF(obj);

The exact implementation often uses specialized macros and ownership conventions. But the invariant is simple: an object must stay alive while it can still be used, and it must be released when the interpreter no longer owns a reference.
27.22 Error Signaling
Most C helper functions in CPython use a common convention:
return a valid pointer or success code on success
return NULL or error code on failure
set an exception on failure

The evaluation loop checks these results.
Simplified example:
PyObject *result = PyNumber_Add(left, right);
if (result == NULL) {
    goto error;
}

The NULL return does not by itself describe the exception. The exception is stored in thread state.
This pattern appears everywhere:
call helper
if failed:
go to error path
else:
push or store result

The evaluation loop contains many error exits because almost any Python operation can fail:
allocation can fail
attribute lookup can fail
function call can fail
comparison can fail
iteration can fail
import can fail
descriptor code can fail
user-defined special method can fail

27.23 Pending Calls, Signals, and Async Events
The evaluation loop also acts as a safe checkpoint for runtime-level work.
CPython cannot handle every signal or pending event at arbitrary C instruction boundaries. Instead, it records that something needs attention and checks at controlled points during evaluation.
Examples:
signal handling
pending calls from C APIs
thread switching requests
async exception injection
tracing and profiling hooks
monitoring hooks
interrupt checks

This keeps the interpreter manageable. The evaluation loop becomes the place where Python execution notices outside events.
27.24 Tracing and Profiling
Python supports tracing and profiling through APIs such as:
sys.settrace(...)
sys.setprofile(...)

These hooks require cooperation from the evaluation loop.
The loop must emit events such as:
call
line
return
exception
opcode, when enabled

Tracing makes execution slower because it adds checks and callback calls. But it enables debuggers, coverage tools, profilers, teaching tools, and observability systems.
A debugger that steps through Python code depends on the evaluation loop’s ability to map bytecode execution back to source lines.
27.25 Specializing Adaptive Interpreter
Since Python 3.11, CPython has included a specializing adaptive interpreter based on PEP 659. The idea is to keep Python semantics dynamic while making common stable cases faster. PEP 659 describes specialization as aggressive over small regions, with adaptation when runtime patterns change.
The interpreter starts with general bytecode. As code runs, CPython observes behavior and may replace or augment generic operations with specialized forms.
For example, a generic binary operation may become optimized for common operand types:
generic BINARY_OP
observed int + int repeatedly
↓
specialized integer-add path

For attribute access:
generic LOAD_ATTR
observed same attribute layout repeatedly
↓
cached attribute access path

For global lookup:
generic LOAD_GLOBAL
observed stable globals and builtins dictionaries
↓
cached global lookup path

Specialization must remain correct. If assumptions fail, the interpreter falls back or adapts.
This is not the same as a traditional full JIT compiler. It still operates inside the interpreter architecture. It specializes bytecode-level execution paths rather than compiling whole functions into native machine code in the general case.
27.26 Inline Caches
Inline caches are small pieces of cache storage associated with bytecode instructions.
Instead of recomputing lookup information every time, the interpreter stores facts near the instruction that needs them.
Example cache information may include:
type version
dictionary version
attribute offset
resolved descriptor
global dictionary version
builtin dictionary version
specialized call target

A simplified attribute cache model:
LOAD_ATTR name
cache:
expected type = User
type version = 123
attribute offset = 2

On the next execution, CPython can check whether the object still matches the cached assumptions. If yes, it uses the fast path. If no, it falls back to the generic path.
Inline caches work well because bytecode instructions at a given source location often see the same kinds of objects repeatedly.
27.27 Why Specialization Is Safe
Python is dynamic, so specialization must be guarded.
A specialized path is valid only while its assumptions remain true.
For example:
obj.x

This access can be specialized if CPython observes a stable object layout. But Python allows mutation:
obj.__dict__["x"] = 10
type(obj).x = property(...)
obj.__class__ = OtherType

So CPython uses version tags, guards, counters, and fallback paths.
The safety rule is:
use fast path only if guards prove assumptions still hold
otherwise use generic Python semantics

This is the same broad strategy used by many dynamic language runtimes, but CPython keeps the machinery relatively close to the bytecode interpreter.
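The guard-and-fallback pattern can be sketched in Python. AttrCache and its fields are invented for illustration; they only loosely mirror the real per-instruction inline-cache machinery:

```python
class AttrCache:
    """Toy guarded cache for one attribute access site."""
    def __init__(self):
        self.expected_type = None   # stand-in for CPython's type version tag
        self.cached_getter = None

    def load(self, obj, name):
        if type(obj) is self.expected_type:   # guard: assumptions still hold
            return self.cached_getter(obj)    # fast path
        # Fallback: generic semantics, then re-specialize for the observed type.
        value = getattr(obj, name)
        self.expected_type = type(obj)
        self.cached_getter = lambda o: getattr(o, name)
        return value

class Point:
    def __init__(self, x):
        self.x = x

cache = AttrCache()
assert cache.load(Point(1), "x") == 1   # generic path, then specialize
assert cache.load(Point(2), "x") == 2   # guarded fast path
assert cache.load("text", "upper")() == "TEXT"  # guard fails, falls back safely
```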
27.28 Generated Interpreter Code
Modern CPython does not treat every bytecode implementation as hand-written switch cases in one file.
Parts of the interpreter are generated from instruction definitions. This helps keep bytecode metadata, stack effects, specialization information, and dispatch code more consistent.
The broad idea:
instruction definitions
↓
generated opcode metadata
↓
generated dispatch support
↓
interpreter execution

For a reader, this means the source of truth may not always be the final generated C file alone. You often need to inspect the instruction definition files, generated headers, and build outputs.
The exact files and generation pipeline can change across CPython releases, so use the source tree for the version you are studying.
27.29 The Evaluation Loop and C Calls
The evaluation loop frequently calls C helper functions.
Examples:
PyNumber_Add
PyObject_GetAttr
PyObject_SetAttr
PyObject_Call
PyDict_GetItem
PyObject_RichCompare
PyIter_Next

These helpers may call user-defined Python code.
For example:
a + b

may call:
a.__add__(b)

And:
obj.name

may call:
obj.__getattribute__("name")

So the evaluation loop can reenter Python execution indirectly. A bytecode instruction may call C helper code, which may call Python code, which creates another frame, which starts another evaluation loop execution.
Conceptually:
frame A
executes BINARY_OP
calls C helper
calls user __add__
frame B
evaluation loop

This recursive execution model is central to Python’s flexibility.
27.30 Recursion and Call Depth
Python protects against uncontrolled recursion.
Example:
def f():
    return f()

f()

Each call creates another Python frame. CPython tracks recursion depth and raises RecursionError when the configured limit is exceeded.
The evaluation loop and call machinery must cooperate with this check. Without it, recursive Python code could exhaust the C stack or process memory.
You can inspect and adjust the limit:
import sys
print(sys.getrecursionlimit())
sys.setrecursionlimit(2000)

Raising the recursion limit should be done carefully. The Python limit exists partly to protect lower-level runtime resources.
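A minimal demonstration of the limit in action, lowering it temporarily and restoring it afterward:

```python
import sys

def recurse(n=0):
    return recurse(n + 1)

hit_limit = False
old_limit = sys.getrecursionlimit()
try:
    # Low enough to trip quickly, high enough to be safe to set at this depth.
    sys.setrecursionlimit(200)
    try:
        recurse()
    except RecursionError:
        hit_limit = True    # CPython refused to grow the frame chain further
finally:
    sys.setrecursionlimit(old_limit)

assert hit_limit
```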
27.31 Generators
Generators change the frame lifecycle.
A normal function call runs until it returns or raises. A generator can suspend and resume.
Example:
def gen():
    yield 1
    yield 2

Calling gen() does not immediately run the function body to completion. It creates a generator object that owns a suspended frame or equivalent execution state.
Each next() resumes execution:
first next()
enter frame
run until yield 1
suspend frame
second next()
resume frame
run until yield 2
suspend frame
third next()
resume frame
finish function
raise StopIteration

The evaluation loop must support suspension. It cannot simply destroy the frame at yield.
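The suspension points are easy to observe by driving the generator manually:

```python
def gen():
    yield 1
    yield 2

g = gen()               # creates the generator; the body has not run yet
assert next(g) == 1     # run until the first yield, then suspend
assert next(g) == 2     # resume, run until the second yield, suspend again

finished = False
try:
    next(g)             # resume, fall off the end of the function
except StopIteration:
    finished = True
assert finished
```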
27.32 Coroutines and Await
Coroutines extend the same suspension model.
Example:
async def fetch():
    data = await read()
    return data

An await may suspend the coroutine until another awaitable completes.
The evaluation loop must support:
coroutine frame creation
suspension at await
resumption with value
resumption with exception
final return
cancellation behavior

Async execution is therefore not a separate interpreter. It is built on the same frame and bytecode machinery, with specific instructions and protocols for suspension and resumption.
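A coroutine can be driven by hand with send(), which makes the suspend/resume cycle visible without an event loop. The Suspend awaitable below is an illustrative stand-in for what a real framework provides:

```python
class Suspend:
    """A minimal awaitable that yields control to whoever drives the coroutine."""
    def __await__(self):
        value = yield "suspended"   # suspend here; resumed value comes from send()
        return value

async def fetch():
    data = await Suspend()
    return data

coro = fetch()
assert coro.send(None) == "suspended"   # run to the await, then suspend

result = None
try:
    coro.send("payload")                # resume the frame with a value
except StopIteration as exc:
    result = exc.value                  # a coroutine returns via StopIteration
assert result == "payload"
```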
27.33 Class Bodies and Module Bodies
The evaluation loop does not only execute functions.
It also executes module bodies and class bodies.
A module file:
x = 1
def f():
    return x

This file is compiled into a module-level code object. Importing or running the module executes that code object.
A class statement also executes code:
class C:
    x = 1
    def method(self):
        return self.x

The class body runs in a namespace prepared for class construction. After execution, CPython builds the class object from that namespace.
So the evaluation loop executes several block kinds:
| Block kind | Example |
|---|---|
| Module | .py file body |
| Function | def f(): ... |
| Class body | class C: ... |
| Lambda | lambda x: x + 1 |
| Comprehension | [x * 2 for x in xs] |
| Generator | (x for x in xs) |
| Coroutine | async def f(): ... |
27.34 Comprehensions
Comprehensions compile to their own code objects in many cases.
Example:
ys = [x * 2 for x in xs if x > 0]

Conceptually:
create list
iterate xs
for each x:
if x > 0:
append x * 2
return list

This means comprehensions often run through a nested frame or specialized internal execution path. They have their own local scope behavior, which is why loop variables inside list comprehensions do not leak into the surrounding scope in Python 3.
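The scoping consequence is directly observable:

```python
x = "outer"
ys = [x * 2 for x in [1, 2, 3]]

assert ys == [2, 4, 6]
# The comprehension's loop variable lived in its own scope,
# so the surrounding binding of x is untouched:
assert x == "outer"
```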
The evaluation loop sees comprehension execution as bytecode execution, not as a special syntax form.
27.35 Import Execution
Imports also eventually execute bytecode.
When Python imports a .py module, the import system finds the module, reads the source or cached bytecode, creates a module object, then executes the module code object.
Conceptually:
import module
find spec
create module object
compile or load code object
execute code object in module namespace

The evaluation loop therefore participates in imports. Importing a module means running code.
This is why import-time side effects happen:
# module.py
print("imported")

import module

The print runs because module body execution is ordinary code execution.
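The final step of that pipeline can be reproduced by hand, without assuming anything about the finder and loader machinery: compile a module body to a code object, then execute it in a fresh module namespace (fake_module is an invented name):

```python
import types

# What import ultimately does: run the module's code object in its namespace.
source = "x = 1\nmessage = 'imported'\n"
code = compile(source, "fake_module.py", "exec")

module = types.ModuleType("fake_module")
exec(code, module.__dict__)   # module body execution is ordinary execution

assert module.x == 1
assert module.message == "imported"
```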
27.36 Performance Model
The evaluation loop explains much of Python performance.
A Python operation often has several layers of cost:
bytecode dispatch
stack manipulation
reference count updates
dynamic type checks
dictionary lookup
descriptor protocol
function call overhead
allocation
error checks

For example:
obj.x + y

may require:
LOAD_FAST obj
LOAD_ATTR x
LOAD_FAST y
BINARY_OP +

Each instruction has interpreter overhead. LOAD_ATTR may involve descriptor lookup. BINARY_OP may involve numeric dispatch. Reference counts must be maintained. Errors must be checked.
This is why moving hot loops into C extensions, vectorized libraries, or built-in operations can be much faster. They reduce the number of bytecode instructions and dynamic dispatches executed by the evaluation loop.
27.37 Built-ins as Evaluation Loop Escape Hatches
Built-in operations can perform large amounts of work below the bytecode level.
Example:
sum(xs)

The evaluation loop executes the call to sum, but the loop over elements may run in C inside the built-in implementation.
Compare:
total = 0
for x in xs:
    total += x

This requires many bytecode instructions per iteration.
The built-in can reduce interpreter overhead because much of the repeated work happens in C.
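The difference in bytecode volume is measurable with dis.get_instructions. Exact counts vary by Python version, but the manual loop reliably compiles to more instructions than the single built-in call, and the per-iteration instructions are what dominate at runtime:

```python
import dis

def builtin_sum(xs):
    return sum(xs)

def manual_sum(xs):
    total = 0
    for x in xs:
        total += x
    return total

# Both compute the same result...
assert builtin_sum([1, 2, 3]) == manual_sum([1, 2, 3]) == 6

# ...but the manual version carries more bytecode, executed per iteration.
n_builtin = len(list(dis.get_instructions(builtin_sum)))
n_manual = len(list(dis.get_instructions(manual_sum)))
assert n_builtin < n_manual
```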
This is a common Python performance principle:
fewer Python bytecode instructions in hot paths usually means better performance

27.38 Inspecting the Evaluation Loop From Python
You can study bytecode with dis:
import dis
def f(a, b):
    c = a + b
    return c

dis.dis(f)

You can inspect frames:
import inspect
def f():
    frame = inspect.currentframe()
    print(frame.f_code.co_name)
    print(frame.f_locals)

f()

You can inspect call depth:
import sys
def f(n):
    frame = sys._getframe()
    print(n, frame.f_code.co_name)
    if n:
        f(n - 1)

f(3)

You can trace execution:
import sys
def trace(frame, event, arg):
    print(event, frame.f_code.co_name, frame.f_lineno)
    return trace

def f(x):
    y = x + 1
    return y

sys.settrace(trace)
f(10)
sys.settrace(None)

These tools expose part of the machinery that the evaluation loop maintains internally.
27.39 A Simplified Evaluation Loop
A teaching version of the loop might look like this:
PyObject *
eval_frame(Frame *frame)
{
    for (;;) {
        Instruction instr = next_instruction(frame);
        switch (instr.opcode) {
        case OP_LOAD_CONST: {
            PyObject *value = frame->code->consts[instr.arg];
            push(frame, value);
            break;
        }
        case OP_LOAD_FAST: {
            PyObject *value = frame->locals[instr.arg];
            if (value == NULL) {
                raise_unbound_local_error();
                goto error;
            }
            push(frame, value);
            break;
        }
        case OP_STORE_FAST: {
            PyObject *value = pop(frame);
            frame->locals[instr.arg] = value;
            break;
        }
        case OP_BINARY_ADD: {
            PyObject *right = pop(frame);
            PyObject *left = pop(frame);
            PyObject *result = PyNumber_Add(left, right);
            if (result == NULL) {
                goto error;
            }
            push(frame, result);
            break;
        }
        case OP_RETURN_VALUE: {
            PyObject *result = pop(frame);
            return result;
        }
        }
    }

error:
    return NULL;
}

This omits most real details:
reference ownership
specialization
inline caches
exception tables
tracing
profiling
GIL checks
pending calls
signals
generators
coroutines
debug builds
statistics
opcode prediction
deoptimization
frame materialization

But it captures the essential idea.
27.40 Common Misunderstandings
| Misunderstanding | Correct model |
|---|---|
| CPython executes source text directly | CPython executes compiled code objects |
| Python variables store raw values | Names and slots hold references to objects |
| Bytecode is stable across versions | Bytecode is a CPython implementation detail |
| a + b is simple machine addition | It is dynamic object protocol dispatch, unless specialized |
| A frame is only a traceback object | A frame is active execution state |
| The GIL only affects user threads | It is deeply connected to interpreter execution and object safety |
| Exceptions are rare side paths only | Exceptions are integrated into normal control flow machinery |
| Generators are special functions only | They are resumable execution frames or equivalent state |
27.41 Reading the Real Source
When reading the real CPython source, use this order:
- Start with dis output for a small Python function.
- Identify the bytecode instructions.
- Find the corresponding opcode definitions.
- Find the generated or handwritten interpreter implementation.
- Follow helper calls for object operations.
- Track reference ownership.
- Track stack effects.
- Track error paths.
- Check specialization and cache behavior.
- Compare behavior across Python versions.
A good study function is:
def example(obj, xs):
    total = 0
    for x in xs:
        total += obj.value + x
    return total

This function touches many interpreter paths:
local variable access
loop iteration
attribute lookup
binary operation
in-place update semantics
jump instructions
return

Disassemble it, then map each instruction to the interpreter machinery.
27.42 Chapter Summary
The evaluation loop is where compiled Python code becomes running Python behavior. It executes code objects through frames, uses a stack-based bytecode model, dispatches instructions, maintains references, handles exceptions, calls functions, checks runtime events, and applies specialization when possible.
The loop is small in concept but large in consequence. It sits at the junction of nearly every CPython subsystem:
compiler
frames
objects
types
reference counting
garbage collection
exceptions
calls
imports
generators
coroutines
tracing
profiling
threading
optimization

To understand CPython, you must understand the evaluation loop. It is the machine inside the machine.