Instruction emission in Python/compile.c, followed by jump fixup and the assembler (Python/assemble.c in recent CPython versions) that produces the final bytecode array.
Bytecode generation is the stage where CPython transforms structured syntax into executable virtual machine instructions.
The parser builds an AST.
The symbol table determines scope behavior.
The compiler then emits bytecode instructions that implement Python semantics.
For this source:
def add(a, b):
    return a + b
CPython generates bytecode shaped like:
LOAD_FAST a
LOAD_FAST b
BINARY_OP +
RETURN_VALUE
The exact instruction names and formats vary across Python versions, but the model remains stable:
bytecode is a low-level instruction stream executed by the CPython virtual machine.
25.1 Position in the Compilation Pipeline
Bytecode generation happens after AST construction and scope analysis.
source
↓
tokenization
↓
parsing
↓
AST
↓
symbol table
↓
bytecode generation
↓
code object
↓
evaluation loop
The compiler walks the AST and emits instructions plus metadata.
The output becomes part of a code object.
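As a rough sketch, the stages above can be driven by hand from Python using the standard `ast`, `symtable`, and `compile` entry points. The sample source string and the `<example>` filename are placeholders chosen for illustration.

```python
import ast
import symtable
import types

SOURCE = "def add(a, b):\n    return a + b\n"

# Parsing: source text -> AST.
tree = ast.parse(SOURCE)

# Scope analysis: the symtable module exposes the compiler's symbol tables.
module_table = symtable.symtable(SOURCE, "<example>", "exec")
function_table = module_table.get_children()[0]

# Bytecode generation: AST -> module code object.
module_code = compile(tree, "<example>", "exec")

# The function body is compiled into a nested code object stored as a constant.
nested = [c for c in module_code.co_consts if isinstance(c, types.CodeType)]

print(type(tree).__name__)
print(function_table.get_name(), function_table.lookup("a").is_local())
print(nested[0].co_name, nested[0].co_varnames)
```

The evaluation loop only enters the picture when the resulting code object is executed, for example with `exec(module_code)`.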
25.2 What Bytecode Represents
Bytecode is a virtual instruction set for CPython.
It is not machine code.
It is not source code.
It is an intermediate execution language designed for the CPython interpreter.
Example source:
x = 1 + 2
Possible bytecode (the compiler folds the constant expression 1 + 2 into 3):
LOAD_CONST 3
STORE_NAME x
LOAD_CONST None
RETURN_VALUE
The interpreter later executes these instructions one by one.
The compiler’s job is:
preserve Python semantics
emit correct stack behavior
emit correct control flow
emit correct scope access
record source metadata
25.3 Bytecode Is Stack-Based
CPython bytecode uses a stack machine.
Most instructions push or pop values from the evaluation stack.
Example:
a + b
Compilation:
LOAD_FAST a
LOAD_FAST b
BINARY_OP +
Stack evolution:
LOAD_FAST a
stack: a
LOAD_FAST b
stack: a, b
BINARY_OP +
pop a and b
push result
stack: result
The compiler must keep stack effects consistent.
Every bytecode path must maintain valid stack state.
25.4 Expression Compilation
Expressions produce values.
Example:
x + y * z
The AST preserves precedence:
x + (y * z)
Bytecode generation follows that structure.
Conceptual bytecode:
LOAD_FAST x
LOAD_FAST y
LOAD_FAST z
BINARY_OP *
BINARY_OP +
Stack evolution:
x
x, y
x, y, z
x, temp
result
The compiler recursively compiles subexpressions, emitting the operands before the operator that consumes them.
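The recursive, operands-first traversal can be illustrated with a toy compiler and stack machine. This is a minimal sketch, not CPython's implementation: the `LOAD` and `BINARY_OP` tuples and the helper names `compile_expr` and `run` are hypothetical, and only names, `+`, and `*` are handled.

```python
import ast
import operator

# Hypothetical instruction set for this sketch; it only mimics CPython's.
BINARY = {ast.Add: ("+", operator.add), ast.Mult: ("*", operator.mul)}

def compile_expr(node, out):
    """Emit instructions in post-order: operands first, then the operator."""
    if isinstance(node, ast.Name):
        out.append(("LOAD", node.id))
    elif isinstance(node, ast.BinOp):
        compile_expr(node.left, out)
        compile_expr(node.right, out)
        out.append(("BINARY_OP", BINARY[type(node.op)][0]))
    else:
        raise NotImplementedError(type(node).__name__)
    return out

def run(instructions, variables):
    """Execute the toy instructions on an explicit value stack."""
    functions = {symbol: fn for symbol, fn in BINARY.values()}
    stack = []
    for opname, arg in instructions:
        if opname == "LOAD":
            stack.append(variables[arg])
        else:  # BINARY_OP pops two operands and pushes the result
            right = stack.pop()
            left = stack.pop()
            stack.append(functions[arg](left, right))
    return stack.pop()

tree = ast.parse("x + y * z", mode="eval")
program = compile_expr(tree.body, [])
print(program)
print(run(program, {"x": 1, "y": 2, "z": 3}))  # 7
```

Because the AST already encodes precedence, the toy compiler needs no precedence logic of its own: the multiplication is emitted first simply because it is the right child of the addition.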
25.5 Statement Compilation
Statements usually emit side-effect instructions.
Example:
x = value
Compilation:
compile expression value
store into target x
Possible bytecode:
LOAD_FAST value
STORE_FAST x
Example:
return x
Compilation:
LOAD_FAST x
RETURN_VALUE
The compiler distinguishes between:
expressions producing values
statements performing actions
25.6 Loading Constants
Constants are loaded from co_consts.
Example:
x = 123
Bytecode:
LOAD_CONST 123
STORE_NAME x
The compiler inserts the constant into co_consts and emits an index reference.
Conceptually:
co_consts:
0: 123
instruction:
LOAD_CONST 0
The interpreter resolves the index during execution.
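The index-based lookup can be observed with `dis`. The sketch below uses a string constant rather than 123, on the assumption that some CPython versions load small integers through a different instruction; the `<example>` filename is a placeholder.

```python
import dis

code = compile("x = 'hello'", "<example>", "exec")

const_loads = [i for i in dis.get_instructions(code) if i.opname == "LOAD_CONST"]
for instr in const_loads:
    # instr.arg is the index; instr.argval is the object stored at that index.
    print(instr.arg, repr(instr.argval), code.co_consts[instr.arg] is instr.argval)

print(code.co_consts)
```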
25.7 Loading Locals
Fast locals use indexed slots.
Example:
def f(a, b):
    return a + b
Bytecode:
LOAD_FAST a
LOAD_FAST b
BINARY_OP +
RETURN_VALUE
The compiler uses LOAD_FAST because symbol analysis classified a and b as local variables.
Fast locals avoid dictionary lookup.
25.8 Loading Globals
Global and builtin lookups share one instruction, LOAD_GLOBAL, which is distinct from the instruction used for locals.
Example:
x = 10
def f():
    return x
Inside f:
LOAD_GLOBAL x
RETURN_VALUE
The interpreter checks:
function globals
then builtins
The compiler selects LOAD_GLOBAL based on symbol table information.
25.9 Loading Closure Variables
Closure variables use dereference operations.
Example:
def outer():
    x = 1
    def inner():
        return x
Inside inner:
LOAD_DEREF x
RETURN_VALUE
The compiler emits dereference bytecode because x is a free variable captured from an enclosing scope.
Closure bytecode accesses cell objects rather than ordinary local slots.
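The three kinds of name access described in 25.7 through 25.9 can be compared side by side. This is a sketch: the function names are illustrative, and the exact opcode spellings differ between versions (some fuse two local loads into one instruction), which is why the check for locals is a substring match.

```python
import dis

x = 10

def use_local(a, b):
    return a + b

def use_global():
    return x

def outer():
    y = 1
    def inner():
        return y
    return inner

def load_ops(func):
    """Names of the LOAD_* instructions in func, in order."""
    return [i.opname for i in dis.get_instructions(func) if i.opname.startswith("LOAD_")]

inner = outer()
print(load_ops(use_local))    # some LOAD_FAST variant
print(load_ops(use_global))   # includes LOAD_GLOBAL
print(load_ops(inner))        # includes LOAD_DEREF
print(outer.__code__.co_cellvars, inner.__code__.co_freevars)
```

The `co_varnames`, `co_cellvars`, and `co_freevars` attributes of a code object record the same classification the symbol table produced.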
25.10 Assignment Targets
Assignment targets compile differently from ordinary expressions.
Example:
x = 1
Bytecode:
LOAD_CONST 1
STORE_NAME x
But attribute assignment:
obj.value = 1
Bytecode shape (the right-hand side is evaluated first, then the target object):
LOAD_CONST 1
LOAD_FAST obj
STORE_ATTR value
Subscript assignment:
items[i] = value
Bytecode shape:
LOAD_FAST value
LOAD_FAST items
LOAD_FAST i
STORE_SUBSCR
Target compilation depends on AST context.
25.11 Deletion
Deletion uses dedicated instructions.
Example:
del x
Possible bytecode:
DELETE_FAST x
Example:
del obj.attr
Possible bytecode:
LOAD_FAST obj
DELETE_ATTR attr
Deletion is not assignment to None. It removes bindings or object entries according to target type.
25.12 Function Calls
Function calls generate multiple instructions.
Example:
f(x, y)
Conceptual bytecode:
LOAD_GLOBAL f
LOAD_FAST x
LOAD_FAST y
CALL 2
POP_TOP
The compiler must:
compile callable expression
compile positional arguments
compile keyword arguments
emit call instruction
handle stack layout
Method calls may use specialized bytecode forms.
Example:
obj.run()
Possible shape (CPython 3.12 and later fold LOAD_METHOD into LOAD_ATTR):
LOAD_FAST obj
LOAD_METHOD run
CALL 0
Modern CPython versions contain additional specialization and inline cache behavior around calls.
25.13 Binary Operations
Arithmetic and binary operations emit operation instructions.
Example:
a + b
Bytecode:
LOAD_FAST a
LOAD_FAST b
BINARY_OP +
Other examples:
| Expression | Operation |
|---|---|
| a - b | subtraction |
| a * b | multiplication |
| a / b | division |
| a // b | floor division |
| a % b | modulo |
| a ** b | power |
| a @ b | matrix multiply |
| a << b | left shift |
| a & b | bitwise and |
The compiler emits operation instructions. Runtime type dispatch happens later.
Example:
1 + 2
"a" + "b"
Both compile similarly, but runtime object behavior differs.
25.14 Comparisons
Comparisons emit comparison operations.
Example:
a < b
Possible bytecode:
LOAD_FAST a
LOAD_FAST b
COMPARE_OP <
Chained comparisons require more complex control flow.
Example:
a < b < c
This must evaluate b once.
Conceptual compilation:
LOAD_FAST a
LOAD_FAST b
keep a copy of b beneath the two operands
COMPARE_OP <
conditional jump if false (to cleanup code that discards the copy)
LOAD_FAST c
COMPARE_OP <
The compiler preserves Python’s chained comparison semantics.
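The single evaluation of the middle operand is observable at runtime. The sketch below records calls in a list; the helper names `middle` and `right` are illustrative.

```python
calls = []

def middle():
    calls.append("middle")
    return 5

def right():
    calls.append("right")
    return 10

# The middle operand is evaluated once even though it takes part in two comparisons.
first = 1 < middle() < right()
calls_after_first = list(calls)

# When the first comparison fails, the jump skips the right operand entirely.
calls.clear()
second = 7 < middle() < right()
calls_after_second = list(calls)

print(first, calls_after_first)     # True ['middle', 'right']
print(second, calls_after_second)   # False ['middle']
```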
25.15 Boolean Operations
Boolean operations short-circuit.
Example:
a and b
Compilation pattern:
evaluate a
jump if false
evaluate b
b is evaluated only if a is truthy; otherwise a itself is the result.
Similarly:
a or b
evaluates b only if a is falsy.
Short-circuit behavior is implemented through jumps, not ordinary function calls.
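A quick way to confirm that short-circuiting is built from jumps is to list the jump instructions in the compiled expression. This is a sketch: the exact jump opcodes vary by version, so the helper `jump_ops` (a name chosen here) only filters on the substring `JUMP`.

```python
import dis

def jump_ops(expression):
    """Jump instruction names in the bytecode of a single expression."""
    code = compile(expression, "<example>", "eval")
    return [i.opname for i in dis.get_instructions(code) if "JUMP" in i.opname]

print(jump_ops("a and b"))
print(jump_ops("a or b"))

# The right operand is skipped when the left one decides the result,
# so neither division by zero below is ever executed.
zero = 0
name = "left"
and_result = zero and 1 / zero
or_result = name or 1 / zero
print(and_result, or_result)  # 0 left
```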
25.16 Conditional Expressions
Example:
x if cond else y
Compilation pattern:
evaluate cond
jump to else branch if false
evaluate x
jump to end
evaluate y
Conditional expressions are expressions, not statements. They must leave one value on the stack regardless of branch taken.
25.17 if Statements
Example:
if cond:
    a()
else:
    b()
Compilation pattern:
compile condition
jump to else if false
compile a()
jump to end
compile b()
end
Possible bytecode shape:
LOAD_NAME cond
POP_JUMP_IF_FALSE else_label
LOAD_NAME a
CALL 0
POP_TOP
JUMP_FORWARD end_label
else_label:
LOAD_NAME b
CALL 0
POP_TOP
end_label:
The compiler manages labels and jump targets internally before final assembly.
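After assembly the labels are gone and only numeric offsets remain. The sketch below lists the jump instructions and resolved targets for the if/else example; which jumps appear, and whether the jump to the end survives optimization, depends on the CPython version.

```python
import dis

SOURCE = "if cond:\n    a()\nelse:\n    b()\n"
code = compile(SOURCE, "<example>", "exec")

jumps = [i for i in dis.get_instructions(code) if "JUMP" in i.opname]
for instr in jumps:
    # For jump instructions, argval is the resolved target offset.
    print(instr.offset, instr.opname, "->", instr.argval)

# Offsets that some jump lands on; these are the assembled form of the labels.
targets = dis.findlabels(code.co_code)
print(targets)
```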
25.18 while Loops
Example:
while cond:
    work()
Compilation pattern:
loop_start:
evaluate cond
jump to end if false
compile body
jump to loop_start
loop_end:
Loops require block stack tracking for:
break
continue
exception cleanup
25.19 for Loops
Example:
for item in items:
    work(item)
Compilation pattern:
load iterable
get iterator
loop_start:
get next item
jump to end on StopIteration
store item
compile body
jump to loop_start
loop_end:
The compiler emits iterator protocol bytecode.
A Python for loop is iterator-driven, not index-driven.
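The iterator-driven shape shows up both in the bytecode and in a hand-written equivalent. In this sketch `total_by_hand` is an illustrative approximation of what the loop does at runtime, not a literal translation of the emitted instructions.

```python
import dis

def total(items):
    result = 0
    for item in items:
        result += item
    return result

opnames = [i.opname for i in dis.get_instructions(total)]
print([name for name in opnames if "ITER" in name])

def total_by_hand(items):
    """Roughly what the loop does, spelled out with iter() and next()."""
    result = 0
    iterator = iter(items)          # GET_ITER
    while True:
        try:
            item = next(iterator)   # FOR_ITER fetches the next value...
        except StopIteration:
            break                   # ...or leaves the loop when exhausted
        result += item
    return result

print(total([1, 2, 3]), total_by_hand([1, 2, 3]))  # 6 6
```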
25.20 break and continue
Example:
while True:
    if stop:
        break
    continue
break jumps to loop exit.
continue jumps to loop continuation point.
The compiler maintains loop context structures so nested loops behave correctly.
Example:
for x in xs:
    for y in ys:
        break
The inner break exits only the inner loop.
25.21 Exception Handling
Exception handling requires structured control flow metadata.
Example:
try:
    risky()
except ValueError:
    recover()
finally:
    cleanup()
Compilation responsibilities:
protected instruction ranges
exception handler targets
finally cleanup
reraising behavior
stack restoration
CPython 3.11 and later use exception tables associated with the code object.
The compiler records:
instruction range
handler entry
stack depth information
whether the offset of the raising instruction is pushed for the handler
This metadata lets the interpreter jump into handlers correctly when exceptions occur. Matching the exception type (here ValueError) is done by ordinary bytecode inside the handler, not by the table.
25.22 with Statements
Example:
with open(path) as f:
    data = f.read()
Compilation pattern:
evaluate context manager
call __enter__
store result
execute body
ensure __exit__ runs
handle exceptions correctly
The compiler emits cleanup logic ensuring __exit__ executes even when exceptions occur.
with compilation is tightly connected to exception handling machinery.
25.23 Function Definitions
Function definitions compile in two stages.
Example:
def f(x):
    return x + 1
Stage 1:
compile function body into nested code object
Stage 2:
emit runtime instructions creating function object
Conceptual bytecode:
LOAD_CONST <code object f>
MAKE_FUNCTION
STORE_NAME f
The body itself becomes bytecode inside the nested code object.
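Both stages can be seen by compiling the definition as module code. This sketch assumes only the standard `compile`, `dis`, and `exec` APIs; the `<example>` filename and the `namespace` dict are placeholders.

```python
import dis
import types

SOURCE = "def f(x):\n    return x + 1\n"
module_code = compile(SOURCE, "<example>", "exec")

# Stage 1 result: the body lives in a nested code object among the constants.
body_code = next(c for c in module_code.co_consts if isinstance(c, types.CodeType))

# Stage 2 result: module-level instructions that build and bind the function.
module_ops = [i.opname for i in dis.get_instructions(module_code)]
print(module_ops)

namespace = {}
exec(module_code, namespace)      # runs MAKE_FUNCTION and the store into f
print(namespace["f"](41))         # 42
```

Until the module code runs, no function object exists; the nested code object is just a constant.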
25.24 Closures
Example:
def outer():
    x = 1
    def inner():
        return x
Compilation responsibilities:
create closure cell for x
compile inner with free variable access
pass closure tuple during function creation
Possible bytecode shape inside outer:
MAKE_CELL x
LOAD_CONST 1
STORE_DEREF x
LOAD_CLOSURE x
BUILD_TUPLE 1
LOAD_CONST <code object inner>
MAKE_FUNCTION closure
Inside inner:
LOAD_DEREF x
RETURN_VALUE
25.25 Class Definitions
Example:
class C:
    x = 1
The class body becomes a nested code object.
Compilation pattern:
compile class body code object
emit runtime class construction logic
bind resulting class object
Class bodies execute like mini modules with their own namespace.
Methods become nested function definitions inside the class body code object.
25.26 Comprehensions
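Comprehensions give their iteration variable a scope of its own.
Example:
[x * x for x in xs]
Compilation responsibilities:
create nested comprehension code object (before CPython 3.12; since 3.12, list, dict, and set comprehensions are inlined into the enclosing code object)
iterate input iterable
bind local iteration variable
append results
return constructed container
Comprehension variables do not leak into the outer scope: the compiler either gives the comprehension its own code object or, when inlining, keeps the iteration variable isolated from the enclosing names.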
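The isolation of the iteration variable can be checked directly, as a small sketch. The `namespace` dict stands in for a module's globals.

```python
namespace = {"xs": [1, 2, 3]}
exec("squares = [x * x for x in xs]", namespace)

print(namespace["squares"])   # [1, 4, 9]
print("x" in namespace)       # False: the iteration variable did not leak
```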
25.27 Generators
Generator functions use suspension points.
Example:
def gen():
    yield 1
    yield 2
Compilation responsibilities:
mark code object as generator
emit yield instructions
preserve resumable execution state
Possible bytecode shape:
LOAD_CONST 1
YIELD_VALUE
LOAD_CONST 2
YIELD_VALUE
LOAD_CONST None
RETURN_VALUE
The frame must preserve state across suspension.
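The generator flag and the yield instructions are both visible from Python. This is a sketch using `inspect.CO_GENERATOR` and `dis`; the variable names are illustrative.

```python
import dis
import inspect

def gen():
    yield 1
    yield 2

# The compiler marks the code object so calls create a generator instead of running the body.
is_generator_code = bool(gen.__code__.co_flags & inspect.CO_GENERATOR)
yield_count = sum(1 for i in dis.get_instructions(gen) if i.opname == "YIELD_VALUE")
print(is_generator_code, yield_count)

g = gen()
first = next(g)    # runs until the first yield, then suspends
second = next(g)   # resumes right after the first yield
print(first, second)  # 1 2
```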
25.28 Coroutines and await
Example:
async def fetch():
    return await client.get()
Compilation responsibilities:
mark coroutine flags
emit await handling
preserve suspension semantics
For await, the compiler emits instructions that obtain the awaitable and suspend the coroutine until it completes; scheduling itself is left to the event loop at runtime.
25.29 Imports
Example:
import os
Compilation pattern:
IMPORT_NAME os
STORE_NAME os
Example:
from math import sin
Compilation pattern:
IMPORT_NAME math
IMPORT_FROM sin
STORE_NAME sin
The compiler emits import operations. Actual module loading happens at runtime.
25.30 Assertions
Example:
assert x > 0
Compilation pattern:
evaluate condition
jump if true
raise AssertionError
Under optimization mode (python -O), assert statements are omitted entirely.
This is a compiler-level transformation.
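The `optimize` argument of the built-in `compile` selects the same behavior without restarting the interpreter. A minimal sketch, with `raises_assertion` as an illustrative helper:

```python
SOURCE = "assert x > 0, 'x must be positive'"

checked = compile(SOURCE, "<example>", "exec", optimize=0)    # asserts kept
stripped = compile(SOURCE, "<example>", "exec", optimize=1)   # like python -O

def raises_assertion(code):
    """Run the code with x = -1 and report whether the assert fired."""
    try:
        exec(code, {"x": -1})
    except AssertionError:
        return True
    return False

print(raises_assertion(checked))    # True
print(raises_assertion(stripped))   # False
```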
25.31 Source Locations and Line Tables
The compiler records mappings between bytecode and source positions.
These mappings support:
tracebacks
debuggers
profilers
coverage tools
stepping
error reporting
Each instruction range may correspond to:
line number
column offset
end line
end column
The code object stores compressed mapping tables.
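The line mapping can be decoded through `dis.findlinestarts` instead of reading the compressed table directly. In this sketch the source is compiled from a string so the line numbers are predictable; entries without a line are filtered out because some versions report `None` for instructions that map to no source line.

```python
import dis

SOURCE = "def f(a, b):\n    total = a + b\n    return total\n"
namespace = {}
exec(compile(SOURCE, "<example>", "exec"), namespace)
f = namespace["f"]

# (bytecode offset, source line) pairs marking where each line's code begins.
line_starts = [(off, line) for off, line in dis.findlinestarts(f.__code__) if line is not None]
lines = {line for _, line in line_starts}
print(line_starts)
```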
25.32 Stack Size Computation
The compiler computes maximum stack depth.
Example:
a + b * c
Possible stack evolution:
a
a, b
a, b, c
a, temp
result
Maximum depth: 3.
This becomes co_stacksize.
Frames allocate enough stack space based on this value.
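For straight-line code, the depth calculation reduces to a running sum of stack effects. The sketch below uses a hypothetical two-instruction effect table; the real compiler works over the control-flow graph and takes the maximum over all paths, and the value it reports may exceed the toy result.

```python
# Hypothetical stack effects for this sketch: a load pushes one value,
# a binary operation pops two and pushes one (net -1).
EFFECT = {"LOAD": +1, "BINARY_OP": -1}

def max_depth(opnames):
    """Largest stack depth reached while executing opnames in order."""
    depth = deepest = 0
    for name in opnames:
        depth += EFFECT[name]
        deepest = max(deepest, depth)
    return deepest

toy = ["LOAD", "LOAD", "LOAD", "BINARY_OP", "BINARY_OP"]   # a + b * c
print(max_depth(toy))  # 3

def f(a, b, c):
    return a + b * c

print(f.__code__.co_stacksize)  # at least 3; the exact value is version-dependent
```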
25.33 Basic Blocks
Internally, the compiler often groups instructions into basic blocks.
A basic block is a sequence of instructions with:
single entry
single exit
no internal jumps
Example:
if cond:
    a()
b()
Possible block structure:
block 1:
evaluate cond
conditional jump
block 2:
a()
jump
block 3:
b()
Basic blocks simplify control-flow analysis and jump resolution.
25.34 Jump Resolution
The compiler initially emits symbolic labels.
Later assembly resolves actual instruction offsets.
Conceptual process:
emit instructions
emit labels
calculate instruction offsets
replace labels with offsets
insert extended arguments if needed
recalculate offsets if sizes changed
Jump resolution is one reason compilation is multi-stage.
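The label-to-offset step can be sketched as a two-pass toy assembler. It is a simplification under stated assumptions: every instruction occupies exactly one slot, targets are absolute, and extended arguments are not modeled, so the recalculation step never triggers. The instruction tuples and `resolve_jumps` are hypothetical.

```python
def resolve_jumps(instructions):
    """Replace symbolic jump targets with offsets and drop the label markers."""
    # Pass 1: compute the offset each label refers to.
    offsets = {}
    position = 0
    for opname, arg in instructions:
        if opname == "LABEL":
            offsets[arg] = position      # a label emits nothing
        else:
            position += 1
    # Pass 2: emit real instructions with resolved targets.
    resolved = []
    for opname, arg in instructions:
        if opname == "LABEL":
            continue
        if "JUMP" in opname:
            arg = offsets[arg]
        resolved.append((opname, arg))
    return resolved

program = [
    ("LOAD", "cond"),
    ("POP_JUMP_IF_FALSE", "else_label"),
    ("CALL", "a"),
    ("JUMP_FORWARD", "end_label"),
    ("LABEL", "else_label"),
    ("CALL", "b"),
    ("LABEL", "end_label"),
    ("RETURN", None),
]

for offset, instruction in enumerate(resolve_jumps(program)):
    print(offset, instruction)
```

Two passes are needed because a forward jump refers to a label whose offset is not known until everything before it has been laid out.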
25.35 Inline Caches and Specialization Support
Modern CPython bytecode supports adaptive specialization.
The compiler may emit cache entries associated with instructions.
Example operations benefiting from specialization:
attribute access
global lookup
binary operations
calls
method dispatch
Initial bytecode is generic.
Runtime specialization may later replace behavior with optimized fast paths.
The compiler prepares instruction layouts that allow this adaptation.
25.36 Bytecode Inspection
Use dis to inspect generated bytecode.
Example:
import dis
def f(a, b):
    return a + b
dis.dis(f)
Useful inspection functions include:
dis.dis
dis.Bytecode
dis.get_instructions
Bytecode inspection is essential for:
compiler debugging
performance analysis
tooling
education
reverse engineering Python behavior
25.37 Version Sensitivity
Bytecode changes between CPython versions.
Changes may include:
new opcodes
removed opcodes
opcode renaming
instruction format changes
specialization changes
exception handling changes
line table changes
Tools should avoid depending on exact bytecode layouts across versions unless version-specific support exists.
Use public APIs rather than parsing raw bytecode manually whenever possible.
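One way to stay portable is to ask the running interpreter which opcodes it has instead of hard-coding names. The sketch below assumes addition is spelled either `BINARY_OP` or `BINARY_ADD`, which covers the versions the author of this example is aware of; `addition_opname` is an illustrative helper.

```python
import dis
import sys

def addition_opname():
    """Opcode name this interpreter uses for a + b (assumption: one of two)."""
    for name in ("BINARY_OP", "BINARY_ADD"):
        if name in dis.opmap:
            return name
    return None

def add(a, b):
    return a + b

# dis.get_instructions decodes the bytecode, so no manual parsing of co_code is needed.
opnames = [i.opname for i in dis.get_instructions(add)]
print(sys.version_info[:2], addition_opname(), opnames)
```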
25.38 Important CPython Source Areas
Important files include:
Python/compile.c
Python/flowgraph.c
Python/assemble.c
Python/bytecodes.c
Include/opcode_ids.h
Lib/dis.py
Lib/opcode.py
Conceptual roles:
| Area | Role |
|---|---|
| compile.c | AST traversal and instruction emission |
| flowgraph.c | Control-flow graph handling |
| assemble.c | Bytecode assembly and jump resolution |
| bytecodes.c | Opcode definitions |
| opcode.py | Opcode metadata |
| dis.py | Bytecode inspection |
25.39 Minimal Mental Model
Use this model:
The compiler walks the AST.
Expressions emit stack-based instructions.
Statements emit side-effect and control-flow instructions.
Constants, names, and locals become indexed table references.
Control flow becomes jumps and exception metadata.
Functions, classes, generators, and (before CPython 3.12) comprehensions create nested code objects.
The final instruction stream becomes part of a code object executed by the CPython virtual machine.
Bytecode generation is the stage where Python syntax becomes executable virtual machine operations.