25. Bytecode Generation

Instruction emission, jump fixup, and assembly of the final bytecode array, centered on Python/compile.c (with Python/flowgraph.c and Python/assemble.c in recent CPython versions).

Bytecode generation is the stage where CPython transforms structured syntax into executable virtual machine instructions.

The parser builds an AST.

The symbol table determines scope behavior.

The compiler then emits bytecode instructions that implement Python semantics.

For this source:

def add(a, b):
    return a + b

CPython generates bytecode shaped like:

LOAD_FAST a
LOAD_FAST b
BINARY_OP +
RETURN_VALUE

The exact instruction names and formats vary across Python versions, but the model remains stable: bytecode is a low-level instruction stream executed by the CPython virtual machine.

25.1 Position in the Compilation Pipeline

Bytecode generation happens after AST construction and scope analysis.

source
tokenization
parsing
AST
symbol table
bytecode generation
code object
evaluation loop

The compiler walks the AST and emits instructions plus metadata.

The output becomes part of a code object.
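
The stages above can be driven by hand from Python itself. The following is a minimal sketch using the standard ast and symtable modules and the built-in compile and exec; the file name "<example>" is a placeholder chosen for this illustration.

```python
import ast
import symtable

source = "def add(a, b):\n    return a + b\n"

# Parsing: source text -> AST.
tree = ast.parse(source, filename="<example>")

# Scope analysis: symtable exposes the compiler's symbol tables.
module_table = symtable.symtable(source, "<example>", "exec")
function_table = module_table.get_children()[0]

# Bytecode generation: AST -> code object.
module_code = compile(tree, "<example>", "exec")

# Evaluation: running the module code defines add().
namespace = {}
exec(module_code, namespace)
result = namespace["add"](2, 3)

print(type(tree).__name__)         # Module
print(function_table.get_name())   # add
print(type(module_code).__name__)  # code
print(result)                      # 5
```

Each intermediate object can be inspected on its own, which is usually the quickest way to see what a given stage contributes.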

25.2 What Bytecode Represents

Bytecode is a virtual instruction set for CPython.

It is not machine code.

It is not source code.

It is an intermediate execution language designed for the CPython interpreter.

Example source:

x = 1 + 2

Possible bytecode (the compiler folds the constant expression 1 + 2 into 3 before execution):

LOAD_CONST 3
STORE_NAME x
LOAD_CONST None
RETURN_VALUE

The interpreter later executes these instructions one by one.
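
The x = 1 + 2 example can be checked with dis. This sketch only tests that no binary-operation instruction remains in the module code, because the way the folded constant is loaded differs between CPython versions.

```python
import dis

module_code = compile("x = 1 + 2", "<example>", "exec")
opnames = [instr.opname for instr in dis.get_instructions(module_code)]

# If the sum was folded at compile time, no BINARY_* instruction is left.
has_binary_op = any(name.startswith("BINARY") for name in opnames)

namespace = {}
exec(module_code, namespace)

print(opnames)
print(has_binary_op, namespace["x"])
```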

The compiler’s job is:

preserve Python semantics
emit correct stack behavior
emit correct control flow
emit correct scope access
record source metadata

25.3 Bytecode Is Stack-Based

CPython bytecode uses a stack machine.

Most instructions push values onto the evaluation stack or pop values from it.

Example:

a + b

Compilation:

LOAD_FAST a
LOAD_FAST b
BINARY_OP +

Stack evolution:

LOAD_FAST a
    stack: a

LOAD_FAST b
    stack: a, b

BINARY_OP +
    pop a and b
    push result
    stack: result

The compiler must keep stack effects consistent.

Every bytecode path must maintain valid stack state.
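
The stack discipline can be imitated with a few lines of Python. This is a toy interpreter written for this section, not CPython's evaluation loop; the instruction names are borrowed only for readability, and run and BINARY_OPERATIONS are hypothetical helpers.

```python
import operator

# Toy instruction set for illustration only.
BINARY_OPERATIONS = {"+": operator.add, "*": operator.mul}

def run(instructions, variables):
    """Execute (opname, argument) pairs; return (result, maximum stack depth)."""
    stack = []
    max_depth = 0
    for opname, argument in instructions:
        if opname == "LOAD_FAST":
            stack.append(variables[argument])
        elif opname == "BINARY_OP":
            right = stack.pop()
            left = stack.pop()
            stack.append(BINARY_OPERATIONS[argument](left, right))
        else:
            raise ValueError(f"unknown instruction: {opname}")
        max_depth = max(max_depth, len(stack))
    # A well-formed expression leaves exactly one value behind.
    if len(stack) != 1:
        raise RuntimeError("stack imbalance")
    return stack[0], max_depth

program = [("LOAD_FAST", "a"), ("LOAD_FAST", "b"), ("BINARY_OP", "+")]
result, depth = run(program, {"a": 2, "b": 3})
print(result, depth)  # 5 2
```

An instruction sequence that pops more than it pushed fails immediately here, which is the kind of inconsistency the real compiler must never produce.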

25.4 Expression Compilation

Expressions produce values.

Example:

x + y * z

The AST preserves precedence:

x + (y * z)

Bytecode generation follows that structure.

Conceptual bytecode:

LOAD_FAST x
LOAD_FAST y
LOAD_FAST z
BINARY_OP *
BINARY_OP +

Stack evolution:

x
x, y
x, y, z
x, temp
result

The compiler recursively compiles subexpressions.
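
Recursive compilation is a post-order walk of the expression tree. The sketch below uses the real ast module for parsing, but compile_expression and its tuple-based instruction format are hypothetical and cover only names, constants, and four operators.

```python
import ast

OPERATOR_SYMBOLS = {ast.Add: "+", ast.Sub: "-", ast.Mult: "*", ast.Div: "/"}

def compile_expression(node):
    """Return toy (opname, argument) pairs for an expression node."""
    if isinstance(node, ast.Name):
        return [("LOAD_FAST", node.id)]
    if isinstance(node, ast.Constant):
        return [("LOAD_CONST", node.value)]
    if isinstance(node, ast.BinOp):
        # Post-order: left operand, right operand, then the operator.
        return (
            compile_expression(node.left)
            + compile_expression(node.right)
            + [("BINARY_OP", OPERATOR_SYMBOLS[type(node.op)])]
        )
    raise NotImplementedError(type(node).__name__)

tree = ast.parse("x + y * z", mode="eval")
instructions = compile_expression(tree.body)
for opname, argument in instructions:
    print(opname, argument)
```

The multiplication is emitted before the addition purely because of the tree shape; the emitter itself knows nothing about precedence.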

25.5 Statement Compilation

Statements usually emit side-effect instructions.

Example:

x = value

Compilation:

compile expression value
store into target x

Possible bytecode:

LOAD_FAST value
STORE_FAST x

Example:

return x

Compilation:

LOAD_FAST x
RETURN_VALUE

The compiler distinguishes between:

expressions producing values
statements performing actions

25.6 Loading Constants

Constants are loaded from co_consts.

Example:

x = 123

Bytecode:

LOAD_CONST 123
STORE_NAME x

The compiler inserts the constant into co_consts and emits an index reference.

Conceptually:

co_consts:
    0: 123

instruction:
    LOAD_CONST 0

The interpreter resolves the index during execution.
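
The index reference is visible through dis. This sketch uses a string constant rather than 123, on the assumption that some CPython versions encode small integers differently; the function f exists only for the example.

```python
import dis

def f():
    x = "hello"
    return x

code = f.__code__

# Find the instruction that loads the string constant.
load = next(
    instr
    for instr in dis.get_instructions(code)
    if "CONST" in instr.opname and instr.argval == "hello"
)

# instr.arg is the raw operand: an index into co_consts.
print(load.opname, load.arg, code.co_consts[load.arg])
```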

25.7 Loading Locals

Fast locals use indexed slots.

Example:

def f(a, b):
    return a + b

Bytecode:

LOAD_FAST a
LOAD_FAST b
BINARY_OP +
RETURN_VALUE

The compiler uses LOAD_FAST because symbol analysis classified a and b as local variables.

Fast locals avoid dictionary lookup.

25.8 Loading Globals

Global and builtin lookups share one instruction, LOAD_GLOBAL, which is distinct from the instructions used for fast locals.

Example:

x = 10

def f():
    return x

Inside f:

LOAD_GLOBAL x
RETURN_VALUE

The interpreter checks:

function globals
then builtins

The compiler selects LOAD_GLOBAL based on symbol table information.
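
The classification that drives this choice can be read back with the symtable module. A small sketch, assuming a module with one function that uses a parameter, a module-level name, and a builtin:

```python
import symtable

source = (
    "x = 10\n"
    "def f(a):\n"
    "    return a + x + len('abc')\n"
)
module_table = symtable.symtable(source, "<example>", "exec")
function_table = module_table.get_children()[0]

for name in ("a", "x", "len"):
    symbol = function_table.lookup(name)
    if symbol.is_local():
        kind = "local"
    elif symbol.is_global():
        kind = "global"
    else:
        kind = "other"
    print(name, kind)
```

Note that len is classified the same way as x: the compiler cannot tell a builtin from a module global, so the distinction is left to the runtime lookup.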

25.9 Loading Closure Variables

Closure variables use dereference operations.

Example:

def outer():
    x = 1

    def inner():
        return x

Inside inner:

LOAD_DEREF x
RETURN_VALUE

The compiler emits dereference bytecode because x is a free variable captured from an enclosing scope.

Closure bytecode accesses cell objects rather than ordinary local slots.
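
The cell and free-variable bookkeeping is recorded on the code objects. In this sketch outer returns inner so that the closure can be inspected afterwards:

```python
def outer():
    x = 1

    def inner():
        return x

    return inner

inner = outer()

print(outer.__code__.co_cellvars)          # ('x',)
print(inner.__code__.co_freevars)          # ('x',)
print(inner.__closure__[0].cell_contents)  # 1
```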

25.10 Assignment Targets

Assignment targets compile differently from ordinary expressions.

Example:

x = 1

Bytecode:

LOAD_CONST 1
STORE_NAME x

But attribute assignment:

obj.value = 1

Bytecode shape:

LOAD_FAST obj
LOAD_CONST 1
STORE_ATTR value

Subscript assignment:

items[i] = value

Bytecode shape:

LOAD_FAST items
LOAD_FAST i
LOAD_FAST value
STORE_SUBSCR

Target compilation depends on AST context.
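
The context is stored on the AST node itself. The helper describe below is hypothetical; it reports the node type and context of an assignment target, or of the expression when the statement is a bare expression.

```python
import ast

def describe(source):
    """Return (node type, context type) for the first statement."""
    statement = ast.parse(source).body[0]
    if isinstance(statement, ast.Assign):
        node = statement.targets[0]
    else:
        node = statement.value
    return type(node).__name__, type(node.ctx).__name__

sources = ("x = 1", "obj.value = 1", "items[i] = value", "obj.value")
contexts = [describe(source) for source in sources]
for source, context in zip(sources, contexts):
    print(source, "->", context)
```

The same Attribute node type appears with Store in an assignment and Load in an expression, and the compiler picks a store or a load instruction accordingly.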

25.11 Deletion

Deletion uses dedicated instructions.

Example:

del x

Possible bytecode:

DELETE_FAST x

Example:

del obj.attr

Possible bytecode:

LOAD_FAST obj
DELETE_ATTR attr

Deletion is not assignment to None. It removes bindings or object entries according to target type.

25.12 Function Calls

Function calls generate multiple instructions.

Example:

f(x, y)

Conceptual bytecode:

LOAD_GLOBAL f
LOAD_FAST x
LOAD_FAST y
CALL 2
POP_TOP

The compiler must:

compile callable expression
compile positional arguments
compile keyword arguments
emit call instruction
handle stack layout

Method calls may use specialized bytecode forms.

Example:

obj.run()

Possible shape (LOAD_METHOD exists through CPython 3.11; 3.12 and later fold it into LOAD_ATTR):

LOAD_FAST obj
LOAD_METHOD run
CALL 0

Modern CPython versions contain additional specialization and inline cache behavior around calls.

25.13 Binary Operations

Arithmetic and binary operations emit operation instructions.

Example:

a + b

Bytecode:

LOAD_FAST a
LOAD_FAST b
BINARY_OP +

Other examples:

| Expression | Operation |
| --- | --- |
| a - b | subtraction |
| a * b | multiplication |
| a / b | division |
| a // b | floor division |
| a % b | modulo |
| a ** b | power |
| a @ b | matrix multiply |
| a << b | left shift |
| a & b | bitwise and |

The compiler emits operation instructions. Runtime type dispatch happens later.

Example:

1 + 2
"a" + "b"

Both compile similarly, but runtime object behavior differs.

25.14 Comparisons

Comparisons emit comparison operations.

Example:

a < b

Possible bytecode:

LOAD_FAST a
LOAD_FAST b
COMPARE_OP <

Chained comparisons require more complex control flow.

Example:

a < b < c

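This must evaluate b only once, so the compiler keeps a copy of b on the stack instead of loading it again.

Conceptual compilation:

LOAD_FAST a
LOAD_FAST b
duplicate b (one copy for each comparison)
COMPARE_OP <
conditional jump if false (discarding the spare copy of b)
LOAD_FAST c
COMPARE_OP <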

The compiler preserves Python’s chained comparison semantics.
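
The single-evaluation rule is observable from Python. In this sketch operand is a hypothetical helper that records the order in which operands are evaluated:

```python
evaluated = []

def operand(name, value):
    evaluated.append(name)
    return value

# 1 < 5 < 10: every operand is evaluated exactly once.
result_true = operand("a", 1) < operand("b", 5) < operand("c", 10)
order_when_true = list(evaluated)

evaluated.clear()

# 9 < 5 fails, so the jump skips the evaluation of c entirely.
result_false = operand("a", 9) < operand("b", 5) < operand("c", 10)
order_when_false = list(evaluated)

print(result_true, order_when_true)
print(result_false, order_when_false)
```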

25.15 Boolean Operations

Boolean operations short-circuit.

Example:

a and b

Compilation pattern:

evaluate a
jump if false
evaluate b

b is evaluated only if a is truthy.

Similarly:

a or b

evaluates b only if a is falsy.

Short-circuit behavior is implemented through jumps, not ordinary function calls.
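
Both claims can be checked with dis and a right operand that would fail if evaluated. The exact jump opcode differs between versions, so this sketch only looks for any instruction whose name contains JUMP; both and explode are names invented for the example.

```python
import dis

def both(a, b):
    return a and b

opnames = [instr.opname for instr in dis.get_instructions(both)]
uses_jump = any("JUMP" in name for name in opnames)
uses_call = any(name.startswith("CALL") for name in opnames)

def explode():
    raise RuntimeError("right operand was evaluated")

left = 0
# left is falsy, so the jump skips explode() and the result is left itself.
short_circuited = left and explode()

print(opnames)
print(uses_jump, uses_call, short_circuited)
```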

25.16 Conditional Expressions

Example:

x if cond else y

Compilation pattern:

evaluate cond
jump to else branch if false
evaluate x
jump to end
evaluate y

Conditional expressions are expressions, not statements. They must leave one value on the stack regardless of branch taken.

25.17 if Statements

Example:

if cond:
    a()
else:
    b()

Compilation pattern:

compile condition
jump to else if false
compile a()
jump to end
compile b()
end

Possible bytecode shape:

LOAD_NAME cond
POP_JUMP_IF_FALSE else_label

LOAD_NAME a
CALL 0
POP_TOP
JUMP_FORWARD end_label

else_label:
LOAD_NAME b
CALL 0
POP_TOP

end_label:

The compiler manages labels and jump targets internally before final assembly.

25.18 while Loops

Example:

while cond:
    work()

Compilation pattern:

loop_start:
    evaluate cond
    jump to end if false
    compile body
    jump to loop_start
loop_end:

Loops require the compiler to track enclosing blocks at compile time for:

break
continue
exception cleanup

25.19 for Loops

Example:

for item in items:
    work(item)

Compilation pattern:

load iterable
get iterator

loop_start:
    get next item
    jump to end on StopIteration
    store item
    compile body
    jump to loop_start

loop_end:

The compiler emits iterator protocol bytecode.

A Python for loop is iterator-driven, not index-driven.
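
The pattern above corresponds closely to a hand-written loop over iter and next. This is a behavioral equivalent written in Python, not the emitted bytecode:

```python
items = ["a", "b", "c"]

seen_by_for = []
for item in items:
    seen_by_for.append(item)

# Hand-written equivalent of the steps the loop bytecode performs.
seen_by_hand = []
iterator = iter(items)         # get iterator
while True:
    try:
        item = next(iterator)  # get next item
    except StopIteration:      # iterator exhausted: leave the loop
        break
    seen_by_hand.append(item)  # loop body

print(seen_by_for == seen_by_hand)  # True
```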

25.20 break and continue

Example:

while True:
    if stop:
        break

    continue

break jumps to loop exit.

continue jumps to loop continuation point.

The compiler maintains loop context structures so nested loops behave correctly.

Example:

for x in xs:
    for y in ys:
        break

The inner break exits only the inner loop.

25.21 Exception Handling

Exception handling requires structured control flow metadata.

Example:

try:
    risky()
except ValueError:
    recover()
finally:
    cleanup()

Compilation responsibilities:

protected instruction ranges
exception handler targets
finally cleanup
reraising behavior
stack restoration

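CPython 3.11 and later use exception tables stored with the code object instead of runtime block-setup instructions.

The table does not record exception types; matching a type such as ValueError is done by ordinary bytecode inside the handler.

For each protected range the compiler records:

instruction range
handler entry
stack depth to restore
whether to push the offset of the raising instruction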

This metadata lets the interpreter jump into handlers correctly when exceptions occur.
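
A table lookup of this kind can be sketched in a few lines. Everything below is a toy model: ExceptionTableEntry, find_handler, and the offsets are made up for illustration, the entries are assumed not to overlap, and CPython's real table uses a compact binary encoding.

```python
from typing import NamedTuple, Optional

class ExceptionTableEntry(NamedTuple):
    start: int        # first protected instruction offset (inclusive)
    end: int          # end of the protected range (exclusive)
    target: int       # offset of the handler to jump to
    depth: int        # stack depth to restore before entering the handler
    push_lasti: bool  # whether the raising offset is pushed for the handler

def find_handler(table, offset) -> Optional[ExceptionTableEntry]:
    """Return the entry protecting `offset`, or None if it is unprotected."""
    for entry in table:
        if entry.start <= offset < entry.end:
            return entry
    return None

# Hypothetical layout: a try body at offsets 4..11 and its handler at 20..29.
table = [
    ExceptionTableEntry(start=4, end=12, target=20, depth=0, push_lasti=False),
    ExceptionTableEntry(start=20, end=30, target=34, depth=1, push_lasti=True),
]

handler = find_handler(table, 6)
print(handler.target if handler else None)  # 20
print(find_handler(table, 40))              # None
```

When no entry covers the raising instruction, the exception leaves the frame and the search continues in the caller.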

25.22 with Statements

Example:

with open(path) as f:
    data = f.read()

Compilation pattern:

evaluate context manager
call __enter__
store result
execute body
ensure __exit__ runs
handle exceptions correctly

The compiler emits cleanup logic ensuring __exit__ executes even when exceptions occur.

with compilation is tightly connected to exception handling machinery.
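
The connection is easiest to see by writing the expansion by hand. The sketch below follows the general shape of the with statement's documented semantics; Recorder and run_with are hypothetical names, and the body is passed as a callable only to keep the example short.

```python
import sys

class Recorder:
    """Hypothetical context manager that records the calls it receives."""

    def __init__(self):
        self.events = []

    def __enter__(self):
        self.events.append("enter")
        return self

    def __exit__(self, exc_type, exc_value, traceback):
        self.events.append(("exit", exc_type))
        return False  # do not suppress exceptions

def run_with(manager, body):
    """Roughly what `with manager as value: body(value)` expands to."""
    exit_method = type(manager).__exit__
    value = type(manager).__enter__(manager)
    try:
        body(value)
    except BaseException:
        # Pass the active exception to __exit__; re-raise unless suppressed.
        if not exit_method(manager, *sys.exc_info()):
            raise
    else:
        exit_method(manager, None, None, None)

ok = Recorder()
run_with(ok, lambda value: None)

failing = Recorder()
try:
    run_with(failing, lambda value: 1 / 0)
except ZeroDivisionError:
    pass

print(ok.events)
print(failing.events)
```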

25.23 Function Definitions

Function definitions compile in two stages.

Example:

def f(x):
    return x + 1

Stage 1:

compile function body into nested code object

Stage 2:

emit runtime instructions creating function object

Conceptual bytecode:

LOAD_CONST <code object f>
MAKE_FUNCTION
STORE_NAME f

The body itself becomes bytecode inside the nested code object.
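
Both stages can be reproduced by hand. In this sketch the nested code object is pulled out of the module's constants and wrapped with types.FunctionType, which is roughly the job MAKE_FUNCTION performs at runtime; the empty globals dictionary is sufficient only because the body uses no global names.

```python
import types

module_code = compile("def f(x):\n    return x + 1\n", "<example>", "exec")

# Stage 1 result: the body is a nested code object among the module's constants.
body_code = next(
    const for const in module_code.co_consts if isinstance(const, types.CodeType)
)

# Stage 2, done by hand: wrap the code object in a function object.
f = types.FunctionType(body_code, {})

print(body_code.co_name, f(41))  # f 42
```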

25.24 Closures

Example:

def outer():
    x = 1

    def inner():
        return x

Compilation responsibilities:

create closure cell for x
compile inner with free variable access
pass closure tuple during function creation

Possible bytecode shape inside outer:

MAKE_CELL x
LOAD_CONST 1
STORE_DEREF x

LOAD_CLOSURE x
BUILD_TUPLE 1
LOAD_CONST <code object inner>
MAKE_FUNCTION closure

Inside inner:

LOAD_DEREF x
RETURN_VALUE

25.25 Class Definitions

Example:

class C:
    x = 1

The class body becomes a nested code object.

Compilation pattern:

compile class body code object
emit runtime class construction logic
bind resulting class object

Class bodies execute like mini modules with their own namespace.

Methods become nested function definitions inside the class body code object.

25.26 Comprehensions

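Comprehensions compile into their own scopes. Before CPython 3.12 each comprehension becomes a nested code object; 3.12 and later inline list, dict, and set comprehensions into the enclosing code (PEP 709) while still isolating the iteration variable.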

Example:

[x * x for x in xs]

Compilation responsibilities:

create nested comprehension code object (or inline the comprehension in 3.12 and later)
iterate input iterable
bind local iteration variable
append results
return constructed container

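Comprehension variables do not leak into the outer scope because the compiler isolates the iteration variable, either through a nested code object or, when the comprehension is inlined, by saving and restoring any outer binding of the same name.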

25.27 Generators

Generator functions use suspension points.

Example:

def gen():
    yield 1
    yield 2

Compilation responsibilities:

mark code object as generator
emit yield instructions
preserve resumable execution state

Possible bytecode shape:

LOAD_CONST 1
YIELD_VALUE

LOAD_CONST 2
YIELD_VALUE

LOAD_CONST None
RETURN_VALUE

The frame must preserve state across suspension.
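
The generator flag and the suspended state can both be observed with the inspect module. A short sketch using the gen function from above:

```python
import inspect

def gen():
    yield 1
    yield 2

# The compiler marks the code object; calling gen() then builds a generator.
is_generator_code = bool(gen.__code__.co_flags & inspect.CO_GENERATOR)

g = gen()
states = [inspect.getgeneratorstate(g)]      # before the first next()
first = next(g)
states.append(inspect.getgeneratorstate(g))  # paused at the first yield
second = next(g)
remaining = list(g)                          # runs to the implicit return
states.append(inspect.getgeneratorstate(g))  # finished

print(is_generator_code, first, second, remaining)
print(states)
```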

25.28 Coroutines and await

Example:

async def fetch():
    return await client.get()

Compilation responsibilities:

mark coroutine flags
emit await handling
preserve suspension semantics

The compiler emits instructions that obtain the awaitable and suspend the coroutine until it completes. Deciding when to resume is the job of the event loop, not of the bytecode.
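
A coroutine can be driven by hand without any event loop, which shows what the compiled code itself provides. In this sketch get stands in for client.get() and is an assumption of the example:

```python
import inspect

async def get():
    return "payload"

async def fetch():
    return await get()

is_coroutine_code = bool(fetch.__code__.co_flags & inspect.CO_COROUTINE)

coroutine = fetch()
try:
    # Nothing here ever suspends, so the first send() runs to completion.
    coroutine.send(None)
except StopIteration as stop:
    result = stop.value

print(is_coroutine_code, result)  # True payload
```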

25.29 Imports

Example:

import os

Compilation pattern:

LOAD_CONST 0
LOAD_CONST None
IMPORT_NAME os
STORE_NAME os

Example:

from math import sin

Compilation pattern:

LOAD_CONST 0
LOAD_CONST ('sin',)
IMPORT_NAME math
IMPORT_FROM sin
STORE_NAME sin
POP_TOP

The compiler emits import operations. Actual module loading happens at runtime.

25.30 Assertions

Example:

assert x > 0

Compilation pattern:

evaluate condition
jump if true
raise AssertionError

Under optimization mode (python -O), assert statements are omitted entirely.

This is a compiler-level transformation.
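
The same switch is available through the optimize parameter of the built-in compile. In this sketch raises_assertion is a hypothetical helper that runs a code object with x set to -1:

```python
source = "assert x > 0, 'x must be positive'"

with_asserts = compile(source, "<example>", "exec", optimize=0)
without_asserts = compile(source, "<example>", "exec", optimize=1)

def raises_assertion(code):
    """Run `code` with a failing value of x and report whether it asserted."""
    try:
        exec(code, {"x": -1})
    except AssertionError:
        return True
    return False

print(raises_assertion(with_asserts))     # True
print(raises_assertion(without_asserts))  # False
```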

25.31 Source Locations and Line Tables

The compiler records mappings between bytecode and source positions.

These mappings support:

tracebacks
debuggers
profilers
coverage tools
stepping
error reporting

Each instruction range may correspond to:

line number
column offset
end line
end column

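The code object stores these mappings in a compressed table (co_linetable), which tools read through APIs such as co_lines() and, in 3.11 and later, co_positions().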
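
The line mapping can be read without decoding the table by hand. This sketch uses dis.findlinestarts, which is available across versions; the three-line function is made up for the example.

```python
import dis

source = "def f(a, b):\n    total = a + b\n    return total\n"
namespace = {}
exec(compile(source, "<example>", "exec"), namespace)
code = namespace["f"].__code__

# Each pair is (bytecode offset, source line that starts there).
line_starts = list(dis.findlinestarts(code))
lines = {line for _, line in line_starts}

for offset, line in line_starts:
    print(offset, line)
```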

25.32 Stack Size Computation

The compiler computes maximum stack depth.

Example:

a + b * c

Possible stack evolution:

a
a, b
a, b, c
a, temp
result

Maximum depth: 3.

This becomes co_stacksize.

Frames allocate enough stack space based on this value.
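
The computed value is exposed on the code object. This sketch prints co_stacksize for the expression above without asserting an exact number, since the compiler may reserve more than the strict minimum; dis.stack_effect reports the net effect of a single opcode.

```python
import dis

expression_code = compile("a + b * c", "<example>", "eval")
stack_size = expression_code.co_stacksize
print(stack_size)

# Net stack effect of one instruction: LOAD_NAME pushes exactly one value.
load_effect = dis.stack_effect(dis.opmap["LOAD_NAME"], 0)
print(load_effect)  # 1
```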

25.33 Basic Blocks

Internally, the compiler often groups instructions into basic blocks.

A basic block is a sequence of instructions with:

single entry
single exit
no internal jumps

Example:

if cond:
    a()
b()

Possible block structure:

block 1:
    evaluate cond
    conditional jump

block 2:
    a()
    jump

block 3:
    b()

Basic blocks simplify control-flow analysis and jump resolution.

25.34 Jump Resolution

The compiler initially emits symbolic labels.

Later assembly resolves actual instruction offsets.

Conceptual process:

emit instructions
emit labels
calculate instruction offsets
replace labels with offsets
insert extended arguments if needed
recalculate offsets if sizes changed

Jump resolution is one reason compilation is multi-stage.
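
The first four steps can be sketched as a two-pass toy assembler. It is deliberately simplified: resolve_jumps is a hypothetical helper, offsets are plain instruction indices, and extended arguments and relative jumps are ignored.

```python
def resolve_jumps(items):
    """Two-pass toy assembler over ("label", name) and (opname, argument) items."""
    # Pass 1: number the real instructions and remember where each label points.
    label_offsets = {}
    instructions = []
    for kind, argument in items:
        if kind == "label":
            label_offsets[argument] = len(instructions)
        else:
            instructions.append((kind, argument))
    # Pass 2: replace symbolic jump targets with numeric offsets.
    resolved = []
    for opname, argument in instructions:
        if "JUMP" in opname:
            argument = label_offsets[argument]
        resolved.append((opname, argument))
    return resolved

program = [
    ("LOAD_NAME", "cond"),
    ("POP_JUMP_IF_FALSE", "else_label"),
    ("LOAD_NAME", "a"),
    ("CALL", 0),
    ("POP_TOP", None),
    ("JUMP_FORWARD", "end_label"),
    ("label", "else_label"),
    ("LOAD_NAME", "b"),
    ("CALL", 0),
    ("POP_TOP", None),
    ("label", "end_label"),
    ("RETURN_VALUE", None),
]

resolved = resolve_jumps(program)
for offset, (opname, argument) in enumerate(resolved):
    print(offset, opname, argument)
```

Two passes are needed because a forward jump is emitted before the position of its label is known.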

25.35 Inline Caches and Specialization Support

Modern CPython bytecode supports adaptive specialization.

Since CPython 3.11, the instruction stream reserves inline cache entries directly after instructions that can be specialized.

Example operations benefiting from specialization:

attribute access
global lookup
binary operations
calls
method dispatch

Initial bytecode is generic.

Runtime specialization may later replace behavior with optimized fast paths.

The compiler prepares instruction layouts that allow this adaptation.

25.36 Bytecode Inspection

Use dis to inspect generated bytecode.

Example:

import dis

def f(a, b):
    return a + b

dis.dis(f)

Useful inspection functions include:

dis.dis
dis.Bytecode
dis.get_instructions

Bytecode inspection is essential for:

compiler debugging
performance analysis
tooling
education
reverse engineering Python behavior
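
The other two entry points return data rather than printing. A brief sketch of dis.get_instructions and dis.Bytecode on the same function:

```python
import dis

def f(a, b):
    return a + b

# get_instructions yields one Instruction object per bytecode instruction.
for instruction in dis.get_instructions(f):
    print(instruction.offset, instruction.opname, instruction.argrepr)

# Bytecode wraps a code object; it is iterable and can render a listing.
bytecode = dis.Bytecode(f)
opnames = [instruction.opname for instruction in bytecode]
listing = bytecode.dis()  # the disassembly as a string instead of printed output
```

Working with Instruction objects is usually more robust for tooling than parsing the printed listing.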

25.37 Version Sensitivity

Bytecode changes between CPython versions.

Changes may include:

new opcodes
removed opcodes
opcode renaming
instruction format changes
specialization changes
exception handling changes
line table changes

Tools should avoid depending on exact bytecode layouts across versions unless version-specific support exists.

Use public APIs rather than parsing raw bytecode manually whenever possible.
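
One practical habit is to look opcodes up by name at runtime instead of hard-coding numbers. A small sketch, assuming a tool that wants to know which of a few opcode names the running interpreter defines:

```python
import dis
import sys

# Opcode numbers are version-specific; resolve them by name when needed.
opcode_number = dis.opmap["RETURN_VALUE"]
round_trip = dis.opname[opcode_number]

# Names that come and go between versions can be probed before use.
candidates = ("BINARY_OP", "BINARY_ADD", "LOAD_METHOD")
available = {name: name in dis.opmap for name in candidates}

print(sys.version_info[:2], opcode_number, round_trip)
print(available)
```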

25.38 Important CPython Source Areas

Important files include:

Python/compile.c
Python/flowgraph.c
Python/assemble.c
Python/bytecodes.c
Include/opcode_ids.h
Lib/dis.py
Lib/opcode.py

Conceptual roles:

| Area | Role |
| --- | --- |
| compile.c | AST traversal and instruction emission |
| flowgraph.c | Control-flow graph handling |
| assemble.c | Bytecode assembly and jump resolution |
| bytecodes.c | Opcode definitions |
| opcode.py | Opcode metadata |
| dis.py | Bytecode inspection |

25.39 Minimal Mental Model

Use this model:

The compiler walks the AST.
Expressions emit stack-based instructions.
Statements emit side-effect and control-flow instructions.
Constants, names, and locals become indexed table references.
Control flow becomes jumps and exception metadata.
Functions, classes, generator expressions, and (unless inlined in 3.12 and later) comprehensions create nested code objects.
The final instruction stream becomes part of a code object executed by the CPython virtual machine.

Bytecode generation is the stage where Python syntax becomes executable virtual machine operations.