
26. Bytecode Optimization

Compile-time optimization passes: constant folding, dead code elimination, and jump simplification across Python/ast_opt.c and Python/flowgraph.c.

Bytecode optimization is the final cleanup and refinement stage before a code object is ready for execution.

Earlier compiler stages decide what the program means and emit instructions. Optimization tries to make those instructions smaller, simpler, or faster while preserving Python semantics.

For example:

x = 1 + 2

can compile as if written:

x = 3

because both operands are constants and integer addition has no user-visible side effect in this case.

But this:

x = a + b

cannot be folded because a and b are runtime values. Their types may define arbitrary behavior for +.

26.1 Position in the Compilation Pipeline

Optimization happens after or during bytecode generation, depending on the kind of optimization.

source
tokenization
parsing
AST
symbol table
instruction generation
optimization
assembly
code object

Some optimizations happen on the AST. Some happen on the control-flow graph or instruction stream. Some happen later at runtime through specialization.

Do not treat “optimization” as one single pass. CPython uses several mechanisms at different layers.

26.2 Optimization Must Preserve Semantics

Python is highly dynamic.

This limits what CPython can safely optimize.

Example:

x = 1 + 2

Safe to fold.

Example:

x = "a" + "b"

Safe to fold.

Example:

x = a + b

Unsafe to fold.

The operation may call:

a.__add__(b)

or:

b.__radd__(a)

Those methods can run arbitrary Python code.
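A small sketch of why `a + b` cannot be folded. The class name `Loud` is hypothetical; the point is that the right operand's `__radd__` runs arbitrary user code in the middle of what looks like simple addition:

```python
class Loud:
    """A type whose reflected addition runs arbitrary code."""
    def __radd__(self, other):
        # int.__add__ returns NotImplemented for Loud, so Python
        # falls back to Loud.__radd__ -- user code runs mid-expression.
        return ("radd", other)

result = 1 + Loud()
print(result)  # ('radd', 1)
```

No compile-time analysis of the literal `1` can predict this result; only runtime dispatch can.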

Optimization rule:

Only remove or precompute work when the observable behavior remains the same.

Observable behavior includes:

return values
exceptions
side effects
evaluation order
name lookup behavior
attribute lookup behavior
object identity when relevant
traceback positions
debugging behavior where specified

26.3 Constant Folding

Constant folding precomputes expressions made only from constants.

Examples:

1 + 2
2 * 3
"py" + "thon"
(1, 2) + (3, 4)

These can become constants.

Example:

def f():
    return 1 + 2

Possible bytecode:

LOAD_CONST 3
RETURN_VALUE

rather than:

LOAD_CONST 1
LOAD_CONST 2
BINARY_OP +
RETURN_VALUE

The compiler stores the folded value in co_consts.
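The folding is observable from Python itself. A quick check (the exact disassembly varies across CPython versions, but the folded constant 3 appears in `co_consts`):

```python
import dis

def f():
    return 1 + 2

# The AST optimizer folds 1 + 2 before bytecode generation,
# so 3 is stored as a constant instead of 1 and 2 separately.
print(f.__code__.co_consts)
dis.dis(f)
```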

26.4 Conservative Folding

CPython avoids folding expressions that could be too large or too expensive at compile time.

Example:

x = "a" * 10

may be folded.

But a huge repetition should not create an enormous constant during compilation.

x = "a" * 10_000_000_000

A compiler that blindly folded this would consume huge memory before the program even starts running.

Optimization must be bounded.

CPython’s optimizer is practical: it avoids turning compilation itself into an unexpected resource drain.
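Both behaviors can be observed by compiling the two statements. The exact folding limits are implementation details and may differ across CPython versions, but the contrast is visible in `co_consts`:

```python
# A small repetition may be folded into a ten-character constant.
small = compile("x = 'a' * 10", "<demo>", "exec")
print(small.co_consts)

# The optimizer refuses to materialize a 10 GB string at compile
# time, so this compiles quickly and the multiply stays in bytecode.
huge = compile("x = 'a' * 10_000_000_000", "<demo>", "exec")
print(huge.co_consts)
```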

26.5 Tuple and Container Folding

Tuples containing only constants can be folded.

Example:

x = (1, 2, 3)

can load a single tuple constant.

But lists and sets are mutable, so they cannot generally become constants.

x = [1, 2, 3]

must create a new list at runtime.

Why?

Each execution needs a fresh mutable object.

This function must return a different list object each call:

def f():
    return [1, 2, 3]

If the compiler stored one shared list constant, mutation would leak across calls.
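This is directly testable: equal contents, distinct objects, and no mutation leaking between calls.

```python
def f():
    return [1, 2, 3]

a, b = f(), f()
assert a == b          # equal contents ...
assert a is not b      # ... but a fresh list object per call

a.append(4)
assert f() == [1, 2, 3]  # mutation does not leak into later calls
```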

26.6 Membership Test Optimization

For membership tests, mutable literals can sometimes be replaced by immutable constants.

Example:

def is_small(x):
    return x in {1, 2, 3}

The set literal is used only for membership testing. CPython may replace it with a frozenset constant.

Conceptually:

LOAD_FAST x
LOAD_CONST frozenset({1, 2, 3})
CONTAINS_OP
RETURN_VALUE

This avoids rebuilding the set every time the function runs.

The optimization is valid because the set object itself is not exposed to user code.
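The substitution shows up in the code object's constants. This is longstanding CPython behavior, but it remains an implementation detail rather than a language guarantee:

```python
def is_small(x):
    return x in {1, 2, 3}

# The set literal never exists at runtime; a frozenset constant
# is stored in co_consts instead.
print(is_small.__code__.co_consts)
print(is_small(2), is_small(9))
```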

26.7 Dead Code After Unconditional Exit

Some code becomes unreachable after an unconditional exit.

Example:

def f():
    return 1
    x = 2

The assignment cannot execute.

The compiler may remove or skip unreachable instruction sequences.

Unconditional exits include:

return
raise
break
continue

within the right control-flow context.

The compiler must still preserve line information and diagnostics where required. Optimization cannot freely erase everything if it changes debugging or syntax-related behavior in visible ways.
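Whether a particular unreachable store survives is version-dependent, but on recent CPython releases the assignment below never reaches the final bytecode:

```python
import dis

def f():
    return 1
    x = 2  # unreachable: no STORE_FAST is emitted for this

ops = [i.opname for i in dis.get_instructions(f)]
print(ops)
```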

26.8 Jump Optimization

Control flow often creates jumps to jumps.

Example shape:

JUMP_FORWARD label_a

label_a:
JUMP_FORWARD label_b

label_b:
...

The optimizer can redirect the first jump directly to label_b.

JUMP_FORWARD label_b

label_b:
...

This reduces instruction count and avoids unnecessary dispatch.

Jump optimization is usually safe because it preserves the same control-flow destination.

26.9 Basic Block Cleanup

The compiler builds or uses basic blocks internally.

A basic block is a straight-line instruction sequence with one entry and one exit.

Optimization can remove empty blocks, merge adjacent blocks, or simplify edges.

Example:

block A:
    jump block B

block B:
    jump block C

block C:
    useful work

can become:

block A:
    jump block C

block C:
    useful work

This cleanup happens before final assembly, when labels are still symbolic.

26.10 Stack Effect Preservation

Every optimization must preserve stack correctness.

Example:

LOAD_CONST 1
POP_TOP

If the loaded constant has no side effect and its value is discarded, this pair could be removed.

But this:

LOAD_NAME x
POP_TOP

cannot generally be removed.

Loading x can raise NameError.

Likewise:

LOAD_ATTR attr
POP_TOP

can call dynamic attribute lookup and raise exceptions.

The optimizer must know which operations are safe to remove.
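A sketch of why the name load cannot be deleted even though its value is discarded: the lookup itself is the observable behavior.

```python
def f():
    undefined_name  # value discarded, but the lookup itself can raise

try:
    f()
    outcome = "no error"
except NameError:
    outcome = "NameError"

print(outcome)  # NameError
```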

26.11 Evaluation Order Constraints

Python evaluation order is observable.

Example:

f() + g()

f() must run before g().

Even if the result of f() seems unused in some transformed form, the call cannot be skipped or reordered.

Example:

items[key()] = value()

The calls happen in a specified order. The compiler must preserve that order.

Optimization cannot treat Python expressions like pure mathematical expressions.
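Evaluation order is easy to observe with side effects. The helper names below are illustrative:

```python
calls = []

def f():
    calls.append("f")
    return 1

def g():
    calls.append("g")
    return 2

total = f() + g()
assert total == 3
assert calls == ["f", "g"]  # left operand strictly before the right
```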

26.12 Name Lookup Constraints

Name lookup is dynamic.

Example:

len([1, 2, 3])

CPython cannot generally replace this with 3.

Why?

At module level, len may be rebound:

len = lambda x: 999

Inside a function, len may resolve through globals before builtins.

The compiler does not assume that the builtin len is unchanged.

This is a key difference between Python optimization and optimization in statically linked languages.
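A short demonstration of why the compiler must not constant-fold calls to builtins. The function name `size` is illustrative:

```python
def size(xs):
    return len(xs)  # resolved via globals, then builtins, at call time

assert size([1, 2, 3]) == 3

len = lambda xs: 999  # module-level rebinding shadows the builtin
assert size([1, 2, 3]) == 999

del len               # restore normal builtin lookup
assert size([1, 2, 3]) == 3
```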

26.13 Attribute Lookup Constraints

Attribute lookup is dynamic.

Example:

obj.x

This can invoke:

__getattribute__
__getattr__
descriptors
properties
module-level __getattr__
custom metaclass logic

The compiler cannot replace repeated attribute access with a cached value unless it can prove that behavior is unchanged.

Runtime specialization may optimize common cases speculatively, but that is guarded by runtime checks. Compile-time optimization remains conservative.
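A minimal sketch (the class name `Lazy` is hypothetical) of why repeated attribute access cannot be blindly cached at compile time:

```python
class Lazy:
    def __getattr__(self, name):
        # Reached only when normal lookup fails; may run arbitrary code.
        return f"computed:{name}"

obj = Lazy()
assert obj.x == "computed:x"
assert obj.x == "computed:x"  # each access invokes __getattr__ again
```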

26.14 assert Optimization

assert statements are affected by optimization mode.

Example:

assert x > 0, "x must be positive"

Normally compiles roughly as:

if not (x > 0):
    raise AssertionError("x must be positive")

When Python runs with optimization enabled, such as python -O, assert statements are removed.

This changes behavior intentionally. It is part of Python’s documented optimization mode.

Practical rule:

Do not use assert for required runtime validation.

Use explicit exceptions for checks that must always run.
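The removal can be demonstrated with `compile()`'s `optimize` parameter, without restarting the interpreter under `-O`:

```python
src = "assert False, 'boom'"

plain = compile(src, "<demo>", "exec", optimize=0)
optimized = compile(src, "<demo>", "exec", optimize=1)

try:
    exec(plain)
    raised = False
except AssertionError:
    raised = True

assert raised          # optimize=0 keeps the assert
exec(optimized)        # optimize=1 strips it: no exception here
```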

26.15 __debug__

The special constant __debug__ is tied to optimization mode.

Normally:

__debug__ == True

Under python -O:

__debug__ == False

The compiler can optimize branches depending on __debug__.

Example:

if __debug__:
    check()

In optimized mode, this branch can be omitted.

__debug__ is special. Ordinary names do not receive this treatment.
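The branch elimination is visible through `compile()` as well:

```python
src = "if __debug__:\n    raise RuntimeError('debug branch ran')"

# At optimize=1 the compiler treats __debug__ as False and can drop
# the branch entirely, so executing the result raises nothing.
exec(compile(src, "<demo>", "exec", optimize=1))

try:
    exec(compile(src, "<demo>", "exec", optimize=0))
    branch_ran = False
except RuntimeError:
    branch_ran = True

assert branch_ran
```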

26.16 Docstring Handling

Optimization levels can affect docstrings.

Example:

def f():
    """Function docs."""
    return 1

Normally, the first string literal in a function body becomes the function docstring.

With higher optimization, such as python -OO, docstrings may be removed.

This affects:

f.__doc__

The compiler and runtime treat docstrings specially because they are stored as object metadata rather than executed like ordinary expression statements.
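The effect of level 2 (the `-OO` equivalent) can be checked by compiling the same source at two levels:

```python
src = 'def f():\n    """Function docs."""\n    return 1\n'

ns0, ns2 = {}, {}
exec(compile(src, "<demo>", "exec", optimize=0), ns0)
exec(compile(src, "<demo>", "exec", optimize=2), ns2)

assert ns0["f"].__doc__ == "Function docs."
assert ns2["f"].__doc__ is None   # docstring stripped at level 2
```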

26.17 Peephole-Style Optimization

Older descriptions of CPython often mention a peephole optimizer.

A peephole optimizer looks at a small window of instructions and replaces inefficient patterns.

Example idea:

LOAD_CONST 1
LOAD_CONST 2
BUILD_TUPLE 2

can become:

LOAD_CONST (1, 2)

Modern CPython has moved much optimization into AST and control-flow/instruction stages rather than relying only on a simple peephole pass.

The concept remains useful: some optimizations are local rewrites over a small instruction pattern.
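The tuple rewrite above is one such local pattern, and it is easy to verify on current CPython (where the AST optimizer performs it):

```python
import dis

def f():
    return (1, 2)

ops = [i.opname for i in dis.get_instructions(f)]
assert "BUILD_TUPLE" not in ops          # no runtime tuple construction
assert (1, 2) in f.__code__.co_consts    # a single tuple constant instead
```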

26.18 AST-Level Optimization

Some optimizations are easier before bytecode exists.

Example:

x = 1 + 2

The AST contains:

BinOp(Constant(1), Add, Constant(2))

An AST optimizer can replace that node with:

Constant(3)

Then bytecode generation simply loads 3.

AST-level optimization sees high-level structure, which can be easier than reconstructing meaning from bytecode.
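The `ast` module makes the boundary visible: parsing produces the unfolded `BinOp`, and folding happens during `compile()`:

```python
import ast

tree = ast.parse("x = 1 + 2")
node = tree.body[0].value
assert isinstance(node, ast.BinOp)   # parsing alone does not fold

code = compile(tree, "<demo>", "exec")
assert 3 in code.co_consts           # folding happened during compile
```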

26.19 Instruction-Level Optimization

Other optimizations are easier after instruction generation.

Examples:

remove unreachable blocks
redirect jumps
merge blocks
compute final stack sizes
shorten control flow

Instruction-level optimization sees actual control flow and stack effects.

It is closer to the final bytecode layout.

26.20 Runtime Specialization Is Separate

Modern CPython also has runtime specialization.

This is not the same as compile-time optimization.

The compiler emits generic bytecode. During execution, CPython may specialize frequently used instructions.

Example operations:

LOAD_ATTR
LOAD_GLOBAL
BINARY_OP
CALL
STORE_ATTR

A generic attribute access may become a faster specialized path after CPython observes stable runtime behavior.

Compile-time optimization:

happens before execution
must be safe without runtime knowledge

Runtime specialization:

happens during execution
uses observed runtime types and cache guards
can fall back if assumptions fail

This separation allows CPython to optimize dynamic code without making unsafe compile-time assumptions.

26.21 Inline Caches

Runtime specialization uses inline caches near bytecode instructions.

Conceptually:

LOAD_ATTR name
CACHE
CACHE

The cache stores runtime information such as:

observed type
dictionary version
descriptor information
resolved offset
specialized handler state

The compiler prepares space for caches. The interpreter fills and uses them later.

This is why disassembly may show cache entries when requested.
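A sketch of counting reserved cache slots on CPython 3.11+, where `CACHE` is a real opcode occupying space in `co_code` (earlier versions have no inline caches, so the count stays `None`):

```python
import dis
import sys

def f(obj):
    return obj.x  # LOAD_ATTR reserves inline cache space on 3.11+

n_cache = None
if sys.version_info >= (3, 11):
    cache = dis.opmap["CACHE"]
    # Opcodes occupy the even bytes of co_code; CACHE entries sit inline.
    n_cache = sum(1 for op in f.__code__.co_code[::2] if op == cache)
    print(n_cache, "inline cache entries in f")
```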

26.22 Optimization and Tracebacks

Optimization must preserve useful source positions.

Example:

x = 1 / 0

Even if surrounding code is optimized, a ZeroDivisionError should point to a meaningful source location.

Tracebacks rely on bytecode-to-source mappings.

The optimizer must update or preserve:

line information
column information
exception table ranges
instruction offsets

Broken metadata produces bad debugging behavior.
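The mapping is observable through the traceback machinery: the innermost frame names the failing function and carries a real line number.

```python
import sys
import traceback

def f():
    return 1 / 0  # the traceback should point at this line

try:
    f()
except ZeroDivisionError:
    frame = traceback.extract_tb(sys.exc_info()[2])[-1]
    print(frame.name, frame.lineno)
```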

26.23 Optimization and Exceptions

Some expressions look simple but can raise exceptions.

Example:

1 / 0

The compiler should not fold this into a compile-time exception during normal compilation.

The exception must occur when the code executes, not when it is compiled.

Example:

def f():
    return 1 / 0

Defining f should succeed. Calling f() should raise ZeroDivisionError.

Therefore constant folding must avoid transformations that move exceptions from runtime to compile time.
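Compiling and defining succeed even though the body is guaranteed to fail; the exception is deferred to call time:

```python
# compile() does not fold 1 / 0 into a compile-time error.
code = compile("def f():\n    return 1 / 0\n", "<demo>", "exec")
ns = {}
exec(code, ns)            # definition time: no error

try:
    ns["f"]()
    raised = False
except ZeroDivisionError:
    raised = True

assert raised             # the error occurs only when f is called
```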

26.24 Optimization and Object Identity

Object identity can be observable through is.

Example:

x is y

The compiler must avoid transformations that change identity-sensitive behavior.

For immutable constants, CPython may reuse objects, intern strings, or fold constants in ways allowed by implementation behavior. But code should not rely on all constant identity details.

Example:

(1, 2) is (1, 2)

The result may depend on compilation and implementation choices.

Optimization must preserve language semantics, while some identity behavior for constants remains implementation-specific.

26.25 Optimization and Mutable Defaults

Function default values are evaluated at function definition time.

Example:

def f(xs=[]):
    xs.append(1)
    return xs

The list default is a runtime object stored on the function object.

The compiler cannot replace this with a fresh list per call because that would change semantics.

This is not an optimization bug. It is Python’s function definition behavior.

Optimization must respect when objects are created.
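The classic demonstration: the default list is created once, at definition time, and shared across calls.

```python
def f(xs=[]):
    xs.append(1)
    return xs

assert f() == [1]
assert f() == [1, 1]             # same default object, mutated each call
assert f.__defaults__[0] is f()  # the default lives on the function object
```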

26.26 Optimizing Conditional Constants

Constant conditions can simplify control flow.

Example:

if False:
    x = 1
else:
    x = 2

The compiler may omit the unreachable branch.

Similarly:

while False:
    work()

The body cannot execute.

But the optimizer must be careful with source metadata and with constructs that affect scope or syntax.

Example:

if False:
    import never_loaded

The import does not execute, but the presence of assignments in unreachable blocks can still affect local variable classification because symbol table analysis happens before optimization.

26.27 Scope Is Decided Before Optimization

Symbol table analysis sees the whole AST before optimization removes unreachable code.

Example:

x = 1

def f():
    if False:
        x = 2
    return x

The unreachable assignment still makes x local to f, so return x raises UnboundLocalError.

The optimizer cannot simply erase the assignment and reclassify x as global; that would be a semantic change, not an optimization.

Scope classification is part of Python semantics and happens before many optimizations.
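The interaction can be demonstrated directly: the dead branch never runs, yet it still determines the scope of x.

```python
x = 1

def f():
    if False:
        x = 2   # never executes, but still marks x local to f
    return x

try:
    f()
    outcome = "returned"
except UnboundLocalError:
    outcome = "UnboundLocalError"

print(outcome)  # UnboundLocalError
```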

26.28 Optimization Levels

CPython supports optimization levels.

Common modes:

Mode      Meaning
normal    Default compilation
-O        Remove assert statements, set __debug__ to False
-OO       Also remove docstrings

At Python level:

import sys

print(sys.flags.optimize)

Compilation APIs can also receive optimization settings.

Example:

compile("assert False", "<input>", "exec", optimize=1)

Optimization level is part of compilation behavior.

26.29 Limits of Compile-Time Optimization

CPython intentionally avoids aggressive compile-time optimization.

It does not generally perform:

function inlining
global constant propagation
type-specialized compilation
loop unrolling
escape analysis
whole-program optimization
automatic vectorization

Reasons:

dynamic name lookup
dynamic attribute access
monkey patching
import-time side effects
debuggability
compile-time cost
implementation simplicity
semantic risk

CPython instead relies on:

simple safe compile-time optimizations
specialized C implementations of built-in types
runtime adaptive specialization
extension modules
external JIT implementations in other runtimes

26.30 Important CPython Source Areas

Important source files include:

Python/ast_opt.c
Python/compile.c
Python/flowgraph.c
Python/assemble.c
Python/bytecodes.c
Lib/dis.py
Lib/test/test_peepholer.py
Lib/test/test_compile.py

Conceptual roles:

Area            Role
ast_opt.c       AST-level optimization
compile.c       AST-to-instruction generation
flowgraph.c     Control-flow graph work
assemble.c      Final code assembly
bytecodes.c     Opcode definitions and runtime support
dis.py          Bytecode inspection
tests           Protect optimizer behavior

26.31 Minimal Mental Model

Use this model:

CPython optimization is conservative.
It folds safe constants.
It simplifies some containers and membership tests.
It removes or redirects some unreachable or redundant control flow.
It respects evaluation order, exceptions, dynamic lookup, scope rules, and source positions.
Compile-time optimization differs from runtime specialization.
Runtime specialization can use observed types and guarded inline caches.

Bytecode optimization improves generated code while staying inside Python’s dynamic semantics.