# 54. `dis`

The `dis` module exposes CPython bytecode in a readable form. It is the standard tool for seeing what CPython actually executes after source code has passed through the compiler.

Python source code is not executed directly. CPython compiles it into code objects. Each code object contains bytecode, constants, names, local variable metadata, exception handling metadata, and source position information. The `dis` module reads those code objects and formats their bytecode instructions.

## 54.1 The Role of `dis`

`dis` answers a narrow but important question:

```text
What bytecode did CPython generate for this code?
```

Example:

```python
import dis

def add(a, b):
    return a + b

dis.dis(add)
```

The output shows bytecode instructions for the function body.

A typical disassembly contains:

```text
source line numbers
bytecode offsets
instruction names
instruction operands
resolved operand meanings
jump targets
exception table information
```

The exact output depends on the Python version. CPython bytecode is not a stable external interface. It changes as the compiler and interpreter evolve.

## 54.2 Source Code to Bytecode

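The `dis` module is not a compilation stage itself. It inspects the code object that the compilation pipeline produces, so it is easiest to picture at the end of that pipeline.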

```text
source text
    ↓
tokens
    ↓
parser
    ↓
AST
    ↓
symbol table
    ↓
compiler
    ↓
code object
    ↓
disassembly
```

For this function:

```python
def square(x):
    return x * x
```

CPython creates a function object. That function object contains a code object:

```python
print(square.__code__)
```

The bytecode lives inside:

```python
square.__code__.co_code
```

`co_code` is a bytes object. It stores encoded instructions. Reading it directly is possible, but unpleasant. `dis` decodes it into instruction records.
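To make "unpleasant" concrete, here is a minimal sketch of reading the raw bytes by hand. It assumes the two-byte code unit layout (opcode byte, argument byte) that CPython has used since 3.6, and it ignores extended arguments and the meaning of any cache entries, which is exactly the bookkeeping `dis` does for you.

```python
import dis

def square(x):
    return x * x

raw = square.__code__.co_code

# Assumption: each code unit is two bytes, an opcode followed by a one-byte argument.
units = [(raw[i], raw[i + 1]) for i in range(0, len(raw), 2)]

for opcode, arg in units:
    # dis.opname maps a numeric opcode back to its name.
    print(dis.opname[opcode], arg)

first_decoded = next(iter(dis.get_instructions(square)))
```

The names printed here depend on the interpreter version, and on newer versions some units are cache placeholders rather than real instructions.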

## 54.3 Code Objects

`dis` works mainly on code objects.

You can pass a function directly:

```python
dis.dis(square)
```

or pass its code object:

```python
dis.dis(square.__code__)
```

A code object contains fields such as:

| Field | Meaning |
|---|---|
| `co_code` | Raw bytecode bytes |
| `co_consts` | Constants used by the code |
| `co_names` | Global and attribute names |
| `co_varnames` | Local variable names |
| `co_cellvars` | Cell variables created for nested functions |
| `co_freevars` | Free variables captured from outer scopes |
| `co_filename` | Source filename |
| `co_name` | Code object name |
| `co_qualname` | Qualified code object name |
| `co_firstlineno` | First source line |
| `co_flags` | Execution flags |
| `co_stacksize` | Required value stack size |

The bytecode references these tables by index. For example, an instruction that loads a constant usually stores an integer operand. That operand indexes into `co_consts`.

Conceptually:

```text
LOAD_CONST 1
    ↓
co_consts[1]
```
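The index relationship can be checked directly. The sketch below walks the instructions of a small illustrative function (`greet` is a made-up name) and, for every opcode that `dis.hasconst` lists as taking a constant operand, looks the raw argument up in `co_consts`.

```python
import dis

def greet():
    return "hello"

code = greet.__code__
resolved = []

for instr in dis.get_instructions(greet):
    if instr.opcode in dis.hasconst:
        # instr.arg is the raw index; instr.argval is the object dis resolved for us.
        constant = code.co_consts[instr.arg]
        resolved.append((instr.opname, instr.arg, constant, instr.argval))
        print(instr.opname, instr.arg, "->", repr(constant))
```

Which opcode carries the constant differs between versions, so the example filters on `dis.hasconst` rather than on a specific opcode name.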

## 54.4 Instructions

`dis` exposes bytecode as instructions.

At the Python level, each decoded instruction can be represented by `dis.Instruction`.

Example:

```python
import dis

def f(x):
    return x + 1

for instr in dis.get_instructions(f):
    print(instr)
```

An instruction record includes:

| Attribute | Meaning |
|---|---|
| `opname` | Human-readable opcode name |
| `opcode` | Numeric opcode value |
| `arg` | Raw integer argument |
| `argval` | Resolved argument value |
| `argrepr` | Display form of argument |
| `offset` | Bytecode offset |
| `starts_line` | Source line that begins at this instruction, if any (a boolean flag from Python 3.13, where `line_number` holds the line) |
| `is_jump_target` | Whether another instruction jumps here |
| `positions` | Source span information |

This object form is better than parsing printed `dis.dis()` output.
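As a small illustration of working with instruction records rather than text, the following sketch prints a compact listing and builds a histogram of opcode names. The column widths are arbitrary choices.

```python
import dis
from collections import Counter

def f(x):
    return x + 1

instructions = list(dis.get_instructions(f))

# Format selected attributes ourselves instead of scraping dis.dis() output.
for instr in instructions:
    print(f"{instr.offset:>4} {instr.opname:<20} {instr.argrepr}")

# Count how often each opcode name occurs.
histogram = Counter(instr.opname for instr in instructions)
print(histogram.most_common())
```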

## 54.5 The Evaluation Stack

CPython bytecode is stack-based.

Most instructions push values onto a stack, pop values from it, or transform values already on it.

Example:

```python
def f(a, b):
    return a + b
```

Conceptual bytecode behavior:

```text
LOAD_FAST a      push local a
LOAD_FAST b      push local b
BINARY_OP +      pop b, pop a, compute a + b, push result
RETURN_VALUE     pop result and return it
```

The stack is not the C call stack. It is an evaluation stack stored in the currently executing Python frame.

A frame contains:

```text
code object
instruction pointer
locals
globals
builtins
value stack
block and exception state
```

`dis` lets you see how source expressions are translated into stack operations.
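The stack discipline can be imitated with a toy interpreter. This is only a sketch of the idea: `run` and its tuple-based program format are invented for illustration and have nothing to do with how CPython's real evaluation loop is written.

```python
def run(program, local_vars):
    """Execute a tiny, made-up instruction list on an explicit value stack."""
    stack = []
    for opname, arg in program:
        if opname == "LOAD_FAST":
            stack.append(local_vars[arg])      # push a local variable
        elif opname == "BINARY_OP" and arg == "+":
            right = stack.pop()                # operands come off in reverse order
            left = stack.pop()
            stack.append(left + right)         # push the result
        elif opname == "RETURN_VALUE":
            return stack.pop()                 # pop the result and return it
        else:
            raise ValueError(f"unsupported instruction: {opname} {arg}")
    raise RuntimeError("program ended without RETURN_VALUE")

program = [
    ("LOAD_FAST", "a"),
    ("LOAD_FAST", "b"),
    ("BINARY_OP", "+"),
    ("RETURN_VALUE", None),
]

result = run(program, {"a": 2, "b": 3})
print(result)
```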

## 54.6 Constants

Constants are stored in `co_consts`.

Example:

```python
def f():
    return 10 + 20
```

The compiler may fold constant expressions. Disassembly may show that the function simply loads `30`.

Conceptually:

```text
LOAD_CONST 30
RETURN_VALUE
```

Inspect the constants:

```python
print(f.__code__.co_consts)
```

Possible output shape:

```text
(None, 30)
```

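The first constant is often `None`. On many CPython versions that slot traditionally holds the function's docstring, or `None` when there is none, and `None` is also needed by any function that falls off the end. Newer versions may lay out `co_consts` differently, so treat the shape above as illustrative.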
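Constant folding can be observed without relying on an exact listing. The sketch below uses a hypothetical function `seconds_per_day` and checks two things: the folded product appears in `co_consts`, and no binary-operation instruction survives in the bytecode.

```python
import dis

def seconds_per_day():
    return 60 * 60 * 24

opnames = [instr.opname for instr in dis.get_instructions(seconds_per_day)]

# If the compiler folded the expression, the product is stored as a single constant.
folded = 86400 in seconds_per_day.__code__.co_consts

print(opnames)
print("folded:", folded)
```

A value larger than a small integer is used on purpose, since some versions load very small integers without going through `co_consts`.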

## 54.7 Local Variables

Local variables are stored in fast local slots.

Example:

```python
def f(x):
    y = x + 1
    return y
```

Relevant instructions may include:

```text
LOAD_FAST x
LOAD_CONST 1
BINARY_OP +
STORE_FAST y
LOAD_FAST y
RETURN_VALUE
```

`LOAD_FAST` and `STORE_FAST` use indexes into `co_varnames`.

```python
print(f.__code__.co_varnames)
```

Output shape:

```text
('x', 'y')
```

Fast locals are array-like slots, not ordinary dictionary lookups. This is one reason local variable access is faster than global variable access.
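A quick way to see which instructions touch fast local slots is to filter on `dis.haslocal`. This is a sketch; the exact opcode names in the result vary by version, and newer interpreters may fuse two local accesses into one instruction.

```python
import dis

def f(x):
    y = x + 1
    return y

code = f.__code__
print(code.co_varnames, code.co_nlocals)

# Keep only instructions whose opcode is documented as accessing a local variable.
local_ops = [
    (instr.opname, instr.argrepr)
    for instr in dis.get_instructions(f)
    if instr.opcode in dis.haslocal
]
print(local_ops)
```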

## 54.8 Globals and Builtins

Global names are resolved through module globals and then builtins.

Example:

```python
def f(x):
    return len(x)
```

Disassembly may include:

```text
LOAD_GLOBAL len
LOAD_FAST x
CALL
RETURN_VALUE
```

The name `len` is stored in `co_names`.

```python
print(f.__code__.co_names)
```

Output shape:

```text
('len',)
```

At runtime, `LOAD_GLOBAL` searches:

```text
function globals
    ↓
builtins
```

Modern CPython specializes global lookups with inline caches, but the semantic model remains the same.
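The lookup order can be demonstrated at runtime. In this sketch the function is defined inside a separate namespace (the `namespace` dict and the `count` function are illustrative names), so that a global named `len` can be added later to shadow the builtin.

```python
source = "def count(x):\n    return len(x)\n"

namespace = {}
exec(source, namespace)            # namespace becomes the function's globals
count = namespace["count"]

before = count("abc")              # no global 'len' yet, so builtins supplies it

namespace["len"] = lambda x: -1    # a global now shadows the builtin
after = count("abc")

print(count.__code__.co_names, before, after)
```

The function body never changes. Only the contents of its globals dictionary do, which is why the result differs.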

## 54.9 Attribute Access

Attribute access compiles into bytecode that loads the base object and then performs attribute lookup.

Example:

```python
def f(obj):
    return obj.name
```

Conceptual instructions:

```text
LOAD_FAST obj
LOAD_ATTR name
RETURN_VALUE
```

The attribute name lives in `co_names`.

```python
print(f.__code__.co_names)
```

Output shape:

```text
('name',)
```

`LOAD_ATTR` connects bytecode execution to the descriptor protocol, instance dictionaries, slots, type dictionaries, and method resolution order.

So a single visible expression:

```python
obj.name
```

may invoke a large amount of runtime machinery.
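To illustrate, the same bytecode for `obj.name` can reach very different code depending on the object. The two classes below are hypothetical examples: one stores `name` in the instance dictionary, the other computes it through a property, which is a descriptor.

```python
import dis

class Plain:
    def __init__(self):
        self.name = "plain"        # stored in the instance dictionary

class Celsius:
    def __init__(self, degrees):
        self._degrees = degrees

    @property
    def name(self):                # a descriptor found on the type
        return f"{self._degrees} C"

def f(obj):
    return obj.name

opnames = [instr.opname for instr in dis.get_instructions(f)]
print(opnames)
print(f(Plain()), "|", f(Celsius(20)))
```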

## 54.10 Function Calls

Function calls compile into several bytecode operations.

Example:

```python
def f(g, x):
    return g(x)
```

Conceptual flow:

```text
load callable
load arguments
call callable
return result
```

Modern CPython uses call-oriented instructions designed to support fast paths, vectorcall, method calls, and specialization.

The exact instruction names differ across versions. Python 3.11, for example, may emit instructions such as the following (`PRECALL` exists only in that release and was removed in Python 3.12):

```text
PUSH_NULL
LOAD_FAST
PRECALL
CALL
RETURN_VALUE
```

or newer equivalents depending on the CPython version.

The important model:

```text
bytecode prepares callable and arguments
interpreter dispatches through call protocol
callee creates or reuses a frame if it is Python code
result is pushed onto the caller stack
```
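Because the call instructions are version-specific, a tool should discover them rather than hard-code them. A minimal sketch that lists whatever call-related opcodes the running interpreter emits for this function:

```python
import dis

def f(g, x):
    return g(x)

# Match on the substring "CALL" so the check works across opcode renames.
call_ops = [
    instr.opname
    for instr in dis.get_instructions(f)
    if "CALL" in instr.opname
]

print(call_ops)
print(f(str, 5))
```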

## 54.11 Control Flow

Control flow compiles into jumps.

Example:

```python
def sign(x):
    if x < 0:
        return -1
    return 1
```

Conceptually:

```text
load x
load 0
compare <
jump if false to else path
load -1
return
load 1
return
```

`dis` marks jump targets and offsets.

Control flow features that generate jumps include:

```text
if statements
while loops
for loops
boolean operations
conditional expressions
try and exception handling
match statements
comprehensions
```

Jumps are bytecode-level changes to the instruction pointer.
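Jump instructions and their targets can be extracted from instruction records. The sketch below treats any opcode whose name contains `JUMP` as a jump, which is a naming heuristic rather than an official classification, and relies on `argval` holding the target offset for such instructions.

```python
import dis

def sign(x):
    if x < 0:
        return -1
    return 1

instructions = list(dis.get_instructions(sign))
offsets = {instr.offset for instr in instructions}

# (source offset, opcode name, target offset) for each jump.
jumps = [
    (instr.offset, instr.opname, instr.argval)
    for instr in instructions
    if "JUMP" in instr.opname
]

# Offsets that dis itself marks as being jumped to.
targets = [instr.offset for instr in instructions if instr.is_jump_target]

print(jumps)
print(targets)
```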

## 54.12 Loops

A `for` loop compiles into iterator protocol operations.

Example:

```python
def total(xs):
    s = 0
    for x in xs:
        s += x
    return s
```

Conceptual operations:

```text
initialize s
load xs
get iterator
loop:
    get next item
    if exhausted, jump after loop
    store x
    update s
    jump loop
return s
```

This corresponds to Python’s iteration protocol:

```text
iter(obj)
next(iterator)
StopIteration ends loop
```

The bytecode hides exception details behind iterator instructions. The language-level loop is built from a small number of VM operations.
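The protocol steps above can be written out by hand. `total_manual` below is an illustrative rewrite of the loop using explicit `iter()` and `next()` calls. It behaves the same way, but it makes visible the `StopIteration` handling that the iterator instructions hide.

```python
import dis

def total(xs):
    s = 0
    for x in xs:
        s += x
    return s

def total_manual(xs):
    s = 0
    iterator = iter(xs)            # what the "get iterator" step does
    while True:
        try:
            x = next(iterator)     # what the "get next item" step does
        except StopIteration:      # exhaustion ends the loop
            break
        s += x
    return s

loop_ops = [instr.opname for instr in dis.get_instructions(total)]
print([name for name in loop_ops if "ITER" in name])
```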

## 54.13 Exceptions

Exception handling has specialized bytecode and metadata.

Example:

```python
def f(x):
    try:
        return 10 / x
    except ZeroDivisionError:
        return None
```

Disassembly includes normal instructions plus exception table information.

Since Python 3.11, CPython uses exception tables rather than the older block stack opcodes for much of exception control flow. This keeps the bytecode stream cleaner, while exception ranges and handlers are stored separately in the code object.

Conceptually:

```text
protected bytecode range
    ↓ on exception
handler target
```

The `dis` output can show the exception table, which maps instruction ranges to handlers.
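One way to check whether the running interpreter prints an exception table is to capture the listing as text. This is a sketch for exploration only; it searches for the `ExceptionTable` label that `dis` prints on versions with exception tables, which is the kind of text matching that production tools should avoid.

```python
import dis
import io
import sys

def f(x):
    try:
        return 10 / x
    except ZeroDivisionError:
        return None

buffer = io.StringIO()
dis.dis(f, file=buffer)            # write the listing to a string instead of stdout
listing = buffer.getvalue()

has_table = "ExceptionTable" in listing
print(sys.version_info[:2], "exception table shown:", has_table)
```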

## 54.14 Comprehensions

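A comprehension has historically compiled to a nested code object. Since Python 3.12 (PEP 709), list, dict, and set comprehensions are inlined into the enclosing function, while generator expressions still get their own code object.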

Example:

```python
def f(xs):
    return [x * 2 for x in xs]
```

On Python 3.11 and earlier, the outer function contains a constant that is itself a code object for the list comprehension. On 3.12 and later, no such constant appears for this example.

```python
print(f.__code__.co_consts)
```

You can disassemble nested code objects manually:

```python
for const in f.__code__.co_consts:
    if isinstance(const, type(f.__code__)):
        dis.dis(const)
```

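This matters because a comprehension's iteration variable is isolated from the enclosing scope, whether through a nested code object or through inlining.

Conceptually, when a nested code object is created: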

```text
outer function code object
    ↓
nested listcomp code object
```
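A recursive walk finds nested code objects at any depth. The helper name `iter_code_objects` is an assumption made for this sketch. The example uses a generator expression, which produces a nested code object on every current version. Note that `dis.dis()` itself has recursed into nested code objects since Python 3.7.

```python
import types

def iter_code_objects(code):
    """Yield a code object and, recursively, every code object in its constants."""
    yield code
    for const in code.co_consts:
        if isinstance(const, types.CodeType):
            yield from iter_code_objects(const)

def f(xs):
    return sum(x * 2 for x in xs)

names = [code.co_name for code in iter_code_objects(f.__code__)]
print(names)
```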

## 54.15 Closures

Closures use cell variables and free variables.

Example:

```python
def outer(x):
    def inner():
        return x
    return inner
```

The outer function creates a cell for `x`. The inner function references it as a free variable.

Relevant code object fields:

```python
inner = outer(10)

print(outer.__code__.co_cellvars)
print(inner.__code__.co_freevars)
```

Output shape:

```text
('x',)
('x',)
```

Bytecode may include operations such as:

```text
MAKE_CELL
LOAD_CLOSURE
LOAD_DEREF
STORE_DEREF
```

These instructions manage variables captured across nested scopes.
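The cell itself is reachable from the function object. A short sketch that inspects the captured value and confirms the inner function reads it through a dereferencing instruction:

```python
import dis

def outer(x):
    def inner():
        return x
    return inner

inner = outer(10)

# Each free variable has a matching cell in __closure__, in the same order.
cell = inner.__closure__[0]
print(inner.__code__.co_freevars, cell.cell_contents)

inner_ops = [instr.opname for instr in dis.get_instructions(inner)]
print(inner_ops)
```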

## 54.16 Classes

A class statement also compiles into executable code.

Example:

```python
class User:
    kind = "user"

    def name(self):
        return "anonymous"
```

A class body is executed like a small program. It builds a namespace dictionary, stores attributes into it, and then calls the metaclass to produce the class object.

Conceptually:

```text
create namespace
execute class body code object
call type(name, bases, namespace)
bind resulting class to name
```

`dis` can reveal that class creation is not merely declarative. It is runtime execution.
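The following sketch makes the runtime nature visible. It compiles a small module from a string (the `events` list and the `<example>` filename are illustrative), finds the class body's code object among the module's constants, and then executes the module to show that statements in the class body actually run.

```python
import types

source = '''
class User:
    events.append("inside class body")
    kind = "user"
'''

module_code = compile(source, "<example>", "exec")

# The class body is its own code object, stored as a constant of the module code.
body_names = [
    const.co_name
    for const in module_code.co_consts
    if isinstance(const, types.CodeType)
]

namespace = {"events": []}
exec(module_code, namespace)

print(body_names, namespace["events"], namespace["User"].kind)
```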

## 54.17 Imports

Import statements compile into import bytecode.

Example:

```python
def f():
    import math
    return math.sqrt(9)
```

Conceptual operations:

```text
IMPORT_NAME math
STORE_FAST math
LOAD_FAST math
LOAD_ATTR sqrt
CALL
RETURN_VALUE
```

`IMPORT_NAME` connects bytecode to the import system: `sys.modules`, `sys.meta_path`, finders, loaders, package state, and import locks.

A single `import` statement therefore crosses from bytecode execution into the runtime import machinery.
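A brief sketch connecting the two sides: the function's bytecode contains `IMPORT_NAME`, and after the function runs, the imported module is present in `sys.modules`.

```python
import dis
import sys

def f():
    import math
    return math.sqrt(9)

opnames = [instr.opname for instr in dis.get_instructions(f)]
result = f()

print("IMPORT_NAME" in opnames, result, "math" in sys.modules)
```

Repeated calls to `f` execute `IMPORT_NAME` again each time, but the import system finds the module already cached in `sys.modules`.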

## 54.18 Adaptive Bytecode and Specialization

Modern CPython includes an adaptive specializing interpreter.

The bytecode shown by `dis` can represent either baseline bytecode or, when requested, adaptive specialized bytecode.

The interpreter observes runtime behavior and may replace generic operations with specialized forms.

Example idea:

```text
generic LOAD_ATTR
    ↓ after repeated same-shape access
specialized LOAD_ATTR form
```

Specialization can optimize common cases such as:

```text
loading globals
loading attributes
binary operations
function calls
method calls
subscript operations
```

`dis` exposes the `show_caches` and `adaptive` parameters (Python 3.11 and later) to show cache information and specialized instructions.

Example:

```python
import dis

def f(obj):
    return obj.x

dis.dis(f, show_caches=True)
```

Inline cache entries appear near the instructions that use them.
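To look at specialized instructions rather than cache slots, the `adaptive` parameter can be used after the function has run enough times to warm up. This is a sketch: the warm-up count of 1000 is an arbitrary assumption, `Point` is an illustrative class, and whether any instruction is actually specialized depends on the interpreter version and build.

```python
import dis
import sys

class Point:
    def __init__(self, x):
        self.x = x

def f(obj):
    return obj.x

point = Point(1)
for _ in range(1000):              # give the interpreter a chance to specialize
    f(point)

baseline = [instr.opname for instr in dis.get_instructions(f)]

if sys.version_info >= (3, 11):
    adaptive = [instr.opname for instr in dis.get_instructions(f, adaptive=True)]
else:
    adaptive = baseline            # older versions have no adaptive view

print(baseline)
print(adaptive)
```

Comparing the two lists shows which generic instructions, if any, were replaced by specialized forms during the warm-up.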

## 54.19 Inline Caches

Inline caches store runtime feedback next to bytecode instructions.

For example, an attribute lookup cache may remember details about the object type and dictionary version. If the same lookup pattern repeats, the interpreter can avoid some generic lookup work.

Conceptually:

```text
bytecode instruction
inline cache data
next instruction
inline cache data
```

`dis` can show cache slots when requested.

These cache entries are part of CPython’s performance machinery. They are not Python language semantics.

## 54.20 `dis.Bytecode`

`dis.Bytecode` provides an object-oriented wrapper.

Example:

```python
import dis

def f(x):
    return x + 1

bc = dis.Bytecode(f)

for instr in bc:
    print(instr.opname, instr.argrepr)
```

This is useful for tools that analyze bytecode programmatically.

Typical uses:

```text
teaching bytecode
debugging compiler output
building analyzers
checking generated code
studying specialization
```

For production tooling, prefer `dis.get_instructions()` or `dis.Bytecode` over parsing text output.
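Beyond iteration, a `dis.Bytecode` object can also return the formatted listing and a summary of the code object as strings, which is convenient for logging or tests. A minimal sketch:

```python
import dis

def f(x):
    return x + 1

bc = dis.Bytecode(f)

listing = bc.dis()                 # same text dis.dis() would print, as a string
summary = bc.info()                # same text dis.show_code() would print, as a string

print(listing)
print(summary)
print(bc.codeobj is f.__code__)
```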

## 54.21 Stack Effects

`dis.stack_effect()` computes how an opcode changes stack depth.

Example:

```python
import dis

print(dis.stack_effect(dis.opmap["LOAD_CONST"], 0))
```

A stack effect describes:

```text
values pushed - values popped
```

For example:

| Instruction | Conceptual stack effect |
|---|---:|
| `LOAD_CONST` | `+1` |
| `STORE_FAST` | `-1` |
| `BINARY_OP` | `-1` |
| `RETURN_VALUE` | `-1` |

`BINARY_OP` pops two values and pushes one result, giving a net effect of `-1`.

Stack effect analysis is used by the compiler and by bytecode tools to reason about maximum stack size.
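For straight-line code, a running sum of stack effects gives the stack depth after each instruction. The sketch below assumes a function with no branches, since real analyzers must follow every control flow path, and compares the maximum it finds with `co_stacksize`.

```python
import dis

def f(x):
    return x + 1

depth = 0
max_depth = 0

for instr in dis.get_instructions(f):
    # instr.arg is None for opcodes without an argument, which stack_effect accepts.
    depth += dis.stack_effect(instr.opcode, instr.arg)
    max_depth = max(max_depth, depth)
    print(f"{instr.opname:<20} depth={depth}")

print("max depth:", max_depth, "co_stacksize:", f.__code__.co_stacksize)
```

The depth returns to zero after `RETURN_VALUE`, and the observed maximum should not exceed the `co_stacksize` the compiler reserved.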

## 54.22 Raw Bytecode

`co_code` stores raw bytecode bytes.

Example:

```python
def f(x):
    return x + 1

print(f.__code__.co_code)
```

Do not assume raw bytecode format stability across Python versions.

Use `dis` for decoding:

```python
for instr in dis.get_instructions(f):
    print(instr.offset, instr.opname, instr.argrepr)
```

CPython’s bytecode is an implementation detail. It is appropriate for debugging, teaching, profiling, and CPython-specific tools, but not for long-term portable file formats.

## 54.23 Version Sensitivity

Bytecode changes frequently.

Between Python releases, CPython may change:

```text
opcode names
opcode meanings
instruction encoding
jump offsets
exception metadata
call protocol instructions
cache layout
specialized instructions
line number tables
```

Therefore, bytecode tools should check the Python version explicitly.

Example:

```python
import sys

print(sys.version_info)
```

A robust bytecode analyzer usually has version-specific logic.
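Alongside version checks, a tool can probe `dis.opmap` for the opcodes it cares about. The helper names `has_opcode` and `call_instruction_names` below are hypothetical, and the candidate list is only a sample of call-related names that have existed in some CPython release.

```python
import dis
import sys

def has_opcode(name):
    """Return True if the running interpreter defines an opcode with this name."""
    return name in dis.opmap

def call_instruction_names():
    # Sample of call opcode names from different releases; not exhaustive.
    candidates = ("CALL", "CALL_FUNCTION", "PRECALL")
    return [name for name in candidates if has_opcode(name)]

print(sys.version_info[:2], call_instruction_names())
```

Feature detection of this kind complements, rather than replaces, explicit version-specific logic, because an opcode can keep its name while changing its meaning.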

## 54.24 Why `dis` Matters for CPython Internals

`dis` matters because it gives a direct view of the boundary between compiler and interpreter.

It helps answer questions such as:

```text
Did the compiler fold this constant?
Does this variable use fast locals or globals?
How does this loop become bytecode?
Where are jumps placed?
Does this comprehension create a nested code object?
What call protocol does this version use?
Where are inline caches attached?
What exception table did the compiler generate?
```

This makes `dis` one of the most useful learning tools for CPython.

## 54.25 Chapter Summary

The `dis` module is CPython’s bytecode inspection interface. It decodes code objects into readable instruction streams, exposing how source code is lowered into operations for the stack-based virtual machine.

Understanding `dis` connects several core internals: code objects, frames, the value stack, name resolution, function calls, exception handling, closures, comprehensions, imports, inline caches, and the specializing interpreter.
