# 21. Symbol Tables

# 21. Symbol Tables

After parsing produces an AST, CPython performs scope analysis.

This stage builds symbol tables.

A symbol table records how names behave inside each scope. It determines whether a name is local, global, free, cell, parameter, imported, annotated, or referenced from nested scopes.

For this source:

```python
x = 10

def outer():
    y = 20

    def inner():
        return x + y

    return inner
```

the parser only knows that names appear in expressions and assignments.

The symbol table phase determines:

```text
x       global from inner
y       free variable in inner
y       cell variable in outer
inner   local variable in outer
outer   global variable in module
```

This analysis is essential because bytecode generation depends on it. Loading a local variable uses different bytecode from loading a global or closure variable.

## 21.1 Position in the Compilation Pipeline

The symbol table phase sits between AST construction and bytecode generation.

```text
source
    ↓
tokenization
    ↓
parsing
    ↓
AST
    ↓
symbol table analysis
    ↓
compiler
    ↓
code object
    ↓
bytecode execution
```

The parser builds syntax structure.

The symbol table phase builds scope structure.

The compiler then uses both.

## 21.2 What a Symbol Table Contains

Each scope has a symbol table.

A symbol table stores information such as:

```text
which names exist
where names are assigned
where names are read
whether names are parameters
whether names are imported
whether names are global
whether names are nonlocal
whether names escape into nested scopes
whether names require closure cells
```

Conceptually:

```text
Scope: function outer

Symbols:
    y       local + cell
    inner   local
```

```text
Scope: function inner

Symbols:
    x       global
    y       free
```

The symbol table phase does not execute code. It performs static scope classification.

## 21.3 Python Uses Lexical Scoping

Python uses lexical scoping.

A nested function can access names from enclosing scopes.

Example:

```python
def outer():
    x = 10

    def inner():
        return x

    return inner
```

`inner` can access `x` because `x` exists in the enclosing lexical environment.

The symbol table phase determines this relationship before runtime execution.

Without lexical scope analysis, CPython would not know:

```text
where to look for a name
whether to allocate closure storage
whether a name belongs in locals
whether a variable escapes into nested functions
```

## 21.4 Scopes in Python

Python has several scope types.

| Scope               | Example           |
| ------------------- | ----------------- |
| Module scope        | top-level file    |
| Function scope      | `def f():`        |
| Class scope         | `class C:`        |
| Comprehension scope | `[x for x in xs]` |
| Generator scope     | `(x for x in xs)` |
| Lambda scope        | `lambda x: x + 1` |

Each scope has independent symbol analysis.

Example:

```python
x = 1

def f():
    x = 2
    return x
```

The module and function each have their own `x`.

The symbol table phase separates them.

## 21.5 Module Scope

Top-level code executes in module scope.

Example:

```python
x = 1
y = 2

def add():
    return x + y
```

At module scope:

```text
x       global/module-local
y       global/module-local
add     global/module-local
```

Inside `add`:

```text
x       global
y       global
```

At runtime, globals are stored in the module dictionary.

The symbol table phase determines that references inside `add` should use global lookup bytecode rather than local lookup bytecode.

## 21.6 Function Scope

Function bodies create local scopes.

Example:

```python
def f(a, b):
    c = a + b
    return c
```

Inside `f`:

```text
a       parameter + local
b       parameter + local
c       local
```

Parameters are treated as initialized local variables.

The compiler later allocates slots for locals inside the frame object.

The symbol table phase determines:

```text
how many local variables exist
which names are parameters
which bytecode instructions to emit
```

## 21.7 Local Variables

A name assigned anywhere in a function becomes local unless declared otherwise.

Example:

```python
x = 10

def f():
    print(x)
    x = 20
```

This raises:

```text
UnboundLocalError
```

Why?

Because the symbol table phase sees assignment to `x` inside `f`:

```python
x = 20
```

That means:

```text
x is local to f
```

So earlier:

```python
print(x)
```

tries to read the local `x` before assignment.

This behavior comes from static scope analysis, not runtime guessing.

## 21.8 Global Declarations

`global` changes scope classification.

Example:

```python
x = 10

def f():
    global x
    x = 20
```

Now the symbol table records:

```text
x       global
```

inside `f`.

The assignment updates the module-level variable.

Without `global`, assignment would create a local variable.

The symbol table phase must process `global` declarations before compiling name accesses.

## 21.9 Nonlocal Declarations

`nonlocal` refers to enclosing function scope variables.

Example:

```python
def outer():
    x = 10

    def inner():
        nonlocal x
        x += 1

    inner()
    return x
```

Inside `inner`:

```text
x       free variable
```

Inside `outer`:

```text
x       cell variable
```

A cell variable is a local variable captured by nested scopes.

Without `nonlocal`, assignment inside `inner` would create a new local variable.

`nonlocal` tells the symbol table:

```text
do not create a local binding
use enclosing function binding
```

## 21.10 Free Variables

A free variable is a name used inside a scope but defined in an enclosing function scope.

Example:

```python
def outer():
    x = 10

    def inner():
        return x
```

Inside `inner`:

```text
x       free variable
```

`x` is not local to `inner`.

It comes from `outer`.

Free variables require closure support.

## 21.11 Cell Variables

A cell variable is a local variable referenced by nested scopes.

Example:

```python
def outer():
    x = 10

    def inner():
        return x
```

Inside `outer`:

```text
x       local + cell
```

Why?

Because `x` must survive after `outer` returns.

The nested function still needs access to it.

CPython stores captured variables in heap-allocated closure cells rather than ordinary stack-only locals.

The symbol table phase determines which locals need cell storage.

## 21.12 Closures

Closures depend directly on symbol table analysis.

Example:

```python
def make_counter():
    count = 0

    def inc():
        nonlocal count
        count += 1
        return count

    return inc
```

The symbol table determines:

```text
count in make_counter:
    local + cell

count in inc:
    free
```

The compiler then generates closure machinery.

At runtime:

```python
c = make_counter()
```

returns a function carrying access to the captured `count` cell.

Without symbol table analysis, closure construction would be impossible.

## 21.13 Name Resolution Categories

CPython broadly classifies names into categories.

| Category        | Meaning                          |
| --------------- | -------------------------------- |
| Local           | Defined in current scope         |
| Global explicit | Declared with `global`           |
| Global implicit | Resolved at module/builtin level |
| Free            | Comes from enclosing scope       |
| Cell            | Local captured by nested scope   |

This classification controls bytecode generation.

Example bytecode categories:

| Category    | Bytecode family |
| ----------- | --------------- |
| Local       | `LOAD_FAST`     |
| Global      | `LOAD_GLOBAL`   |
| Free/cell   | `LOAD_DEREF`    |
| Name lookup | `LOAD_NAME`     |

Different scope decisions produce different runtime behavior and performance.

## 21.14 Bytecode Depends on Scope Analysis

Example:

```python
def f(a):
    return a
```

Bytecode:

```text
LOAD_FAST a
RETURN_VALUE
```

`a` is local.

Now:

```python
x = 1

def f():
    return x
```

Bytecode:

```text
LOAD_GLOBAL x
RETURN_VALUE
```

Now closure case:

```python
def outer():
    x = 1

    def inner():
        return x
```

Inside `inner`:

```text
LOAD_DEREF x
RETURN_VALUE
```

The parser alone cannot determine these instructions.

The symbol table phase provides the required scope information.

## 21.15 Class Scope

Class bodies have special scope behavior.

Example:

```python
x = 1

class C:
    y = x
```

Inside the class body:

```text
y       local to class namespace
x       global
```

Class bodies execute in their own namespace dictionary.

But methods do not automatically capture class-scope variables lexically.

Example:

```python
class C:
    x = 1

    def f(self):
        return x
```

This does not access `C.x`.

The name `x` inside `f` is treated as global unless otherwise defined.

Class scope differs from function scope in important ways.

## 21.16 Comprehension Scopes

Modern Python comprehensions create their own scope.

Example:

```python
x = 100

values = [x for x in range(3)]

print(x)
```

Output:

```text
100
```

The comprehension variable does not leak into outer scope.

Internally:

```text
comprehension creates nested scope
x inside comprehension is local there
outer x remains unchanged
```

Earlier Python versions behaved differently. Modern CPython uses isolated comprehension scopes.

## 21.17 Generator Expression Scopes

Generator expressions also create nested scopes.

Example:

```python
x = 10

g = (x for x in range(3))

print(x)
```

Outer `x` remains unchanged.

The generator expression behaves similarly to an implicit nested function.

The symbol table phase creates separate symbol information for generator scopes.

## 21.18 Lambda Scopes

Lambda expressions create function scopes.

Example:

```python
x = 10

f = lambda y: x + y
```

Inside the lambda:

```text
y       local parameter
x       free variable
```

Lambdas are anonymous functions, but symbol table analysis treats them similarly to ordinary nested functions.

## 21.19 Imports and Symbol Tables

Imports also affect symbol tables.

Example:

```python
import os
from math import sin
```

Module scope symbols:

```text
os      imported local/global
sin     imported local/global
```

Inside functions:

```python
def f():
    import json
```

`json` becomes local to `f`.

Imports are assignments from the perspective of scope analysis.

## 21.20 Annotations

Variable annotations participate in symbol handling.

Example:

```python
x: int = 1
```

The symbol table records:

```text
x       assigned
```

Function annotations also appear:

```python
def f(x: int) -> str:
    return str(x)
```

The symbol table phase tracks annotation-related scope usage, especially because annotations may reference names.

## 21.21 Exception Handler Variables

Exception handler names create local bindings.

Example:

```python
try:
    run()
except ValueError as e:
    print(e)
```

Inside the exception block:

```text
e       local
```

CPython later clears exception variables after the handler to avoid reference cycles involving tracebacks.

The symbol table phase identifies the binding itself. Cleanup behavior occurs later.

## 21.22 Pattern Matching Bindings

Pattern matching introduces binding rules.

Example:

```python
match value:
    case [x, y]:
        print(x, y)
```

Pattern variables become local bindings.

The symbol table phase must distinguish:

```text
pattern capture
ordinary expression
```

because pattern syntax has different semantics.

## 21.23 AST Traversal During Symbol Analysis

The symbol table phase walks the AST.

Simplified model:

```text
visit Module
    create module scope

visit FunctionDef
    register function name
    create child scope
    visit parameters
    visit body

visit Assign
    mark targets as assigned
    visit expression

visit Name
    mark as load/store/delete

visit Global
    record explicit global declaration

visit Nonlocal
    record nonlocal declaration
```

The traversal collects information first, then resolves scope relationships.

## 21.24 Two-Pass Nature of Scope Analysis

Scope analysis is not purely local.

Example:

```python
def f():
    print(x)
    x = 1
```

The meaning of the first `x` depends on later assignment.

Therefore CPython cannot classify names in a single left-to-right pass.

The symbol table phase effectively gathers scope information across the entire block before final classification.

## 21.25 Nested Scope Trees

Scopes form a tree.

Example:

```python
x = 0

def outer():
    y = 1

    def inner():
        z = 2
```

Scope tree:

```text
Module
    outer
        inner
```

Each scope stores links to child scopes.

Free variable resolution walks outward through enclosing scopes.

## 21.26 Builtins

If a name is neither local nor global module-defined, runtime lookup may fall back to builtins.

Example:

```python
print(len([1, 2, 3]))
```

Inside module scope:

```text
print     implicit global/builtin
len       implicit global/builtin
```

The symbol table phase does not resolve actual builtin objects. It classifies lookup style.

At runtime, global lookup checks:

```text
module globals
then builtins
```

## 21.27 The `symtable` Module

Python exposes symbol tables through the `symtable` module.

Example:

```python
import symtable

src = """
x = 1

def outer():
    y = 2

    def inner():
        return x + y
"""

table = symtable.symtable(src, "<input>", "exec")

print(table.get_identifiers())
```

Nested symbol tables can be inspected programmatically.

This module is useful for:

```text
linters
static analyzers
teaching scope behavior
compiler experiments
```

## 21.28 Symbol Table Data Structures in CPython

Important source areas include:

```text
Python/symtable.c
Include/internal/
Python/compile.c
```

The symbol table machinery stores:

```text
scope type
symbol flags
child scopes
free variables
cell variables
parameter information
optimization flags
```

Symbol flags describe behavior such as:

```text
assigned
used
parameter
global
nonlocal
free
cell
imported
annotated
```

The compiler later consumes this metadata.

## 21.29 Optimization and Locals

Scope analysis enables fast local variable access.

Local variables use indexed slots inside frames.

Example:

```python
def f(a, b):
    c = a + b
    return c
```

Locals can be compiled into array-style accesses:

```text
locals[0] -> a
locals[1] -> b
locals[2] -> c
```

This is much faster than dictionary lookup.

Without symbol table analysis, CPython could not precompute local layouts.

## 21.30 Minimal Mental Model

Use this model:

```text
The parser builds syntax structure.
The symbol table phase builds scope structure.
Each module, function, class, lambda, and comprehension creates a scope.
Names are classified as local, global, free, or cell variables.
Nested functions create closure relationships.
The compiler uses symbol table information to generate correct bytecode.
Fast locals, globals, and closures all depend on symbol analysis.
```

The symbol table phase is where CPython turns syntax into executable scope semantics.

