# gopy compile pipeline

# 1620. Compile pipeline

## What we are porting

Twelve files from `cpython/Python/` (~39k lines) form the compiler that
turns an AST into a code object:

| C source                     | Lines | Go target                          |
|------------------------------|-------|------------------------------------|
| `asdl.c`                     |     6 | `ast/asdl.go`                      |
| `Python-ast.c` (generated)   | 18485 | `ast/nodes_gen.go`                 |
| `ast.c`                      |  1091 | `ast/validate.go`                  |
| `ast_preprocess.c`           |   990 | `ast/preprocess.go`                |
| `ast_unparse.c`              |  1029 | `ast/unparse.go`                   |
| `future.c`                   |   119 | `future/future.go`                 |
| `symtable.c`                 |  3266 | `symtable/symtable.go`             |
| `instruction_sequence.c`     |   483 | `compile/instrseq.go`              |
| `codegen.c`                  |  6483 | `compile/codegen.go`               |
| `flowgraph.c`                |  4165 | `compile/flowgraph.go`             |
| `assemble.c`                 |   802 | `compile/assemble.go`              |
| `compile.c`                  |  1753 | `compile/compiler.go`              |

The 1620 series runs after the parser hands us an AST and before the VM
runs the bytecode. It is the longest spec series in the project and the
one with the strongest source-shape parity requirement: the `dis.dis`
output of a compiled module must match CPython 3.14 byte-for-byte.

The parser itself (PEG, in `cpython/Parser/`) is a separate spec series.
v0.5 assumes `parser.Parse(src) (ast.Mod, error)` already exists.

## Layering

```
   parser.Parse  (separate spec)
         │
         ▼
   ast.Mod  (asdl-typed tree)
         │
   ast.Validate, ast.Preprocess
         │
         ▼
   future.FromAST   ───►  future flags bitmask
         │
         ▼
   symtable.Build   ───►  per-scope symbol tables
         │
         ▼
   compile.Compile
   ├─ codegen        (per-scope instruction sequence)
   ├─ flowgraph      (CFG, peephole optimizations)
   └─ assemble       (bytecode + line table + exception table)
         │
         ▼
   *objects.Code
```

Each stage is independently testable. The gate test at the v0.5
boundary parses `a = 1 + 2` with `parser.Parse`, passes the resulting
module to `compile.Compile`, and asserts the disassembly matches
CPython.

## ast package

### asdl.go

`asdl.c` is six lines: the macro-expanded `_Py_asdl_*_seq_new`
constructors for generic, identifier, and int sequences. In Go this is
just a generic slice type:

```go
package ast

// Seq is the asdl_seq* equivalent. CPython stores sequence length and
// elements inline; we use a Go slice.
type Seq[T any] []T
```

No dedicated allocator. The arena lives in `arena/` (v0.1).
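A minimal sketch of the helper surface around `Seq[T]`. The names `NewSeq`, `Len`, `Get`, and `Set` come from the checklist further down; the signatures shown here are assumptions, not the shipped API.

```go
package main

import "fmt"

// Seq mirrors asdl_seq* with a plain Go slice.
type Seq[T any] []T

// NewSeq returns a sequence of n zero-valued elements (assumed signature).
func NewSeq[T any](n int) Seq[T] { return make(Seq[T], n) }

// Len works on a nil Seq too, matching asdl_seq_LEN on a NULL pointer.
func (s Seq[T]) Len() int { return len(s) }

// Get and Set index directly; out-of-range access panics like any slice.
func (s Seq[T]) Get(i int) T    { return s[i] }
func (s Seq[T]) Set(i int, v T) { s[i] = v }

func main() {
	ids := NewSeq[string](2)
	ids.Set(0, "a")
	ids.Set(1, "b")
	var empty Seq[int] // nil sequence
	fmt.Println(ids.Len(), ids.Get(1), empty.Len())
}
```

Because `Seq[T]` is a slice, `range` and `append` work on it directly; the methods exist mainly so ported code can keep the shape of the C macros.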

### nodes_gen.go

`Python-ast.c` is generated by `Parser/asdl_c.py` from
`Parser/Python.asdl`. We do not port the C generator; we write a Go
generator (`tools/asdl_go`) that consumes the same `.asdl` file and
emits Go structs. Each AST node is a struct embedding a position. Sum
types (mod, stmt, expr, expr_context, etc.) become interfaces with a
sealed marker.

Examples:

```go
type Pos struct{ Lineno, ColOffset, EndLineno, EndColOffset int }

type Mod interface{ isMod() }
type Module struct {
    Pos
    Body        Seq[Stmt]
    TypeIgnores Seq[TypeIgnore]
}
func (*Module) isMod() {}

type Stmt interface{ isStmt() }
type Assign struct {
    Pos
    Targets   Seq[Expr]
    Value     Expr
    TypeComment string
}
func (*Assign) isStmt() {}

type ExprContext int
const (
    Load ExprContext = iota + 1
    Store
    Del
)
```

Constructors mirror `_PyAST_*`: `ast.NewModule(body, typeIgnores, pos)`,
`ast.NewAssign(targets, value, typeComment, pos)`, etc.

The full enumeration of mod/stmt/expr/pattern/type_param/excepthandler
sums plus the product types (arguments, arg, keyword, alias, withitem,
match_case, type_ignore) lives in the generator output. Enumerating
them here would duplicate the asdl source; treat `Parser/Python.asdl`
as authoritative.
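Consumers of the sealed sums typically dispatch with a type switch. The sketch below uses two cut-down statement nodes (`Pass` and a simplified `Return` with a hypothetical `HasValue` field) rather than the generated structs, purely to show the pattern.

```go
package main

import "fmt"

// Stmt is a sealed sum: only types in this package can implement isStmt.
type Stmt interface{ isStmt() }

// Simplified stand-ins for generated nodes (fields are illustrative).
type Pass struct{}
type Return struct{ HasValue bool }

func (*Pass) isStmt()   {}
func (*Return) isStmt() {}

// describe dispatches on the concrete node type.
func describe(s Stmt) string {
	switch n := s.(type) {
	case *Pass:
		return "pass"
	case *Return:
		if n.HasValue {
			return "return <value>"
		}
		return "return"
	default:
		return "unknown"
	}
}

func main() {
	for _, s := range []Stmt{&Pass{}, &Return{HasValue: true}} {
		fmt.Println(describe(s))
	}
}
```

The unexported marker method keeps external packages from adding new variants, so a `default` branch only fires when the generator output and a visitor have drifted apart.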

### validate.go (`_PyAST_Validate`)

Pure tree walk that rejects malformed nodes:

* Position sanity. `lineno <= end_lineno`. Column ranges within the
  surrounding line. No negative offsets except in the canonical "no
  position" marker (`-1, -1, -1, -1`).
* Forbidden identifiers. `None`, `True`, `False` cannot appear as
  `Name`, `arg.arg`, `keyword.arg`, `alias.name` (except as the
  module name in `from None import` which is itself rejected
  elsewhere).
* `Constant.value` constrained to the marshallable subset (int, float,
  complex, str, bytes, bool, None, Ellipsis, tuple, frozenset of the
  same).
* Comprehensions have at least one generator.
* `expr_context` consistency. Targets in `Store`, deletions in `Del`,
  reads in `Load`. `Starred` in `Load` only as a top-level Call arg or
  as an iterable element.
* `Match` patterns shape: `MatchClass.cls` is a Name or Attribute,
  `MatchMapping` keys are constants or attribute lookups, etc.
* Type parameter constraints (PEP 695): `TypeVar.bound` cannot be a
  starred expression.

Returns `error` (CPython sets PyErr; we return).
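The position-sanity rule can be sketched as a standalone check. The function name `validatePos` and the error wording below are placeholders for illustration; the real port must reuse CPython's exact messages.

```go
package main

import (
	"errors"
	"fmt"
)

type Pos struct{ Lineno, ColOffset, EndLineno, EndColOffset int }

// NoPos is the canonical "no position" marker.
var NoPos = Pos{-1, -1, -1, -1}

// validatePos rejects inverted ranges and stray negative offsets.
// Error strings here are illustrative, not CPython's.
func validatePos(p Pos) error {
	if p == NoPos {
		return nil
	}
	if p.Lineno < 0 || p.ColOffset < 0 || p.EndLineno < 0 || p.EndColOffset < 0 {
		return errors.New("negative offset outside the no-position marker")
	}
	if p.Lineno > p.EndLineno {
		return fmt.Errorf("line range (%d, %d) is not valid", p.Lineno, p.EndLineno)
	}
	if p.Lineno == p.EndLineno && p.ColOffset > p.EndColOffset {
		return fmt.Errorf("line %d, column %d-%d is not a valid range",
			p.Lineno, p.ColOffset, p.EndColOffset)
	}
	return nil
}

func main() {
	fmt.Println(validatePos(Pos{1, 0, 1, 5}))
	fmt.Println(validatePos(Pos{3, 0, 2, 0}))
}
```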

### preprocess.go (`_PyAST_Preprocess`)

Three passes fused into one walk that mirrors `astfold_*` in
`ast_preprocess.c` for CPython 3.14 (note: arithmetic constant folding
on `BinOp` / `UnaryOp` / `Compare` / `BoolOp` of constants moved to
`flowgraph.c` LOAD_CONST chain folding in 3.14; do NOT fold those here):

1. Targeted constant folding. The 3.14 surface is small:
   * `string % tuple` printf-style format collapses to a `JoinedStr`
     when the format spec is supported (`%s`, `%r`, `%a` with optional
     width/precision) and the tuple has no `Starred`.
   * `Name("__debug__")` in `Load` context becomes
     `Constant(!optimize)`.
   * `MatchValue` / `MatchMapping` keys allow `-N` and `Const ± Const`
     so user code like `case -1` or `case 1+2j` still works after
     folding.
   * `__debug__` and the format-fold respect `syntax_check_only` (no
     mutation when only running validation).
2. PEP 765 finally checks. Walk every `Try` node and warn when a
   `finally` block contains `return`, `break`, or `continue`. Emitted
   as a `SyntaxWarning` via the v0.7 warnings module; v0.5 records the
   warnings on the compiler context.
3. Body cleanup.
   * Under `optimize >= 2`, the leading docstring is dropped (replaced
     with `Pass` if it was the only statement).
   * If a fold produced a string-typed expression at body position 0,
     wrap it in a single-value `JoinedStr` so the docstring detector
     does not re-treat it as a docstring.

Annotation expressions under `CO_FUTURE_ANNOTATIONS` (PEP 563) are
skipped: their text is captured separately by the unparser, so the
constant fold has no reason to descend into them.
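The PEP 765 pass can be sketched on a toy statement tree. The node types and the `ctx` struct below are hypothetical stand-ins, and the sketch follows the PEP's rule that `break` / `continue` only matter when they leave the `finally` block, and that a nested function body resets the context.

```go
package main

import "fmt"

type Stmt any

// Toy nodes; the real walker runs over the generated ast package.
type (
	Return   struct{}
	Break    struct{}
	Continue struct{}
	While    struct{ Body []Stmt }
	FuncDef  struct{ Body []Stmt }
	Try      struct{ Body, Finally []Stmt }
)

// ctx records whether we are inside a finally block and whether a loop
// has been entered since that finally block began.
type ctx struct{ inFinally, inLoop bool }

func check(body []Stmt, c ctx, warn *[]string) {
	for _, s := range body {
		switch n := s.(type) {
		case Return:
			if c.inFinally {
				*warn = append(*warn, "'return' in a 'finally' block")
			}
		case Break:
			if c.inFinally && !c.inLoop {
				*warn = append(*warn, "'break' in a 'finally' block")
			}
		case Continue:
			if c.inFinally && !c.inLoop {
				*warn = append(*warn, "'continue' in a 'finally' block")
			}
		case While:
			check(n.Body, ctx{inFinally: c.inFinally, inLoop: true}, warn)
		case FuncDef:
			check(n.Body, ctx{}, warn) // new function: context resets
		case Try:
			check(n.Body, c, warn)
			check(n.Finally, ctx{inFinally: true}, warn)
		}
	}
}

func main() {
	body := []Stmt{Try{Finally: []Stmt{Return{}, While{Body: []Stmt{Break{}}}}}}
	var warnings []string
	check(body, ctx{}, &warnings)
	fmt.Println(warnings)
}
```

Collecting warnings into a slice matches the v0.5 plan of recording them on the compiler context instead of emitting them immediately.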

### unparse.go (`_PyAST_Unparse`)

Produces a Python source representation of an expression. Used for
stringized annotations under `from __future__ import annotations`
(PEP 563): a `def f(x: T)` stores the unparsed annotation source
string in the `__annotations__` dict.
Operator precedence drives parenthesization. Float infinity renders
as `1e309` per CPython. Format strings, interpolations, template
strings round-trip back to f-string / t-string syntax.

This is a literal port; the precedence table and per-node visitor
order match `ast_unparse.c` 1:1.
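A minimal sketch of precedence-driven parenthesization, assuming a two-level table and left-associative binary operators only. The `Num` / `BinOp` types and the `prec` map are illustrative; the real table comes from `ast_unparse.c`.

```go
package main

import (
	"fmt"
	"strconv"
)

type Expr any

type Num struct{ V int }
type BinOp struct {
	Op   string
	L, R Expr
}

// prec is a cut-down precedence table (higher binds tighter).
var prec = map[string]int{"+": 1, "-": 1, "*": 2, "/": 2}

// unparse renders e, adding parentheses when e binds looser than the
// context it sits in. parent is the minimum precedence required.
func unparse(e Expr, parent int) string {
	switch n := e.(type) {
	case Num:
		return strconv.Itoa(n.V)
	case BinOp:
		p := prec[n.Op]
		// Left-associative: the right operand needs strictly higher
		// precedence, so it is rendered with p+1.
		s := unparse(n.L, p) + " " + n.Op + " " + unparse(n.R, p+1)
		if p < parent {
			return "(" + s + ")"
		}
		return s
	}
	return "?"
}

func main() {
	e := BinOp{"*", BinOp{"+", Num{1}, Num{2}}, Num{3}}
	fmt.Println(unparse(e, 0))
}
```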

## future package

`future.go` ports `future.c`. Walks the module body past the docstring
collecting `from __future__ import x` statements. Produces:

```go
package future

type Features struct {
    Bits     uint32
    Location ast.Pos
}

// Values match the CO_FUTURE_* macros in Include/cpython/code.h.
// `nested_scopes` and `generators` are accepted but set no bit.
const (
    Division        = 0x20000   // CO_FUTURE_DIVISION
    AbsoluteImport  = 0x40000   // CO_FUTURE_ABSOLUTE_IMPORT
    WithStatement   = 0x80000   // CO_FUTURE_WITH_STATEMENT
    PrintFunction   = 0x100000  // CO_FUTURE_PRINT_FUNCTION
    UnicodeLiterals = 0x200000  // CO_FUTURE_UNICODE_LITERALS
    BarryAsBdfl     = 0x400000  // CO_FUTURE_BARRY_AS_BDFL
    GeneratorStop   = 0x800000  // CO_FUTURE_GENERATOR_STOP (legacy, ignored)
    Annotations     = 0x1000000 // CO_FUTURE_ANNOTATIONS (PEP 563)
)

func FromAST(mod ast.Mod) (*Features, error)
```

Errors mirror CPython: `from __future__ import braces` raises
`SyntaxError` with the exact same string. Future imports after the
first non-docstring/non-future statement raise SyntaxError. Unknown
feature names raise SyntaxError with the unknown name interpolated.
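A sketch of the scan over a toy module body. The `Stmt` encoding with a string `Kind` is hypothetical, the function returns a plain `error` rather than a positioned `SyntaxError`, and it simply stops at the first non-future statement; detection of late future imports is not shown.

```go
package main

import (
	"errors"
	"fmt"
)

// Stmt is a toy encoding: Kind is "docstring", "future", or "other".
type Stmt struct {
	Kind  string
	Names []string // feature names when Kind == "future"
}

// Bit values as defined in CPython's Include/cpython/code.h.
const (
	flagBarry       = 0x400000  // CO_FUTURE_BARRY_AS_BDFL
	flagAnnotations = 0x1000000 // CO_FUTURE_ANNOTATIONS
)

func fromBody(body []Stmt) (uint32, error) {
	var bits uint32
	for i, s := range body {
		if i == 0 && s.Kind == "docstring" {
			continue // leading docstring is skipped
		}
		if s.Kind != "future" {
			break // stop at the first non-future statement
		}
		for _, name := range s.Names {
			switch name {
			case "annotations":
				bits |= flagAnnotations
			case "barry_as_FLUFL":
				bits |= flagBarry
			case "nested_scopes", "generators", "division", "absolute_import",
				"with_statement", "print_function", "unicode_literals",
				"generator_stop":
				// Accepted for compatibility; no flag is set.
			case "braces":
				return 0, errors.New("not a chance")
			default:
				return 0, fmt.Errorf("future feature %.100s is not defined", name)
			}
		}
	}
	return bits, nil
}

func main() {
	bits, err := fromBody([]Stmt{
		{Kind: "docstring"},
		{Kind: "future", Names: []string{"annotations"}},
		{Kind: "other"},
	})
	fmt.Printf("%#x %v\n", bits, err)
}
```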

## symtable package

Port of `symtable.c`. `_PySymtable_Build` becomes `symtable.Build`.

```go
package symtable

type Block int
const (
    FunctionBlock Block = iota // order matches _Py_block_ty
    ClassBlock
    ModuleBlock
    AnnotationBlock
    TypeAliasBlock
    TypeParametersBlock
    TypeVariableBlock
)

type Scope int
const (
    Local Scope = iota + 1
    GlobalExplicit
    GlobalImplicit
    Free
    Cell
)

// SymbolFlags packs the DEF_* and USE bits plus the resolved Scope.
type SymbolFlags uint32

const (
    DefGlobal SymbolFlags = 1 << iota
    DefLocal
    DefParam
    DefNonlocal
    Use
    _ // 0x20 is unused in pycore_symtable.h; keeps DefFreeClass at 2<<5
    DefFreeClass
    DefImport
    DefAnnot
    DefCompIter
    DefTypeParam
    DefCompCell
)

type Entry struct {
    Type        Block
    Name        string
    Symbols     map[string]SymbolFlags
    Children    []*Entry
    Varnames    []string

    // Per-block attributes mirroring ste_*.
    Nested            bool
    Generator         bool
    Coroutine         bool
    Comprehension     bool
    Varargs           bool
    Varkeywords       bool
    ReturnsValue      bool
    NeedsClassClosure bool
    NeedsClassDict    bool

    Loc ast.Pos
}

type Table struct {
    Top    *Entry
    Blocks map[any]*Entry // keyed by AST node pointer; no ast.Node interface exists
    Future *future.Features
}

func Build(mod ast.Mod, filename string, ff *future.Features) (*Table, error)
```

Two-phase algorithm:

1. **Block visit**. Walk the AST building a tree of `Entry`. Each
   function/class/comprehension/lambda creates a child block. Every
   name use or definition records a flag bit on the enclosing block.
2. **Analysis**. Bottom-up over the block tree, resolve each name to
   one of `Local`, `GlobalExplicit`, `GlobalImplicit`, `Free`, `Cell`.
   Free-vars in nested scopes mark their defining scope's slot as
   `Cell`. Class scopes mediate: they can contribute a `Cell` for
   their methods (for example `__class__`), but names bound in the
   class body are skipped when a method's free names are resolved.

Error parity is non-negotiable: the exact `SyntaxError` strings ("name
'x' is used prior to global declaration", "no binding for nonlocal
'x' found", "name 'x' is assigned to before global declaration", etc.)
must match.
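The analysis phase can be illustrated with a much smaller model. This toy version resolves each use by walking outward through enclosing blocks instead of propagating free-variable sets the way `symtable.c` does, ignores `global` / `nonlocal` directives, and does not mark pass-through scopes as `Free`. The `Block` struct is a hypothetical stand-in for `Entry`.

```go
package main

import "fmt"

type Scope int

const (
	Local Scope = iota + 1
	GlobalExplicit
	GlobalImplicit
	Free
	Cell
)

// Block is a simplified scope node (stand-in for symtable.Entry).
type Block struct {
	Name     string
	IsClass  bool
	Parent   *Block
	Children []*Block
	Defs     map[string]bool
	Uses     []string
	Resolved map[string]Scope
}

// analyze resolves uses parent-first so ancestors already have a
// Resolved map when a child upgrades one of their names to Cell.
func analyze(b *Block) {
	if b.Resolved == nil {
		b.Resolved = map[string]Scope{}
	}
	for _, name := range b.Uses {
		if b.Defs[name] {
			if b.Resolved[name] != Cell {
				b.Resolved[name] = Local
			}
			continue
		}
		// Skip class bodies and the module block; stop at the first
		// enclosing function that binds the name.
		owner := b.Parent
		for owner != nil && (owner.IsClass || owner.Parent == nil || !owner.Defs[name]) {
			owner = owner.Parent
		}
		if owner == nil {
			b.Resolved[name] = GlobalImplicit
			continue
		}
		b.Resolved[name] = Free
		owner.Resolved[name] = Cell
	}
	for _, c := range b.Children {
		analyze(c)
	}
}

func main() {
	mod := &Block{Name: "<module>", Defs: map[string]bool{"y": true}}
	outer := &Block{Name: "outer", Parent: mod, Defs: map[string]bool{"x": true}}
	inner := &Block{Name: "inner", Parent: outer, Uses: []string{"x", "y"}}
	mod.Children = []*Block{outer}
	outer.Children = []*Block{inner}
	analyze(mod)
	fmt.Println(inner.Resolved["x"], outer.Resolved["x"], inner.Resolved["y"])
}
```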

## compile package

### instrseq.go

Port of `instruction_sequence.c`. The pre-CFG instruction stream:

```go
package compile

type Label int // -1 means unbound.

type Instr struct {
    Op    Opcode
    Arg   int32
    Loc   ast.Pos
    // Exception handler info: index into the exc-handler stack at
    // emit time, set during codegen, finalized during assemble.
    ExcDepth int32
}

type Sequence struct {
    Instrs   []Instr
    LabelMap []int // label id to instr index, len == NewLabel calls.
    Nested   []*Sequence
}

func (s *Sequence) NewLabel() Label
func (s *Sequence) UseLabel(l Label)
func (s *Sequence) Addop(op Opcode, arg int32, loc ast.Pos)
func (s *Sequence) Insert(idx int, in Instr)
func (s *Sequence) ApplyLabelMap()
func (s *Sequence) AddNested(child *Sequence)
```

`Opcode` is generated from CPython's `Lib/_opcode_metadata.py` and
`Include/internal/pycore_opcode_metadata.h`. The opcode set is shared
between v0.5 and v0.6; the generator lives in `tools/opcodes_go` and
emits `compile/opcodes_gen.go`, plus `vm/opcodes_gen.go` once the v0.6
vm port lands.
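A sketch of how labels flow through the sequence, using string opcodes and an `IsJump` field in place of the generated `Opcode` type and the `hasTarget` predicate. All of this is a simplified model under those assumptions, not the ported code.

```go
package main

import "fmt"

type Instr struct {
	Op     string
	Arg    int // label id before ApplyLabelMap, instruction index after
	IsJump bool
}

type Sequence struct {
	Instrs   []Instr
	LabelMap []int // label id -> instruction index, -1 while unbound
}

// NewLabel allocates a fresh, unbound label.
func (s *Sequence) NewLabel() int {
	s.LabelMap = append(s.LabelMap, -1)
	return len(s.LabelMap) - 1
}

// UseLabel binds l to the next instruction to be emitted.
func (s *Sequence) UseLabel(l int) { s.LabelMap[l] = len(s.Instrs) }

func (s *Sequence) Addop(op string, arg int, isJump bool) {
	s.Instrs = append(s.Instrs, Instr{Op: op, Arg: arg, IsJump: isJump})
}

// ApplyLabelMap rewrites jump args from label ids to instruction
// indices, then drops the map so a second call is a no-op.
func (s *Sequence) ApplyLabelMap() {
	if s.LabelMap == nil {
		return
	}
	for i := range s.Instrs {
		if in := &s.Instrs[i]; in.IsJump {
			in.Arg = s.LabelMap[in.Arg]
		}
	}
	s.LabelMap = nil
}

func main() {
	var s Sequence
	end := s.NewLabel()
	s.Addop("JUMP", end, true) // forward jump to a not-yet-bound label
	s.Addop("NOP", 0, false)
	s.UseLabel(end)
	s.Addop("RETURN_VALUE", 0, false)
	s.ApplyLabelMap()
	s.ApplyLabelMap() // idempotent
	fmt.Println(s.Instrs[0].Arg)
}
```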

### codegen.go

> Detailed source-of-truth: `1626_gopy_codegen.md`. The summary below
> covers the cross-cutting picture; per-visitor file split, fblock
> stack, match panel, with-statement state machine, deferred
> annotations, PEP 695 type-parameter codegen, and the comprehensive
> per-visitor test plan live in 1626.

Port of `codegen.c` (~6500 lines). Walks each scope's AST and emits
into a `Sequence`. Per-node visitors mirror `codegen_visit_*`. The
naming convention is `genXxx` for the visitor of `Xxx`:

```go
func (c *compiler) genStmt(s ast.Stmt)
func (c *compiler) genExpr(e ast.Expr)
func (c *compiler) genCall(call *ast.Call)
// ... one per AST node kind.
```

Macro shorthands from CPython (`ADDOP`, `ADDOP_I`, `ADDOP_LOAD_CONST`,
`ADDOP_NAME`, `ADDOP_JUMP`, `ADDOP_COMPARE`) become methods on
`*compiler` that wrap `Sequence.Addop`.

Name resolution piggybacks on the symtable: the codegen looks up each
`Name` in the current `Entry.Symbols` to pick `LOAD_FAST` /
`LOAD_DEREF` / `LOAD_GLOBAL` / `LOAD_NAME`. Comprehensions inline into
the parent stream when they share a scope, and emit a nested sequence
otherwise.
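The load-opcode choice can be sketched as a small function of the resolved scope and whether the current block is function-like. The name `loadOp` is hypothetical, opcodes are shown as strings, and the class-body and inlined-comprehension special cases are left out.

```go
package main

import "fmt"

type Scope int

const (
	Local Scope = iota + 1
	GlobalExplicit
	GlobalImplicit
	Free
	Cell
)

// loadOp picks the load opcode for a Name in Load context.
func loadOp(scope Scope, inFunction bool) string {
	switch scope {
	case Free, Cell:
		return "LOAD_DEREF"
	case Local:
		if inFunction {
			return "LOAD_FAST"
		}
	case GlobalImplicit:
		if inFunction {
			return "LOAD_GLOBAL"
		}
	case GlobalExplicit:
		return "LOAD_GLOBAL"
	}
	// Module and class bodies fall back to the dictionary lookup.
	return "LOAD_NAME"
}

func main() {
	fmt.Println(loadOp(Local, true), loadOp(Local, false), loadOp(Free, true))
}
```

The store and delete variants follow the same table with the `STORE_*` / `DELETE_*` opcode families.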

The control-flow constructs (try/except/finally, with, async with,
match, comprehensions) are line-for-line ports. Each block-leaving
construct (return, break, continue, raise) honors the unwind protocol
by emitting the right `POP_BLOCK` / `RERAISE` sequence; this is the
trickiest part of codegen and is gated by a dedicated subset of the
CPython test suite (`test_compile`, `test_dis`).

### flowgraph.go

> Detailed source-of-truth: `1627_gopy_flowgraph.md`. The summary below
> lists the visible types and the optimisation set; per-pass file
> split, exact pass ordering, const-cache structure, super-instruction
> contract, optimizeLoadFast ref-stack panel, and the layered test
> plan live in 1627.

Port of `flowgraph.c`. Converts an instruction stream into a CFG of
basic blocks, runs peephole optimizations, then converts back.

```go
type Block struct {
    Instrs   []Instr
    Succ     []*Block
    Pred     []*Block
    StackEntry int32
    Reachable  bool
}

type CFG struct {
    Blocks []*Block
    Entry  *Block
}

func FromSequence(seq *Sequence) *CFG
func (g *CFG) Optimize()
func (g *CFG) ToSequence() *Sequence
func (g *CFG) ResolveJumps() // labels -> offsets
```

Optimizations (every one a CPython parity test target):

* Constant folding on `LOAD_CONST` chains.
* Jump threading (`JUMP -> JUMP -> X` becomes `JUMP X`).
* Conditional jump propagation (`POP_JUMP_IF_TRUE` over a block that
  ends in another conditional jump).
* Dead-code elimination (unreachable blocks and instructions after an
  unconditional terminator).
* Stack-effect verification.
* Push `RESUME` at function entry.
* Insert `LOAD_CONST None`, `RETURN_VALUE` for fall-through returns.

The order and exact predicates are taken from `flowgraph.c`. Lints
(`gocognit`, `gocyclo`) are exempted on the optimizer entry point
because the CPython source structure is the contract.
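As an illustration of one pass, here is jump threading on a flat instruction slice where targets are instruction indices. This is a simplified model under those assumptions; the real pass works on basic blocks and has extra predicates taken from `flowgraph.c`.

```go
package main

import "fmt"

type Instr struct {
	Op     string
	Target int // instruction index; only meaningful for JUMP
}

// threadJumps retargets every JUMP whose destination is itself a JUMP.
// The hop counter bounds the walk so jump cycles cannot loop forever.
func threadJumps(code []Instr) {
	for i := range code {
		if code[i].Op != "JUMP" {
			continue
		}
		t := code[i].Target
		for hops := 0; hops < len(code) && code[t].Op == "JUMP" && code[t].Target != t; hops++ {
			t = code[t].Target
		}
		code[i].Target = t
	}
}

func main() {
	code := []Instr{
		{Op: "JUMP", Target: 2},
		{Op: "NOP"},
		{Op: "JUMP", Target: 4},
		{Op: "NOP"},
		{Op: "RETURN_VALUE"},
	}
	threadJumps(code)
	fmt.Println(code[0].Target)
}
```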

### assemble.go

> Detailed source-of-truth: `1628_gopy_assemble.md`. The summary below
> shows the public type and the four-step plan; the location-table
> format dispatcher (PEP 626 lines, PEP 657 columns), the
> exception-table varint encoding, the localsplus / fastlocalskinds
> layout, and the marshal-parity test plan live in 1628.

Port of `assemble.c`. Lays out the final code object:

```go
type Assembler struct {
    Code      []byte
    LineTable []byte
    ExcTable  []byte
    Consts    []objects.Object
    Names     []string
}

func Assemble(scope *symtable.Entry, seq *Sequence, ff *future.Features) (*objects.Code, error)
```

Steps:

1. Lay out instructions in order, expanding wide args via
   `EXTENDED_ARG`.
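2. Emit the line/column location table in the adaptive varint format
   (PEP 626 line numbers, PEP 657 column offsets): short, one-line,
   long, none.
3. Emit the exception table (the zero-cost exception handling table
   introduced in CPython 3.11): start, end (encoded as a size),
   target, and stack depth packed with the lasti flag, each as a
   varint.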
4. Allocate constants, names, varnames, freevars, cellvars from the
   symtable.

The byte format is fixed by CPython 3.14 marshal, so the assembler is
testable against `dis.dis` output and against `marshal.loads` of a
CPython-produced .pyc.
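Step 1 can be sketched as follows. Each code unit is an opcode byte followed by one argument byte, and larger arguments are carried by `EXTENDED_ARG` prefixes. The opcode numbers below are placeholders rather than the real 3.14 values, and inline cache entries are omitted.

```go
package main

import "fmt"

// Placeholder opcode numbers for illustration only.
const (
	opExtendedArg byte = 0xFE
	opLoadConst   byte = 0x01
)

// emit appends op with arg, prefixing EXTENDED_ARG units for every
// byte above the lowest one, most significant first.
func emit(out []byte, op byte, arg uint32) []byte {
	for shift := 24; shift > 0; shift -= 8 {
		if arg>>shift != 0 {
			out = append(out, opExtendedArg, byte(arg>>shift))
		}
	}
	return append(out, op, byte(arg))
}

func main() {
	fmt.Printf("% x\n", emit(nil, opLoadConst, 0x1234))
}
```

Once a higher byte is non-zero, every lower prefix is emitted as well (even when its byte is zero), so the interpreter can rebuild the argument by shifting eight bits per prefix.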

### compiler.go

Top-level `compile.Compile`:

```go
func Compile(mod ast.Mod, filename string, optimize int) (*objects.Code, error)
```

Driver: `ast.Validate` -> `ast.Preprocess` -> `future.FromAST` ->
`symtable.Build` -> per-scope `codegen` -> `flowgraph.Optimize` ->
`assemble.Assemble`. Errors propagate as Go errors carrying the
SyntaxError string and position.

## Phasing inside v0.5

Because v0.5 is the largest single phase in the project, we split it
into commits sized to land independently. Each commit ships with
matched unit tests and is lint-clean. The cross-cut gate at
`v05test/gate_test.go` exercises the full pipeline and is the last to
land.

## Full v0.5 checklist

Status legend: `[x]` done, `[ ]` pending, `[~]` partial / scaffold,
`[n]` not applicable (handled elsewhere).

### 1. ast package

#### asdl runtime

* [x] `ast/asdl.go`: `Seq[T]`, `Len`, `Get`, `Set`, `NewSeq`. Mirrors
  `_Py_asdl_*_seq_new` and `asdl_seq_LEN/GET/SET`.
* [x] `ast/asdl_test.go`: empty seq, get/set round-trip, nil Len.

#### Hand-written node skeleton (foundation for early ports)

* [x] `ast/nodes.go`: `Pos`, `NoPos`, sealed `Mod`/`Stmt`/`Expr`
  interfaces, `Module`, `Interactive`, `Expression`, `FunctionType`,
  `TypeIgnore`, `ImportFrom`, `Alias`, `ExprStmt`, `Constant`,
  `IsDocString` (mirrors `_PyAST_GetDocString`).

#### asdl-driven Go generator

* [x] `tools/asdl_go/`: parser for `cpython/Parser/Python.asdl`.
* [x] `tools/asdl_go/`: emitter for Go structs and sealed-interface
  marker methods. Collision rule renames `stmt.Expr` to `ExprStmt`
  and single-ctor sums (`type_ignore.TypeIgnore`) to `<name>Node`.
* [x] `tools/asdl_go/main.go`: CLI with `-input`, `-output`.
* [x] `tools/asdl_go/main_test.go`: small-fixture parse and emit
  shape tests plus a real-asdl smoke test.

#### Generated nodes (output of the generator)

* [x] `ast/nodes_gen.go`: full mod/stmt/expr/pattern/excepthandler
  sums, type_param sum (PEP 695), product types (`arguments`, `arg`,
  `keyword`, `alias`, `withitem`, `match_case`), expr_context and
  operator enums folded in (CPython collapses these into the same
  asdl file). Hand-written stubs in `nodes.go` retired; only `Pos`,
  `NoPos`, and `IsDocString` remain there.

#### Validation

* [x] `ast/validate.go` plus `ast/validate_panel.go`: `Validate(mod
  ast.Mod) error` mirroring `_PyAST_Validate`.
  * [x] Position sanity (lineno ordering, no negative offsets except
    NoPos sentinel).
  * [x] Forbidden identifier names (`None`, `True`, `False`) via
    `validateName`.
  * [x] `Constant.Value` constrained to marshal-allowed kinds.
  * [x] Comprehension non-empty generators (`validateComprehension`).
  * [x] expr_context consistency via `validateExprCtx`
    (Name/Attribute/Subscript/Starred/List/Tuple ctx slots).
  * [x] `Starred` allowed only as Call arg or iterable element
    (default `validateExpr` rejects; `Call` and `validateLoadElts`
    permit in Load context).
  * [x] `Match` pattern shape rules (`validatePattern`,
    `validatePatternSeq`).
  * [x] PEP 695 type-parameter constraints (`validateTypeParams`).
* [x] `ast/validate_test.go`: foundation panel (positions, ImportFrom
  level, constant kinds, nil rejection). Expands with later
  validators.

#### Preprocess

* [x] `ast/preprocess.go`: `Preprocess(mod, opts) []Warning`. Mirrors
  `_PyAST_Preprocess` for CPython 3.14: a single tree walk that does
  PEP 765 plus the small set of folds 3.14 keeps in the AST layer.
  * [x] Full `astfold` walker descending every stmt / expr / pattern
    / type_param kind (parity with `astfold_*` cases in `ast_preprocess.c`).
  * [x] PEP 765 finally-block control-flow checks (`return`, `break`,
    `continue`).
  * [x] `string % tuple` printf-format fold to `JoinedStr` (via
    `fold_binop`).
  * [x] `Name("__debug__")` substitution by Constant(!optimize) in
    Load context.
  * [x] MatchValue / MatchMapping const folding (USub of Constant,
    Add/Sub of all-Constant operands).
  * [x] Docstring removal under `-OO` (replaces sole-stmt docstring
    with `Pass` per `remove_docstring`).
  * [x] Body re-wrap when a fold produces a leading string expr.
  * [x] PEP 563 annotation skip: `Arg.annotation`,
    `FunctionDef.returns`, `AsyncFunctionDef.returns`,
    `AnnAssign.annotation` not visited when
    `CO_FUTURE_ANNOTATIONS` is set in the future bits.
  * [x] `syntax_check_only` flag suppresses mutating folds.
  * [n] Arithmetic / Compare / BoolOp / tuple / frozenset constant
    folding lives in `flowgraph.c` LOAD_CONST chain folding for 3.14;
    not part of `ast_preprocess.c`. See section 7 below.
* [x] `ast/preprocess_test.go`: PEP 765 panel plus fold panels for
  format folding, `__debug__`, MatchValue numerics, docstring removal,
  PEP 563 annotation skip, syntax-check-only mode.

#### Unparse

* [x] `ast/unparse.go`: `Unparse(expr Expr) (string, error)`.
  * [x] Operator precedence parenthesization parity.
  * [x] Float infinity rendered as `1e309`.
  * [x] f-string / t-string round-trip.
  * [x] `FormattedValue`, `Interpolation`, `JoinedStr`, `TemplateStr`.
* [x] `ast/unparse_test.go`: golden-string panel matching CPython's
  `ast.unparse` output.

### 2. future package

* [x] `future/future.go`: `Features`, CO_FUTURE_* numeric flags
  matching `Include/cpython/code.h`, `FromAST`, `SyntaxError` error
  type, "not a chance" string parity for `from __future__ import
  braces`, "future feature %.100s is not defined" parity.
* [x] `future/future_test.go`: annotations, barry_as_FLUFL, no-op
  features, braces rejection, unknown rejection, docstring skip,
  stop-at-first-non-future, relative-import ignored, location
  tracking, Expression mode.

### 3. compile package: instruction sequence

* [x] `compile/instrseq.go`: `Sequence`, `Instr`, `ExceptHandlerInfo`,
  `JumpTargetLabel`, `Opcode`, `MaxOpcode=511`, `MaxOparg=1<<30`,
  `NewLabel`, `UseLabel`, `Addop`, `Insert`, `AddNested`,
  `SetAnnotationsCode`, `ApplyLabelMap`. Takes `hasTarget` predicate
  callback so it does not depend on opcode metadata yet.
* [x] `compile/instrseq_test.go`: label IDs, addop+UseLabel, idempotent
  ApplyLabelMap, non-jump preservation, Insert label shift, AddNested,
  except-handler resolution, SetAnnotationsCode panic, opcode/oparg
  range panics.

### 4. opcode generator and metadata

* [x] `tools/opcodes_go/`: parser for `cpython/Lib/_opcode_metadata.py`
  and `Include/internal/pycore_opcode_metadata.h`.
* [x] `tools/opcodes_go/main.go`: CLI emitting `compile/opcodes_gen.go`.
  vm/opcodes_gen.go waits on the v0.6 vm port.
* [x] `compile/opcodes_gen.go`: typed `Opcode` constants for the full
  3.14 opmap; `HasArg`, `HasConst`, `HasName`, `HasJump`, `HasFree`,
  `HasLocal`, `HasEvalBreak`, `HasDeopt`, `HasError`, `HasEscapes`,
  `HasExit`, `HasPure`, `HasPassthrough`, `HasOpargAnd1`,
  `HasErrorNoPop`, `HasNoSaveIp` predicates; `Name()` lookup;
  `HasTarget` shorthand for the label resolver.
* [~] `compile/opcodes_metadata_gen.go`: per-opcode stack-effect
  table (push/pop counts) and cache-line sizes. v0.5 hand-codes the
  stack-effect table inside `flowgraph_stackdepth.go`; the generated
  file lands when the bytecodes.c DSL generator does in v0.6.
* [x] `compile/opcodes_test.go`: opcode number cross-check, flag
  predicates, HasTarget panel, out-of-range Name.
* [x] Wire `Sequence.ApplyLabelMap` callers to pass
  `compile.HasTarget`.

### 5. symtable package

#### Data model

* [x] `symtable/types.go`: `Block`, `Scope`, `SymbolFlags` constants
  byte-equal to CPython (`DEF_GLOBAL=1`, `DEF_LOCAL=2`, `DEF_PARAM=4`,
  `DEF_NONLOCAL=8`, `USE=0x10`, etc., per
  `Include/internal/pycore_symtable.h`). Includes `DEF_FREE_CLASS`,
  `DEF_IMPORT`, `DEF_ANNOT`, `DEF_COMP_ITER`, `DEF_TYPE_PARAM`,
  `DEF_COMP_CELL`, `SCOPE_OFFSET`, `ScopeMask`. Block enum order
  matches `_Py_block_ty`. `ComprehensionType` enum byte-equal.
* [x] `symtable/entry.go`: `Entry` struct mirroring
  `PySTEntryObject` field-for-field, including the PEP 649
  `AnnotationBlock`, PEP 695 `MangledNames`, `CanSeeClassScope`,
  `HasConditionalAnnotations`, `InConditionalBlock`,
  `InUnevaluatedAnnotation`, `CompIterExpr`, `Method`, `HasDocstring`.
  `Directive {Name, Loc}` records for global / nonlocal locations.
  Methods: `IsFunctionLike`, `GetSymbol`, `GetScope`.
* [x] `symtable/table.go`: `Table` struct (Top, Blocks, Future,
  Filename) with `Lookup(key any)`. Map key is `any` because no
  `ast.Node` interface exists.
* [x] `symtable/mangle.go`: `Mangle(private, name)` /
  `MaybeMangle(private, ste, name)` PEP 8 private name mangling.

#### Build pass (block visit)

* [x] `symtable/build.go`: `Build(mod ast.Mod, filename string, ff
  *future.Features) (*Table, error)`. Driver dispatches Module /
  Interactive / Expression bodies; `FunctionType` returns an explicit
  error to match CPython. `builder` carries `table`, `cur`, `stack`,
  `private`, `filename`, `future`, `nextID`. Helpers: `enterBlock`,
  `enterExisting`, `exitBlock`, `addDef`, `addDefCtx`,
  `addDefHelper`, `checkName`, `lookup`, `lookupEntry`,
  `recordDirective`, `allowsTopLevelAwait` (stub returns false),
  `isAsyncDef`.
* [x] `symtable/build_visit.go`: `visitStmt` plus per-kind helpers
  (`visitFunctionLike`, `visitClassDef`, `visitTypeAlias`,
  `visitReturn`, `visitAssign`, `visitAnnAssign`, `visitForLike`,
  `visitWhile`, `visitIf`, `visitMatch`, `visitRaise`, `visitTryLike`,
  `visitAssert`, `visitWithLike`, `visitGlobal`, `visitNonlocal`,
  `checkImportFrom`, `maybeSetCoroutineForModule`). Split into
  `visitStmtDef` / `visitStmtControl` / `visitStmtSimple` to satisfy
  gocyclo. `funcLike` struct + `unpackFuncLike` factor the
  FunctionDef / AsyncFunctionDef variants. `annotationKey {parent,
  index}` distinguishes adjacent annotations.
* [x] `symtable/build_expr.go`: `visitExpr` split into
  `visitExprComp` / `visitExprUnary` / `visitExprLeaf`. Per-kind
  `visitName`, `visitYield`, `visitAwait`, `visitCall`, `visitLambda`,
  `raiseIfAnnotationBlock`, `raiseIfComprehensionBlock`,
  `visitFormattedValue`, `visitSliceParts`. `visitName` recognises
  `super` and pulls `__class__` into scope.
* [x] `symtable/build_helpers.go`: `visitArguments`, `visitParams`,
  `visitArgAnnotations`, `visitAnnotations`, `visitAnnotation`,
  `visitAlias`, `visitExceptHandler`, `visitWithItem`,
  `visitMatchCase`, `visitPattern`, `visitPatternSeq`,
  `checkKwdPatterns`, `visitKeyword`, `visitTypeParam`,
  `visitTypeParamSubexpr`, `enterTypeParamBlock`. Synthetic
  `.type_params`, `.generic_base`, `.defaults`, `.kwdefaults` slots
  match CPython.
* [x] `symtable/build_comp.go`: `handleNamedExpr` walrus retarget,
  `extendNamedExprScope` walking the stack reverse with the class /
  type-block rejections, `handleComprehension` /
  `runComprehensionBody` / `comprehensionTypeFor` for list / set /
  dict / generator forms, `visitComprehensionIter`, `implicitArg`
  for the `.0` parameter.

#### Analysis pass

* [x] `symtable/analyze.go`: top-level `analyze` plus `analyzeBlock`,
  `analyzeChildBlock`, `analyzeName`, `analyzeCells`, `dropClassFree`,
  `updateSymbols`, `inlineComprehension`, `isFreeInAnyChild`,
  `errorAtDirective`. Splits the original 200-line `analyze_block`
  into `prepareClassPreambleSets`, `finalizeChildSets`,
  `analyzeChildren`, `pickClassEntry`, `spliceInlinedChildren` to
  satisfy gocognit. `nameSet` is a small `map[string]struct{}`
  helper that mirrors PySet operations (add / discard / contains /
  clone / union).

#### Errors

* [x] `symtable/errors.go`: `SyntaxError {Msg, Filename, Pos}`
  implementing the `error` interface. CPython error strings preserved
  verbatim in package-level constants:
  * [x] `msgGlobalParam`, `msgNonlocalParam`,
    `msgGlobalAfterAssign`, `msgNonlocalAfterAss`,
    `msgGlobalAfterUse`, `msgNonlocalAfterUse`, `msgGlobalAnnot`,
    `msgNonlocalAnnot`, `msgImportStar`.
  * [x] `msgNamedExprComp`, `msgNamedExprBound`,
    `msgNamedExprAlias`, `msgNamedExprParam`, `msgNamedExprConflict`,
    `msgNamedExprInner`, `msgNamedExprIterExpr`.
  * [x] `msgAnnotationNotAllow`, `msgNotAllowedInTypVar`,
    `msgNotAllowedInAlias`, `msgNotAllowedInParams`,
    `msgDupTypeParam`, `msgDupArgument`.
  * [x] `msgAsyncWithOutside`, `msgAsyncForOutside`,
    `msgAsyncCompOutside`, `msgAwaitOutsideFunc`,
    `msgAwaitOutsideAsync`.
  * [x] `msgAssignDebug`, `msgDeleteDebug`, `msgFutureLate`.
  * [x] `msgNonlocalAtModule`, `msgNoBindingNonlocal`,
    `msgNonlocalGlobal`, `msgNonlocalTypeParam`.
  * [x] `errorf(filename, loc, format, args...)` constructor.

#### Tests

* [x] `symtable/symtable_test.go`: module / function / class / nested
  closure / global directive / nonlocal-at-module / nonlocal-no-binding
  / global-after-use / `__debug__` rejection / class method flag /
  private-name mangling in class / list comprehension / import binding
  / star-import outside module / duplicate argument / SyntaxError
  shape.
* [x] `symtable/mangle_test.go`: full panel for `Mangle` and
  `MaybeMangle` allow-list.

### 6. compile package: codegen

> Detailed source-of-truth for this section: `1626_gopy_codegen.md`.
> The checklist below is the cross-cutting view. The per-visitor file
> split, function citations, fblock-stack types, super-instruction
> contract, with-statement state machine, deferred-annotation panel,
> PEP 695 type-parameter codegen, and the layered test plan all live
> in 1626. Tick a box here only after the matching detailed checklist
> in 1626 is also ticked.

#### Driver

* [x] `compile/codegen.go`: `(*Compiler).Codegen(scope *symtable.Entry,
  mod ast.Mod) (*Unit, error)`.
* [x] `compile/codegen.go`: `Compiler` struct (current scope stack,
  unit stack, future flags, filename, source, optimize level,
  symtable, const cache).
* [x] `compile/codegen_addop.go`: addOp / addOpI / addOpJump /
  loadConst / addOpName / useLabel helpers as methods on `*Compiler`.
  Macros land as Go methods rather than a separate macros.go.

#### Statement visitors

* [~] FunctionDef / AsyncFunctionDef (decorators, posonly, kwonly,
  defaults, kwonly defaults, varargs, varkw, closure cells landed.
  PEP 695 type parameters and PEP 649 deferred annotations are
  separate panels later in 1626).
* [x] ClassDef (basic shape: bases, keyword args, decorators).
  Type parameters / __classcell__ / static-attributes panels land
  alongside PEP 695 + super().
* [x] Return, Delete, Assign, AugAssign, AnnAssign (full target panel
  including attr / subscript / tuple / star unpack).
* [x] For / AsyncFor (with break/continue unwind).
* [x] While.
* [x] If (skeleton: constant-condition specialization pending).
* [x] With / AsyncWith (with-statement state machine).
* [x] Match / MatchValue / MatchSingleton / MatchSequence /
  MatchMapping / MatchClass / MatchStar / MatchAs / MatchOr.
* [x] Raise (with `from`).
* [x] Try / TryStar (PEP 654).
* [x] Assert (with `__debug__` short-circuit).
* [x] Import / ImportFrom (with star).
* [x] Global / Nonlocal (no-op at codegen, already in symtable).
* [x] Expr (top-level expression: CALL_INTRINSIC_1 INTRINSIC_PRINT in interactive mode).
* [x] Pass / Break / Continue (loop fblock walk; full unwind
  through with / try lands alongside those visitors).
* [x] Delete / AugAssign / AnnAssign.
* [x] TypeAlias (PEP 695, lowered as
  `LOAD_CONST <name>, LOAD_CONST None, <value>, CALL_INTRINSIC_1
  INTRINSIC_TYPEALIAS=11, STORE_NAME`).

#### Expression visitors

* [x] BoolOp (short-circuit jumps).
* [x] NamedExpr (walrus).
* [x] BinOp.
* [x] UnaryOp.
* [x] Lambda.
* [x] IfExp.
* [x] Dict, Set, List, Tuple displays.
* [x] ListComp, SetComp, DictComp, GeneratorExp.
* [x] Await, Yield, YieldFrom.
* [x] Compare (chained compares).
* [x] Call (with star/starstar args, keyword args).
* [x] FormattedValue, JoinedStr.
* [~] Interpolation, TemplateStr (PEP 750 t-strings; deferred).
* [x] Constant (LOAD_CONST + co_consts allocation).
* [x] Attribute (LOAD_ATTR + super-instruction LOAD_SUPER_ATTR).
* [x] Subscript (LOAD/STORE/DELETE).
* [x] Starred (in target / arg positions).
* [x] Name (LOAD_FAST / LOAD_DEREF / LOAD_GLOBAL / LOAD_NAME by
  scope).
* [x] Slice.

#### Block / unwind machinery

* [x] Frame block stack (`pushFblock`, `popFblock`, unwind helpers in
  `codegen_fblock.go`).
* [x] Try/except/finally exception table emission.
* [x] With unwinding (call `__exit__` on every exit path).
* [x] Async-with unwinding (await `__aexit__`).
* [~] Generator return-value handling (RETURN_GENERATOR + RESUME
  prologue landed; full StopIteration packaging refined in v0.6 vm).
* [~] Coroutine close handling (CoCoroutine flag set; close protocol
  landed in v0.6 vm).
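
A minimal sketch of how a frame block stack can drive `break` unwinding. The types and method names here (`Fblock`, `FblockStack`, `unwindForBreak`) are illustrative assumptions, not the contents of `codegen_fblock.go`, and cleanup actions are modeled as strings rather than emitted instructions.

```go
package main

import "errors"

// FblockKind is a hypothetical tag for the construct that owns a frame block.
type FblockKind int

const (
	FblockWhileLoop FblockKind = iota
	FblockForLoop
	FblockWith
	FblockFinallyTry
)

// Fblock records one enclosing construct that needs cleanup on early exit.
type Fblock struct {
	Kind FblockKind
	Exit string // label to jump to on break (loops only)
}

// FblockStack is the per-unit stack of enclosing constructs.
type FblockStack struct{ blocks []Fblock }

func (s *FblockStack) push(b Fblock) { s.blocks = append(s.blocks, b) }

// pop removes the innermost block and checks that it has the expected kind.
func (s *FblockStack) pop(kind FblockKind) error {
	n := len(s.blocks)
	if n == 0 || s.blocks[n-1].Kind != kind {
		return errors.New("fblock stack mismatch")
	}
	s.blocks = s.blocks[:n-1]
	return nil
}

// unwindForBreak walks from the innermost block outward, collecting the
// cleanup each block needs, and stops at the nearest loop. It returns the
// cleanup steps in emission order plus the loop's exit label.
func (s *FblockStack) unwindForBreak() (cleanup []string, exit string, err error) {
	for i := len(s.blocks) - 1; i >= 0; i-- {
		b := s.blocks[i]
		switch b.Kind {
		case FblockWhileLoop:
			return cleanup, b.Exit, nil
		case FblockForLoop:
			// Leaving a for loop early also discards its iterator.
			return append(cleanup, "pop iterator"), b.Exit, nil
		case FblockWith:
			cleanup = append(cleanup, "call __exit__")
		case FblockFinallyTry:
			cleanup = append(cleanup, "run finally body")
		}
	}
	return nil, "", errors.New("'break' outside loop")
}
```

The walk does not pop blocks: the visitors that pushed them still pop them on their normal exit path.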

#### Tests

* [x] `compile/codegen_test.go`: per-statement smoke tests against a
  parser-stub feeding hand-built ASTs.
* [x] Comprehension scope test (free var capture).
* [x] Match-statement panel (one test per pattern kind).
* [x] TypeAlias codegen test pinned via the `type_alias` golden
  (`v05test/testdata/golden/type_alias.golden`).
* [~] PEP 695 type-parameter codegen test for TypeVar / ParamSpec /
  TypeVarTuple bodies. Validator path covered by
  `validateTypeParams`; full codegen golden lands alongside the
  generic-class panel in v0.6.

### 7. compile package: flowgraph

> Detailed source-of-truth for this section: `1627_gopy_flowgraph.md`.
> The checklist below is the cross-cutting view. The per-pass file
> split, exact pass ordering inside `OptimizeCodeUnit`, const-cache
> structure, optimizeLoadFast ref-stack panel, and the layered test
> plan live in 1627.

* [~] `compile/flowgraph.go`:
  * [x] `BasicBlock` struct (Instrs, Next, Label, StartDepth,
    Predecessors, Visited, Cold, Warm, Reachable).
  * [x] `Builder` struct (Head, Tail, labelMap).
  * [x] `FromSequence(*Sequence) (*Builder, error)`.
  * [~] `Optimize(*Sequence, *[]any, nlocals, firstLineno) (*Info, error)`:
    runs the v0.5 subset of `_PyCfg_OptimizeCodeUnit` against the flat
    sequence. CFG-driven passes are queued for the follow-on.
  * [x] `(*Builder).ToSequence() (*Sequence, error)`.
  * [x] Jump-label resolution via `Sequence.ApplyLabelMap(HasTarget)`.

#### Optimization passes (each is a parity target)

* [x] LOAD_CONST chain folding (multi-step). `Optimize` now drives
  `foldBinaryIntConst` plus `eliminateDeadCodeAfterTerminator` to a
  fixed point so a fold that exposes a new triple gets caught in the
  same pass.
* [x] Int-int `BINARY_OP` folding (`foldBinaryIntConst`).
* [x] Jump threading (`JUMP -> JUMP -> X`) via `threadJumps`.
* [x] Conditional-jump propagation
  (`POP_JUMP_IF_TRUE` over a tail-conditional block) via
  `propagateConditionalJumps`.
* [x] Unreachable block elimination via `removeUnreachableBlocks`
  (DFS reachability with handler labels pinned as roots; length
  preserving so the label map and jump opargs stay valid).
* [x] Dead-code elimination after unconditional terminators.
* [~] Stack-effect verification (forward linear scan via
  `calculateStackdepth`; CFG-based variant pending).
* [x] `RESUME` insertion at entry (codegen prologue).
* [x] Implicit `LOAD_CONST None` / `RETURN_VALUE` for fall-through
  function returns.
* [x] EXTENDED_ARG insertion for wide opargs (assemble-time).
* [ ] Push-null fix-up for super-instruction call sites.
* [x] Redundant-NOP compaction (`removeRedundantNops`).
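
The fixed-point idea behind the LOAD_CONST chain folding entry can be sketched on a toy instruction slice. This is an illustration under simplifying assumptions, not `foldBinaryIntConst` itself: `Instr`, `OpAdd`, and `OpMul` are hypothetical, constants are stored inline as Go `int` values (Python ints are arbitrary precision), and the sketch shrinks the slice instead of preserving its length.

```go
package main

// Instr is a toy instruction. For LOAD_CONST, Arg holds the int value
// itself; for BINARY_OP, Arg holds one of the operator codes below.
type Instr struct {
	Op  string
	Arg int
}

// Hypothetical operator codes for this sketch only.
const (
	OpAdd = iota
	OpMul
)

// foldOnce rewrites the first LOAD_CONST, LOAD_CONST, BINARY_OP triple it
// finds into a single LOAD_CONST and reports whether anything changed.
func foldOnce(code []Instr) ([]Instr, bool) {
	for i := 0; i+2 < len(code); i++ {
		a, b, op := code[i], code[i+1], code[i+2]
		if a.Op != "LOAD_CONST" || b.Op != "LOAD_CONST" || op.Op != "BINARY_OP" {
			continue
		}
		var v int
		switch op.Arg {
		case OpAdd:
			v = a.Arg + b.Arg
		case OpMul:
			v = a.Arg * b.Arg
		default:
			continue // unknown operator: leave the triple alone
		}
		out := append([]Instr{}, code[:i]...)
		out = append(out, Instr{Op: "LOAD_CONST", Arg: v})
		out = append(out, code[i+3:]...)
		return out, true
	}
	return code, false
}

// foldToFixedPoint repeats foldOnce until no triple is left, so a fold
// that exposes a new triple is picked up on the next iteration.
func foldToFixedPoint(code []Instr) []Instr {
	for {
		next, changed := foldOnce(code)
		if !changed {
			return code
		}
		code = next
	}
}
```

For `1 + 2 * 3` the first iteration folds the multiplication, which exposes the addition triple for the second iteration.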

#### Tests

* [x] `compile/flowgraph_test.go` and `flowgraph_passes_test.go`:
  hand-built sequences exercising fold / dead-code / NOP compact /
  label resolution / stack depth.
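
As an illustration of the kind of hand-built sequence these tests rely on, here is a toy jump-threading pass. It is a sketch under assumptions, not `threadJumps` itself: `Instr` and its index-based `Target` field are hypothetical, and targets are assumed to be valid indices.

```go
package main

// Instr is a toy instruction; Target is the index of the jump
// destination and is ignored for non-jump instructions.
type Instr struct {
	Op     string
	Target int
}

// threadJumps retargets every JUMP whose destination is itself a JUMP so
// that it points at the final destination. The pass is length preserving:
// it only rewrites targets and never removes instructions.
func threadJumps(code []Instr) {
	for i := range code {
		if code[i].Op != "JUMP" {
			continue
		}
		t := code[i].Target
		// Bound the number of hops so a cycle of jumps cannot loop forever.
		for hops := 0; hops < len(code) && code[t].Op == "JUMP"; hops++ {
			t = code[t].Target
		}
		code[i].Target = t
	}
}
```

After threading, the intermediate jumps may become unreachable, which is where a pass like unreachable-block elimination would take over.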

### 8. compile package: assemble

> Detailed source-of-truth for this section: `1628_gopy_assemble.md`.
> The checklist below is the cross-cutting view. The PEP 626
> line-table dispatcher, the exception-table varint encoding, the
> co_localsplus / fastlocalskinds layout, and the marshal-parity test
> plan live in 1628.

* [x] `compile/assemble.go`:
  * [x] Assembler internals (Code, LineTable, ExceptionTable, Consts,
    Names, Varnames, Freevars, Cellvars).
  * [x] `Assemble(seq *Sequence, info *Info, unit *Unit, filename string) (*Code, error)`.
* [x] EXTENDED_ARG wide-oparg expansion.
* [x] PEP 626 location table (short / one-line / long / no-location /
  no-column varint forms in `assemble_locations.go`).
* [x] Zero-cost exception table (start, end, target, depth, lasti as
  6-bit varint deltas in `assemble_exceptions.go`).
* [x] Constants table allocation (de-dup by `(typeTag, value)` with
  float bit-pattern keying for NaN-safety in `codegen_addop.go`).
* [x] Names / Varnames / Freevars / Cellvars population from
  symtable.
* [x] `co_flags` assembly (CoOptimized, CoNewLocals, CoVarargs,
  CoVarkeywords, CoNested, CoGenerator, CoNoFree, CoCoroutine,
  CoMethod). CoIterableCoroutine and CoAsyncGenerator land alongside
  the generator/coroutine state machines.
* [x] `co_qualname` build-up via `buildQualname` walking the unit
  stack (top-level / class parent / function `<locals>` parent).
* [x] `co_code` byte emission.
* [x] `compile/assemble_test.go` plus `assemble_locations_test.go`,
  `assemble_exceptions_test.go`, `assemble_flags_test.go`:
  * [x] EXTENDED_ARG widening.
  * [x] Location table varint round-trip.
  * [x] Exception table varint round-trip.
  * [x] Const dedup parity with CPython (type-keyed).

### 9. compile package: driver

* [x] `compile/compiler.go`:
  * [x] `Compile(mod ast.Mod, filename string, optimize int) (*Code, error)`.
  * [x] Pipeline order: future.FromAST -> symtable.Build -> Codegen
    (per-scope) -> Optimize -> Assemble. Validate / Preprocess wire
    in alongside the parser handover; the v0.5 entry is parser-stub
    fed.
  * [~] `optimize` levels -1, 0, 1, 2 mirror CPython (level threaded
    through; level-2 docstring removal lands in preprocess; full
    `-O` panel beyond docstring/assert lands in v0.6).
* [x] `compile/compiler_test.go` + the `v05test` gate package:
  end-to-end on hand-built ASTs.

### 10. tokenize package skeleton (1665)

* [x] `tokenize/types.go`: hand-written `Type` declaration plus
  `String()` lookup. The numeric constants live in the generated file.
* [x] `tools/tokens_go/`: generator from `Grammar/Tokens` plus
  `Include/internal/pycore_token.h` plus `Lib/token.py`.
* [x] `tokenize/types_gen.go`: generator output. 69 token kinds
  (ENDMARKER=0..ENCODING=68) plus `tokenNames` table.
* [~] `tokenize/tokenize.go`: skeleton (`Iter`, `Token`, `New`,
  `NewReadline`, `Next`) lands with the v0.9 lexer port; v0.5 ships
  the type table only since the gate uses hand-built ASTs.
* [x] `tokenize/tokenize_test.go`: type numeric pinning plus
  `Type.String` lookup. Iterator contract tests land in v0.9.
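
The split between the hand-written `Type` declaration and the generated constants and `tokenNames` table can be sketched in one file. Only the first seven token kinds are shown, and the out-of-range fallback format is an assumption of this sketch.

```go
package main

import "strconv"

// Type is the token kind. In the layout described above this declaration
// is hand-written and the constants below come from the generator.
type Type int

const (
	ENDMARKER Type = iota
	NAME
	NUMBER
	STRING
	NEWLINE
	INDENT
	DEDENT
)

// tokenNames maps each kind to its display name (generated in practice).
var tokenNames = [...]string{
	ENDMARKER: "ENDMARKER",
	NAME:      "NAME",
	NUMBER:    "NUMBER",
	STRING:    "STRING",
	NEWLINE:   "NEWLINE",
	INDENT:    "INDENT",
	DEDENT:    "DEDENT",
}

// String returns the token name, or a numeric placeholder for values
// outside the table.
func (t Type) String() string {
	if t >= 0 && int(t) < len(tokenNames) {
		return tokenNames[t]
	}
	return "Type(" + strconv.Itoa(int(t)) + ")"
}
```

A numeric pinning test then only needs to compare each constant against its expected integer and each `String()` result against its name.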

### 11. dis-equivalent disassembler (gate support)

* [x] `compile/dis.go`: `Disassemble(co *Code) string` rendering
  bytecode plus recursive headers for nested code objects. Used by
  the v05test gate for structural assertions.
* [x] `compile/dis_test.go`: opcode rendering, oparg display,
  EXTENDED_ARG recombination, nested-code header. Byte-equal
  comparison against a CPython golden capture lands with the marshal
  package.
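
EXTENDED_ARG recombination in a disassembler can be sketched as follows. The opcode value `opExtendedArg` is a placeholder chosen for this example (the real number comes from the generated opcode metadata), `Decoded` and `decode` are hypothetical names, and the sketch ignores inline cache entries.

```go
package main

// opExtendedArg is a placeholder opcode number for this sketch only.
const opExtendedArg = 0xFE

// Decoded is one logical instruction after prefix recombination.
type Decoded struct {
	Offset int  // byte offset of the instruction that carries the opcode
	Op     byte // opcode of the real instruction
	Arg    int  // full oparg with all EXTENDED_ARG prefixes folded in
}

// decode walks 2-byte code units and folds each run of EXTENDED_ARG
// prefixes into the oparg of the instruction that follows them.
func decode(code []byte) []Decoded {
	var out []Decoded
	ext := 0
	for i := 0; i+1 < len(code); i += 2 {
		op, arg := code[i], int(code[i+1])
		if op == opExtendedArg {
			// Each prefix contributes the next-higher byte of the oparg.
			ext = (ext | arg) << 8
			continue
		}
		out = append(out, Decoded{Offset: i, Op: op, Arg: ext | arg})
		ext = 0
	}
	return out
}
```

Two prefixes with opargs 1 and 2 followed by an instruction with oparg 3 recombine to `0x010203`.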

### 12. v05test cross-cut gate

* [~] `v05test/gate_test.go`: structural panel landed; the byte-equal
  marshal-roundtrip variant waits on the marshal package and golden
  corpus.
  * [x] `TestGateEmptyModule`: empty module returns `None`.
  * [x] `TestGateSimpleAssign`: `x = 1`.
  * [x] `TestGateBinaryAdd`: `a = 1 + 2` (asserts the int-int fold).
  * [x] `TestGateLoadAfterStore`: `x = 1; x`.
  * [x] `TestGateIfWhile`: `if`/`while` panel (asserts
    POP_JUMP_IF_FALSE; the JUMP back-edge is asserted once pseudo-op
    lowering lands).
  * [~] `TestGateTryExcept`: wired but `t.Skip`'d pending CFG-based
    stack-depth (handler entry seeding).
  * [x] `TestGateDef`: `def f(x): return x + 1`.
  * [~] `TestGateComprehension`: wired but `t.Skip`'d pending CFG
    back-edge stack-depth.
  * [x] `TestGateAsyncFunction`: `async def f(): pass` (asserts
    CoCoroutine on the inner code object).
  * [n] `TestGateMarshalRoundtrip`: marshal-byte parity against
    CPython is deferred to v0.8 with the import system. Until then
    the disassembly-text golden corpus (1629) is the gate. The v0.8
    follow-on adds a code-object marshal arm and a
    byte-for-byte panel against host CPython.
* [x] `v05test/testdata/golden/`: ten checked-in `.golden`
  disassembly snapshots (`empty_module`, `simple_assign`,
  `binary_add`, `load_after_store`, `if_pass`, `while_pass`,
  `def_add_one`, `async_def_pass`, `class_pass`, `type_alias`).
  Refresh contract via `go test ./v05test/ -update -run TestGolden`;
  see 1629 for the corpus rules.

### 13. Release plumbing for v0.5.0

* [x] CHANGELOG: v0.5.0 entry.
* [x] `changelog/v0.5.0.md`: full release notes.
* [x] `build/version.go`: bump to `0.5.0` for the release commit.
* [x] PR with all-green CI (lint + test on macOS, Linux, Windows).
* [ ] Tag `v0.5.0` and create GitHub release. Pending explicit
  release go-ahead.
* [ ] Bump `main` to `0.6.0-dev` post-release.

### 14. Docs and side artifacts

* [x] Update `1602_gopy_filemap.md` with the v0.5 entries.
* [x] Update `1603_gopy_roadmap.md` to reflect what landed for v0.5:
  CFG-driven optimisation passes, validate panel, TypeAlias codegen,
  disassembly golden corpus (1629). The "shipped" marker flips with
  the v0.5.0 tag.
* [n] Cross-reference `1690_gopy_quirks.md` when codegen or assemble
  decisions diverge from a literal C port. 1690 is reserved (the v0.5
  port has no codegen / assemble divergences worth recording).

## Working notes (carry forward)

* The `ast/nodes.go` hand-written file is a temporary scaffold so that
  `future` and other early-v0.5 ports compile. It must shrink once
  `nodes_gen.go` lands; only `Pos`, `NoPos`, and `IsDocString` should
  remain in non-generated files.
* `compile.Sequence.ApplyLabelMap` takes a `hasTarget` predicate
  callback to keep `instrseq.go` independent of the opcode metadata.
  Once `compile/opcodes_gen.go` lands, callers should pass
  `compile.HasTarget` directly (a generated helper), and a thin
  wrapper `(*Sequence).ApplyLabels()` may be added for ergonomics.
* The asdl generator and the opcode generator share style: both read a
  CPython-source-tree input file at `go generate` time and emit a
  checked-in `_gen.go`. Generators live under `tools/` and are not
  compiled into the runtime binary.
* The dis-equivalent in `compile/dis.go` is a tool, not part of the
  runtime API surface; it lives in the `compile` package only because
  it needs the opcode metadata. The Python-visible `dis` module port
  comes much later (stdlib effort).
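
The predicate-callback design noted for `ApplyLabelMap` can be sketched as below. The `Sequence` and `Instr` shapes, the map-based label table, and the error behavior are assumptions for illustration and are not taken from `instrseq.go`.

```go
package main

import "fmt"

// Instr is a toy instruction. Before resolution, Arg holds a label ID for
// jump-like opcodes; afterwards it holds an instruction index.
type Instr struct {
	Op  int
	Arg int
}

// Sequence is a flat instruction list plus a label table that maps a
// label ID to an instruction index.
type Sequence struct {
	Instrs []Instr
	Labels map[int]int
}

// ApplyLabelMap rewrites label-ID opargs to instruction indices for every
// opcode that hasTarget accepts. Passing the predicate in keeps this code
// free of any dependency on opcode metadata.
func (s *Sequence) ApplyLabelMap(hasTarget func(op int) bool) error {
	for i := range s.Instrs {
		in := &s.Instrs[i]
		if !hasTarget(in.Op) {
			continue
		}
		idx, ok := s.Labels[in.Arg]
		if !ok {
			return fmt.Errorf("instr %d: unbound label %d", i, in.Arg)
		}
		in.Arg = idx
	}
	return nil
}
```

A test can supply a closure over a couple of made-up opcodes, while production callers would pass the generated helper.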

## Gate

```go
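src := "a = 1 + 2"
mod, err := parser.Parse(src) // returns (ast.Mod, error)
if err != nil { t.Fatal(err) }
code, err := compile.Compile(mod, "<gate>", 0)
if err != nil { t.Fatal(err) }
got := compile.Disassemble(code)
want := "<output captured from CPython 3.14 dis.dis>"
if got != want { t.Fatalf("dis mismatch:\n got: %s\nwant: %s", got, want) }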
```

Plus byte-equal checks on `co_code`, `co_linetable`, and
`co_exceptiontable` for a small panel of programs (assignment, if,
while, try/except, def, comprehension, async function).

## Out of scope for v0.5

* PEG parser (`cpython/Parser/`). Separate spec series.
* Tier-2 optimizer (`optimizer*.c`). Lands in v0.12.
* Specialization (`specialize.c`). Lands in v0.11.
* Instrumentation hooks. Lands in v0.11.
* JIT (`jit.c`). Indefinitely deferred.

