# gopy bytecodes DSL

# 1621. Bytecodes DSL and Go-emitting generator

## What we are porting

CPython's bytecode interpreter is generated. The source of truth
is `Python/bytecodes.c`, a C file written in a small DSL. The
upstream generator lives in `Tools/cases_generator/` (Python),
walks the DSL, and emits five C headers:

| Generated file                              | Consumer                  |
|---------------------------------------------|---------------------------|
| `Python/generated_cases.c.h`                | `Python/ceval.c`          |
| `Python/executor_cases.c.h`                 | `Python/ceval.c` (Tier-2 interpreter) |
| `Python/optimizer_cases.c.h`                | `Python/optimizer_analysis.c` (Tier-2 analysis) |
| `Include/internal/pycore_opcode_metadata.h` | shared metadata           |
| `Include/internal/pycore_uop_metadata.h`    | Tier-2 micro-op metadata  |

For v0.6 we need just two: the Tier-1 dispatch handlers and the
shared metadata. The Tier-2 outputs ship in v0.12.

## Strategy

Same shape as 1642 (parser_gen): hand-port the DSL parser to Go
and write a Go-emitting backend. We do **not** wrap the upstream
Python generator. Two reasons:

1. The output we want (Go switch arms calling typed object
   helpers) is structurally different from the C output (computed
   gotos with macro-expanded stack ops). A backend swap is more
   surgical than wrapping plus translating.
2. CPython updates `bytecodes.c` nearly every release. Owning the
   parser end-to-end means a CPython rebase is one regeneration
   plus a drift check, not a transitive Python toolchain dep.

The DSL itself is small, well-documented in
`Tools/cases_generator/parsing.py`, and stable across recent
releases.

## DSL surface

A `bytecodes.c` entry looks like:

```c
inst(BINARY_OP_ADD_INT, (left, right -- res)) {
    DEOPT_IF(!PyLong_CheckExact(left));
    DEOPT_IF(!PyLong_CheckExact(right));
    STAT_INC(BINARY_OP, hit);
    res = _PyLong_Add((PyLongObject *)left, (PyLongObject *)right);
    DECREF_INPUTS();
    ERROR_IF(res == NULL, error);
}
```

Key forms:

* `inst(NAME, (inputs -- outputs)) { body }`: a real instruction.
* `op(NAME, (inputs -- outputs)) { body }`: a fragment, composed
  by `macro` into instructions.
* `macro(NAME) = OP1 + OP2;`: composition.
* `pseudo(NAME, ...)`: lowered before assembly; never executes.
* `family(NAME, COUNTER) = { BASE, ADAPTIVE_1, ADAPTIVE_2 };`:
  specialization grouping.
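
To make the `macro` form concrete, here is a minimal sketch of parsing one declaration and inlining its fragments. It assumes a single-line declaration and plain-string bodies; `macroDecl`, `parseMacro`, and `expand` are hypothetical names for illustration, not the real `tools/bytecodes_gen` API, and the op names in `main` are made up. Members containing `/` (cache-effect entries such as `unused/1` in upstream macros) are skipped rather than expanded.

```go
package main

import (
	"fmt"
	"strings"
)

// macroDecl is a hypothetical AST node for `macro(NAME) = OP1 + OP2;`.
type macroDecl struct {
	Name string
	Ops  []string
}

// parseMacro parses a single-line macro declaration into its name and members.
func parseMacro(src string) (macroDecl, error) {
	src = strings.TrimSpace(src)
	if !strings.HasPrefix(src, "macro(") || !strings.HasSuffix(src, ";") {
		return macroDecl{}, fmt.Errorf("not a macro declaration: %q", src)
	}
	rest := strings.TrimSuffix(strings.TrimPrefix(src, "macro("), ";")
	name, rhs, ok := strings.Cut(rest, ")")
	if !ok {
		return macroDecl{}, fmt.Errorf("missing ')' in %q", src)
	}
	_, ops, ok := strings.Cut(rhs, "=")
	if !ok {
		return macroDecl{}, fmt.Errorf("missing '=' in %q", src)
	}
	d := macroDecl{Name: strings.TrimSpace(name)}
	for _, part := range strings.Split(ops, "+") {
		part = strings.TrimSpace(part)
		if part == "" {
			return macroDecl{}, fmt.Errorf("empty member in %q", src)
		}
		d.Ops = append(d.Ops, part)
	}
	return d, nil
}

// expand concatenates the bodies of the named op fragments in declaration
// order. Cache-effect members (anything containing '/') carry no body.
func expand(d macroDecl, bodies map[string]string) (string, error) {
	var b strings.Builder
	for _, op := range d.Ops {
		if strings.Contains(op, "/") {
			continue
		}
		body, ok := bodies[op]
		if !ok {
			return "", fmt.Errorf("macro %s: unknown op %s", d.Name, op)
		}
		b.WriteString(body)
		b.WriteString("\n")
	}
	return b.String(), nil
}

func main() {
	d, err := parseMacro("macro(DEMO) = _CHECK + unused/1 + _DO;")
	if err != nil {
		panic(err)
	}
	out, err := expand(d, map[string]string{"_CHECK": "check();", "_DO": "do();"})
	if err != nil {
		panic(err)
	}
	fmt.Print(out)
}
```

The real expansion also has to thread stack effects from one fragment into the next; this sketch only shows the ordering.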

Body uses control macros: `DEOPT_IF`, `ERROR_IF`, `EXIT_IF`,
`DECREF_INPUTS`, `INPUTS_DEAD`, `GOTO_ERROR`, `JUMPBY`,
`STACK_GROW`, `STACK_SHRINK`. The generator translates these to
target-language equivalents.

Stack effects are declared in the signature: `(left, right -- res)`
means "pop two, push one". The generator computes `n_pushed` and
`n_popped` from the signature alone.
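
As an illustration of reading counts off the signature, the following is a minimal sketch. `stackEffect` and `parseSignature` are hypothetical names, and the sketch assumes plain comma-separated names only: it does not handle variadic slots, type annotations, or cache entries, which the real analysis must.

```go
package main

import (
	"fmt"
	"strings"
)

// stackEffect holds the named inputs and outputs of a signature such as
// "(left, right -- res)". Hypothetical type, for illustration only.
type stackEffect struct {
	Inputs  []string
	Outputs []string
}

func (s stackEffect) nPopped() int { return len(s.Inputs) }
func (s stackEffect) nPushed() int { return len(s.Outputs) }

// parseSignature splits a parenthesized signature on the "--" separator.
func parseSignature(sig string) (stackEffect, error) {
	sig = strings.TrimSpace(sig)
	if !strings.HasPrefix(sig, "(") || !strings.HasSuffix(sig, ")") {
		return stackEffect{}, fmt.Errorf("signature must be parenthesized: %q", sig)
	}
	in, out, ok := strings.Cut(sig[1:len(sig)-1], "--")
	if !ok {
		return stackEffect{}, fmt.Errorf("missing -- separator: %q", sig)
	}
	return stackEffect{Inputs: splitNames(in), Outputs: splitNames(out)}, nil
}

// splitNames returns the non-empty, trimmed, comma-separated names in s.
func splitNames(s string) []string {
	var names []string
	for _, f := range strings.Split(s, ",") {
		if f = strings.TrimSpace(f); f != "" {
			names = append(names, f)
		}
	}
	return names
}

func main() {
	eff, err := parseSignature("(left, right -- res)")
	if err != nil {
		panic(err)
	}
	fmt.Println(eff.nPopped(), eff.nPushed()) // prints "2 1"
}
```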

## Go translation strategy

The generated Go file has one switch arm per real instruction:

```go
// generated by tools/bytecodes_gen; DO NOT EDIT
// bytecodes-sha256: <hash of bytecodes.c at generation time>

func (e *evalState) dispatch(op opcode.Op, oparg uint32) (next int, err error) {
    switch op {
    // ...
    case opcode.BINARY_OP_ADD_INT:
        left := e.peek(2)
        right := e.peek(1)
        if !object.LongCheckExact(left) {
            return e.deoptHere()
        }
        if !object.LongCheckExact(right) {
            return e.deoptHere()
        }
        res, err := object.LongAdd(left.(*object.Long), right.(*object.Long))
        e.decrefInputs(2)
        if err != nil {
            return 0, err
        }
        e.replace(2, res)
        return e.advance(1), nil
    // ...
    }
}
```
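
The arm above leans on a few stack helpers. The sketch below shows one plausible reading of `peek`, `replace`, and `advance`; the semantics (depth counted from the top, `replace` popping `n` and pushing one, `advance` measured in code units) are assumptions inferred from the call sites, and this toy `evalState` is not the real eval-loop type.

```go
package main

import "fmt"

// evalState is a toy stand-in with just a value stack and a program counter.
type evalState struct {
	stack []any
	pc    int
}

// peek returns the value `depth` slots from the top; peek(1) is the top.
func (e *evalState) peek(depth int) any {
	return e.stack[len(e.stack)-depth]
}

// replace pops n values and pushes v in their place.
func (e *evalState) replace(n int, v any) {
	e.stack = append(e.stack[:len(e.stack)-n], v)
}

// advance moves the program counter forward by n and returns the new value.
func (e *evalState) advance(n int) int {
	e.pc += n
	return e.pc
}

func main() {
	e := &evalState{stack: []any{"x", 2, 3}}
	left, right := e.peek(2).(int), e.peek(1).(int)
	e.replace(2, left+right)
	next := e.advance(1)
	fmt.Println(e.stack, next) // prints "[x 5] 1"
}
```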

The control macros translate as follows:

| C macro                | Go target                                         |
|------------------------|---------------------------------------------------|
| `DEOPT_IF(cond)`       | `if cond { return e.deoptHere() }`                |
| `ERROR_IF(cond, lbl)`  | `if cond { return 0, e.error(lbl) }`              |
| `EXIT_IF(cond)`        | `if cond { return e.exitTrace() }` (Tier-2 only)  |
| `DECREF_INPUTS()`      | `e.decrefInputs(n_popped)`                        |
| `INPUTS_DEAD()`        | (no-op in refcount-only path)                     |
| `GOTO_ERROR(lbl)`      | `return 0, e.error(lbl)`                          |
| `JUMPBY(n)`            | `return e.advance(int(n)), nil`                   |
| `STACK_GROW(n)`        | `e.grow(n)`                                       |
| `STACK_SHRINK(n)`      | `e.shrink(n)`                                     |
| `INSTRUCTION_SIZE`     | constant, computed from oparg width plus inline cache |

The translator is opportunistic, mirroring the parser_gen action
translator (1642). Anything it cannot type lands as a panic-stub
arm so the generated file always compiles, and gets filled in as
the helper surface (`object/*`) gains the typed methods the
translator needs.
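
The opportunistic shape can be sketched as a per-statement translator that reports failure, plus a wrapper that swaps the whole body for a panic-stub on the first failure. `translateStmt` and `translateBody` are hypothetical names, only a few rows of the table are covered, and the condition text is passed through verbatim, whereas a real translator would also have to rewrite the C expression into Go.

```go
package main

import (
	"fmt"
	"strings"
)

// translateStmt maps one C statement to a line of Go source. ok is false
// when the statement is not recognised and the caller should fall back.
func translateStmt(stmt string, nPopped int) (goSrc string, ok bool) {
	stmt = strings.TrimSuffix(strings.TrimSpace(stmt), ";")
	name, args, found := strings.Cut(stmt, "(")
	if !found || !strings.HasSuffix(args, ")") {
		return "", false
	}
	args = strings.TrimSuffix(args, ")") // drop only the macro's own ')'
	switch name {
	case "DEOPT_IF":
		return fmt.Sprintf("if %s { return e.deoptHere() }", args), true
	case "DECREF_INPUTS":
		return fmt.Sprintf("e.decrefInputs(%d)", nPopped), true
	case "INPUTS_DEAD":
		return "", true // recognised, emits nothing
	case "JUMPBY":
		return fmt.Sprintf("return e.advance(int(%s)), nil", args), true
	case "STACK_GROW":
		return fmt.Sprintf("e.grow(%s)", args), true
	case "STACK_SHRINK":
		return fmt.Sprintf("e.shrink(%s)", args), true
	}
	return "", false
}

// translateBody translates every statement, or returns a single panic-stub
// line if any statement is not understood, so the output always compiles.
func translateBody(opName string, stmts []string, nPopped int) []string {
	var out []string
	for _, s := range stmts {
		line, ok := translateStmt(s, nPopped)
		if !ok {
			msg := "bytecodes_gen: untranslated body for " + opName
			return []string{fmt.Sprintf("panic(%q)", msg)}
		}
		if line != "" {
			out = append(out, line)
		}
	}
	return out
}

func main() {
	body := []string{"DEOPT_IF(!PyLong_CheckExact(left));", "DECREF_INPUTS();"}
	for _, l := range translateBody("DEMO", body, 2) {
		fmt.Println(l)
	}
	fmt.Println(translateBody("DEMO", []string{"x->y = 1;"}, 0)[0])
}
```

Stubbing the whole arm rather than a single statement keeps a half-translated body from compiling into something that silently does the wrong thing.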

## Generator pipeline

Eight milestones, mirroring 1642:

* **B1** DSL lexer and parser. Tokenize `bytecodes.c`, produce a
  typed AST of `inst` / `op` / `macro` / `family` / `pseudo`.
* **B2** Stack-effect analysis. Walk the signature, infer
  `n_popped`, `n_pushed`, named bindings.
* **B3** Per-instruction emitter. One switch arm per `inst`,
  oparg decode, stack push/pop, body translation.
* **B4** Macro expansion. Inline `op` fragments into their
  composing `macro` declaration before emitting.
* **B5** Specialization family wiring. Adaptive variants in the
  same family fall back to the base instruction in v0.6 (the
  specializer ships in v0.11).
* **B6** Action body translator. Same opportunistic shape as the
  parser_gen action translator: identifiers bound by the stack
  signature pass through, `_Py*` calls map to typed object helpers,
  and anything with member access or unknown identifiers falls back
  to a panic-stub arm.
* **B7** Metadata emitter. Stack effects, oparg widths, cache
  layout, instruction names lifted to `compile/opcodes_gen.go`
  for the assembler.
* **B8** Drift check. SHA256 of `bytecodes.c` recorded in the
  generated preamble; `bytecodes_gen -check-drift` fails CI when
  the recorded hash does not match the current source.
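
The drift check in B8 amounts to hashing the source and comparing against the marker in the generated preamble. A minimal sketch, operating on in-memory data for simplicity; the lowercase helper names and their signatures are assumptions and are not the actual `drift.go` API.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"errors"
	"fmt"
	"strings"
)

// markerPrefix matches the preamble line shown in the generated file.
const markerPrefix = "// bytecodes-sha256: "

// hashBytes returns the hex-encoded SHA256 of src.
func hashBytes(src []byte) string {
	sum := sha256.Sum256(src)
	return hex.EncodeToString(sum[:])
}

// markerLine builds the preamble line recording the hash of src.
func markerLine(src []byte) string {
	return markerPrefix + hashBytes(src)
}

// extractMarker finds the recorded hash in a generated file's text.
func extractMarker(generated string) (string, error) {
	for _, line := range strings.Split(generated, "\n") {
		if strings.HasPrefix(line, markerPrefix) {
			return strings.TrimSpace(strings.TrimPrefix(line, markerPrefix)), nil
		}
	}
	return "", errors.New("no bytecodes-sha256 marker found")
}

// checkDrift fails when the recorded hash does not match the current source.
func checkDrift(generated string, src []byte) error {
	recorded, err := extractMarker(generated)
	if err != nil {
		return err
	}
	if current := hashBytes(src); recorded != current {
		return fmt.Errorf("bytecodes.c drifted: recorded %s, current %s", recorded, current)
	}
	return nil
}

func main() {
	src := []byte("inst(NOP, (--)) {}\n")
	gen := "// generated by tools/bytecodes_gen; DO NOT EDIT\n" + markerLine(src) + "\n"
	fmt.Println(checkDrift(gen, src) == nil)                        // prints "true"
	fmt.Println(checkDrift(gen, []byte("inst(NOP, (--)) {}")) == nil) // prints "false"
}
```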

## File mapping

| C / DSL source                        | Go target                                  |
|---------------------------------------|--------------------------------------------|
| `Python/bytecodes.c`                  | (input)                                    |
| `Python/generated_cases.c.h`          | `vm/opcodes_gen.go` (generated)            |
| `Python/opcode_targets.h`             | `vm/opcode_targets_gen.go` (generated)     |
| `Include/internal/pycore_opcode_metadata.h` | `compile/opcodes_gen.go` (generated)  |
| `Tools/cases_generator/parsing.py`    | `tools/bytecodes_gen/dsl_parser.go`        |
| `Tools/cases_generator/analysis.py`   | `tools/bytecodes_gen/analyze.go`           |
| `Tools/cases_generator/tier1_generator.py` | `tools/bytecodes_gen/emit_tier1.go`   |
| `Tools/cases_generator/generators_common.py` | `tools/bytecodes_gen/emit_common.go` |
| `Tools/cases_generator/stack.py`      | `tools/bytecodes_gen/stack.go`             |

## Checklist

Status legend: `[x]` shipped, `[ ]` pending, `[~]` partial / scaffold,
`[n]` deferred / not in scope this phase.

### Files

* [x] `tools/bytecodes_gen/main.go`: CLI with `-emit-tier1`,
  `-emit-metadata`, `-check-drift` flags.
* [x] `tools/bytecodes_gen/dsl_tok.go`: tokenizer for the DSL
  subset (C tokens plus the `--` stack-effect separator).
* [x] `tools/bytecodes_gen/dsl_parser.go`: parser producing a
  typed AST of `Inst`, `Op`, `Macro`, `Family`, `Pseudo`.
* [x] `tools/bytecodes_gen/analyze.go`: stack-effect analysis,
  binding scope, macro expansion order.
* [x] `tools/bytecodes_gen/stack.go`: push/pop sequence builder.
  Folded into `analyze.go`, since the binding view and the push/pop
  sequence share the same walk; there is no separate `stack.go` file.
* [n] `tools/bytecodes_gen/emit_common.go`: collapsed into
  `emit_tier1.go` and `emit_metadata.go`. The two emitters don't
  share enough surface to warrant a third file in v0.6; revisit if
  the metadata emitter grows custom oparg shapes.
* [~] `tools/bytecodes_gen/emit_tier1.go`: Tier-1 switch-arm emitter.
  Skeleton only: each arm pops inputs into named locals and emits a
  panic-stub body until B6 fills it in.
* [x] `tools/bytecodes_gen/emit_metadata.go`: stack-effect /
  cache-size / has-oparg / family tables. Skips `op` fragments;
  variadic stack slots emit as count = -1 ("compute at runtime")
  to mirror CPython. Round-tripped in `emit_metadata_test.go`.
* [~] `tools/bytecodes_gen/action.go`: C body to Go expression
  translator; opportunistic, falls back to panic-stub. Today it
  understands the control-macro panel (`DEOPT_IF`, `ERROR_IF`,
  `EXIT_IF`, `DECREF_INPUTS`, `INPUTS_DEAD`, `STAT_INC` / `STAT_DEC`);
  `_Py*` helper calls and member-access shapes still bail to the
  panic-stub.
* [x] `tools/bytecodes_gen/drift.go`: SHA256 record / check
  (`HashFile`, `MarkerLine`, `ExtractMarker`, `CheckDrift`).
  Round-tripped in `drift_test.go`.

### Generator output panel

* [~] `vm/opcodes_gen.go`: switch dispatch over every Tier-1
  opcode in `bytecodes.c`. Adaptive variants reduce to their
  base case for v0.6. Generated end-to-end against cpython-314;
  arm bodies are panic-stubs pending B6 expansion of the action
  translator.
* [n] `vm/opcode_targets_gen.go`: opcode kind table. The Tier-1
  loop classifies via `compile.Opcode` directly; no separate
  targets table needed until the specializer in v0.11.
* [x] `compile/opcodes_gen.go`: opcode constants, mnemonic
  table, oparg widths. Generated and consumed by the v0.5
  assembler.

### Surface guarantees

* [x] Generator round-trips against the upstream
  `Python/bytecodes.c` for 3.14.0. Pinned by the SHA256 in the
  generated preamble (driven by `drift.go`).
* [x] Each `inst` body emits a switch arm with bound stack inputs,
  translated control macros, and either a typed action or a
  panic-stub fallback (B6 fills more arms as it grows).
* [x] Adaptive variants (`*_INT`, `*_STR`, `*_INSTANCE_VALUE`, ...)
  compile via the FamilyMap reduction to their base case for
  v0.6. The specializer (v0.11) is what makes the adaptive paths
  actually fire.
* [x] Metadata table matches CPython for opcode number, name,
  oparg width. Numeric values pinned by `compile/opcodes_gen.go`
  (generated against `_opcode_metadata.py`). Push / pop counts
  emitted as `MetadataEntry.Pushes` / `MetadataEntry.Pops` per
  instruction; round-tripped in `emit_metadata_test.go`.
* [x] Cache layout sizes (`CacheSize` field in `MetadataEntry`)
  emit per instruction, including macro-expanded specializable
  opcodes (`BINARY_OP`, `CALL`, `LOAD_ATTR`, ...). Pinned
  byte-for-byte against `Include/internal/pycore_code.h` cache
  structs by `tools/bytecodes_gen/cache_layout_test.go` (skips when
  the `CPYTHON` env var is unset).
* [x] Drift check: `bytecodes_gen -check-drift` fails when the
  recorded `bytecodes-sha256` does not match the current source.
  Pinned by `tools/bytecodes_gen/drift_test.go`.

### Action translator panel

* [ ] `_Py*_Check`, `_Py*_CheckExact` predicate calls. Bail to
  panic-stub today.
* [ ] `_Py*_Add`, `_Py*_Subtract`, ... numeric helpers. The
  hand-written panel in `vm/eval_simple.go` covers the v0.6
  arithmetic surface; the translator still bails on these.
* [x] `STAT_INC`, `STAT_DEC` translate to no-op. Pinned by the
  control-macro panel in `tools/bytecodes_gen/action.go`.
* [ ] `Py_INCREF`, `Py_DECREF`, `Py_NewRef`, `Py_XDECREF`
  translate to `e.incref` / `e.decref` / `e.newref`. Refcount
  ops are no-ops on the GIL build (Go's GC owns lifetime); they
  stay structural so the panel is readable against the C side.
* [ ] `Py_TYPE`, `Py_SIZE` direct field access.
* [ ] Member-access expressions (`obj->something`) bail to
  panic-stub. Fill in lazily as the typed object surface lands.

### Out of scope for v0.6

* `vm/executor_gen.go` (Tier-2 micro-op cases). Lands in v0.12.
* `optimizer/cases_gen.go` (Tier-2 abstract-interp cases). Lands
  in v0.12.
* `vm/uop_metadata_gen.go`. Lands in v0.12.

### Cross-references

* Eval loop that consumes the dispatch table: 1636.
* Frame layout the dispatch table reads: 1637.
* Tagged stack values: 1638.
* Assembler that consumes the metadata table: 1628.

