# 89. Making a CPython Patch

A CPython patch is a coordinated change to the interpreter, runtime, standard library, tests, documentation, and development metadata. Even a small fix may affect object lifetime, bytecode behavior, platform compatibility, import state, or public APIs.

Making a good CPython patch requires more than changing code. The patch must preserve runtime invariants, include regression tests, follow repository conventions, build correctly across platforms, and explain user-visible behavior clearly.

## 89.1 Understand the Existing Behavior First

Before changing code, reproduce and understand the current behavior.

For a bug:

```text id="rz2q5m"
reproduce the failure
reduce to a minimal case
identify expected behavior
find where the incorrect behavior begins
```

For a feature:

```text id="jlwmv0"
understand current semantics
read related implementation paths
inspect tests
inspect documentation
identify compatibility constraints
```

A patch written before you understand the surrounding subsystem often introduces secondary bugs.

## 89.2 Build CPython Locally

Always work from a local build.

Typical setup:

```bash id="xgxtx6"
git clone https://github.com/python/cpython.git
cd cpython

./configure --with-pydebug
make -j8
```

Run:

```bash id="krv8rw"
./python
```

Do not use the system Python while modifying CPython internals. On macOS the build produces `./python.exe` instead of `./python`.

The local interpreter ensures:

```text id="u1ljg8"
correct runtime
correct stdlib
correct extension modules
correct ABI
correct bytecode
```
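One way to confirm that the interpreter in use really is the local debug build is to ask it. The sketch below is illustrative only: `describe_interpreter` is a hypothetical helper name, and it relies on `sys.gettotalrefcount` being present only in builds with reference debugging, which `--with-pydebug` enables.

```python
import sys
import sysconfig


def describe_interpreter():
    """Return facts that help distinguish a local debug build from a system Python."""
    return {
        "executable": sys.executable,
        "version": sys.version.split()[0],
        # sys.gettotalrefcount only exists when reference debugging is compiled in,
        # which --with-pydebug enables.
        "debug_build": hasattr(sys, "gettotalrefcount"),
        # True when running directly from a source checkout rather than an install.
        "from_source_tree": sysconfig.is_python_build(),
    }


for key, value in describe_interpreter().items():
    print(f"{key}: {value}")
```

Run it with `./python` from the checkout. If `debug_build` or `from_source_tree` is `False`, the wrong interpreter is probably on the path.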

## 89.3 Create a Branch

Use a dedicated branch.

Example:

```bash id="n0xxo6"
git checkout -b fix-dict-resize
```

Good branch names are short and descriptive.

Examples:

```text id="wmik5g"
fix-gc-cycle-crash
optimize-vectorcall-path
improve-import-error-message
add-test-for-recursion-limit
```

Avoid generic names such as:

```text id="52k7jw"
patch
changes
work
update
```

## 89.4 Find the Relevant Code

CPython is large. Start by locating the relevant subsystem.

Useful directories:

| Directory | Purpose |
|---|---|
| `Objects/` | Built-in object implementations |
| `Python/` | Interpreter and compiler |
| `Parser/` | Parsing |
| `Modules/` | Built-in and extension modules |
| `Lib/` | Standard library |
| `Include/` | Public and internal headers |
| `Doc/` | Documentation |
| `Lib/test/` | Tests |

Examples:

| Problem | Likely area |
|---|---|
| Dict behavior | `Objects/dictobject.c` |
| List behavior | `Objects/listobject.c` |
| Bytecode execution | `Python/ceval.c`, `Python/bytecodes.c` (3.12 and later) |
| Compiler behavior | `Python/compile.c` |
| Import logic | `Python/import.c`, `Lib/importlib/` |
| GC behavior | `Python/gc.c` (3.13 and later), `Modules/gcmodule.c` |
| Unicode internals | `Objects/unicodeobject.c` |

Use:

```bash id="s3j6cx"
git grep keyword
```

Example:

```bash id="0tbx7r"
git grep PyObject_GC_Track
```
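When it is unclear whether a module lives in `Lib/` or is compiled C code, the running interpreter can help narrow the search. This is a minimal sketch; `locate_module` is a hypothetical helper, not part of CPython's tooling.

```python
import importlib
import sys


def locate_module(name):
    """Report whether a module is compiled into the interpreter or loaded from a file."""
    if name in sys.builtin_module_names:
        # Built-in modules have no file on disk; look for their C source in the tree.
        return "built-in"
    module = importlib.import_module(name)
    # Pure-Python modules report a .py path; extension modules report a shared library.
    return getattr(module, "__file__", None) or "no file"


for name in ("sys", "json"):
    print(name, "->", locate_module(name))
```

A `.py` path points at `Lib/`, while a built-in or shared-library result suggests searching `Modules/`, `Objects/`, or `Python/` with `git grep`.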

## 89.5 Read Existing Tests First

Before writing new code, inspect existing tests.

Example:

```bash id="5o1ft8"
grep -n "def test_" Lib/test/test_dict.py
```

Existing tests show:

```text id="gwtd4h"
expected behavior
edge cases
historical regressions
platform assumptions
testing conventions
```

Often the correct patch location becomes clearer after reading tests.

## 89.6 Write the Test Before the Fix

For regressions, write the failing test first.

Example workflow:

```text id="x2q90r"
reproduce bug
write minimal failing test
confirm failure
implement fix
confirm success
```

A regression test should fail before the patch and pass after it.

Example:

```python id="sjlwm6"
def test_resize_preserves_items(self):
    d = {}

    for i in range(1000):
        d[i] = i

    for i in range(1000):
        self.assertEqual(d[i], i)
```

Keep the test minimal. It should isolate the broken behavior.
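The test method above needs a surrounding `unittest.TestCase` to run. The following self-contained sketch shows one possible shape. The class name `DictResizeTests` and the helper `run_tests` are hypothetical; in the CPython tree the method would normally be added to an existing class in `Lib/test/test_dict.py`.

```python
import unittest


class DictResizeTests(unittest.TestCase):
    def test_resize_preserves_items(self):
        d = {}
        for i in range(1000):
            d[i] = i
        # Every key inserted across several internal resizes must still be present.
        self.assertEqual(len(d), 1000)
        for i in range(1000):
            self.assertEqual(d[i], i)

    def test_resize_preserves_insertion_order(self):
        d = {}
        for i in reversed(range(1000)):
            d[i] = None
        # Resizing must not disturb the insertion order that dicts guarantee.
        self.assertEqual(list(d), list(reversed(range(1000))))


def run_tests():
    """Run the test case and return the unittest result object."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(DictResizeTests)
    return unittest.TextTestRunner(verbosity=0).run(suite)
```

Calling `run_tests().wasSuccessful()` before the fix should return `False` for a genuine regression test and `True` afterwards.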

## 89.7 Make Small Changes First

Prefer the smallest correct change.

Bad approach:

```text id="n0z9f4"
rewrite large subsystem
refactor unrelated code
rename many symbols
change formatting everywhere
fix bug simultaneously
```

Better:

```text id="3aqehn"
small targeted fix
minimal supporting cleanup
focused regression test
```

Small patches are easier to:

```text id="jizgdr"
review
debug
backport
bisect
revert
reason about
```

Large unrelated cleanup should usually be separate.

## 89.8 Preserve Existing Invariants

CPython internals depend on many invariants.

Examples:

```text id="jlwmrz"
valid reference counts
correct GC tracking state
exception set on NULL return
borrowed references remain valid
matching alloc/free domains
stable frame state during execution
```

When modifying runtime code, explicitly ask:

```text id="xq0xmo"
Who owns this reference?
Can this object be collected here?
Can this API fail?
What happens on error cleanup?
Is this object tracked by GC?
Can another thread observe this state?
```

Many bugs in CPython's C code come from violating one of these implicit assumptions.
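The ownership question can be made concrete from Python with `sys.getrefcount`. The sketch below is only an illustration of the idea that a container holds a strong reference to each element; `refcount_delta_after_append` is a hypothetical name, and only the differences between counts are meaningful, not the absolute values.

```python
import sys


def refcount_delta_after_append():
    """Measure how a list's ownership of an object changes that object's refcount."""
    obj = object()
    container = []
    before = sys.getrefcount(obj)
    container.append(obj)   # the list now owns one strong reference
    after = sys.getrefcount(obj)
    container.clear()       # the list releases its reference
    released = sys.getrefcount(obj)
    return after - before, released - before


added, remaining = refcount_delta_after_append()
print("references added by append:", added)
print("references remaining after clear:", remaining)
```

In C the same accounting is manual: a function that stores a reference must take ownership of it, and every exit path must release whatever the function still owns.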

## 89.9 Rebuild Frequently

After changing C code:

```bash id="4zw6o0"
make -j8
```

Then run focused tests immediately.

Do not make many unrelated changes before rebuilding. Early failures are easier to diagnose.

## 89.10 Run Focused Tests First

After a small change:

```bash id="p4pk4d"
./python -m test -v test_dict
```

or:

```bash id="q3d2v4"
./python -m test -v test_gc
```

Use:

```bash id="bxq0cc"
--failfast
```

to stop on first failure (in the CPython test runner, `-x` means `--exclude`, not fail-fast):

```bash id="n2rjjx"
./python -m test -v --failfast test_gc
```

Fast iteration matters more than full-suite execution during early development.

## 89.11 Run Related Tests

After focused tests pass, run nearby tests.

Example:

```bash id="dfccql"
./python -m test -v test_dict test_set test_collections
```

Subsystem interactions matter.

A dict change may affect:

```text id="k2j3qt"
keyword arguments
class namespaces
globals
attribute dictionaries
import machinery
dataclasses
JSON behavior
```
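A short script makes some of these dependencies visible. It is a sketch for orientation only; the names `collect_kwargs`, `Namespace`, and `dict_backed_structures` are hypothetical.

```python
def collect_kwargs(**kwargs):
    # Keyword arguments arrive in a dict that preserves call order.
    return kwargs


class Namespace:
    first = 1
    second = 2


def dict_backed_structures():
    """Collect a few places where ordinary language features rely on dict."""
    instance = Namespace()
    instance.attr = "value"
    return {
        "kwargs_type": type(collect_kwargs(a=1, b=2)),
        "kwargs_order": list(collect_kwargs(z=1, a=2)),
        "globals_type": type(globals()),
        # Instance attributes are stored in a per-instance dict.
        "instance_dict": dict(vars(instance)),
        # Class bodies populate a namespace in definition order.
        "class_attr_order": [k for k in vars(Namespace) if not k.startswith("__")],
    }


for key, value in dict_backed_structures().items():
    print(f"{key}: {value}")
```

Each entry corresponds to a test module worth running after a dict change, in addition to `test_dict` itself.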

## 89.12 Run Reference Leak Tests

If the patch touches object lifetime or C code, run the affected tests with reference leak detection, which requires a debug build:

```bash id="e3f3jc"
./python -m test -R 3:3 test_name
```

Typical leak causes:

```text id="xv2bs1"
missing Py_DECREF
incorrect error cleanup
cached references
reference cycles
missing decref when the callee does not steal the reference
```

A patch that introduces leaks is incomplete.
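The idea behind repeated-run leak detection can be approximated in pure Python: run the code several times and check whether the number of live objects keeps growing. The sketch below is an assumption-laden stand-in, not how `-R` is implemented. It counts only GC-tracked objects via `gc.get_objects()`, and `leaky`, `clean`, `_cache`, and `tracked_object_growth` are hypothetical names.

```python
import gc

_cache = []  # hypothetical global that accidentally keeps objects alive


def leaky():
    _cache.append([])  # each call leaves one more list reachable


def clean():
    temporary = []
    temporary.append(1)  # released when the function returns


def tracked_object_growth(func, repetitions=5):
    """Return how many GC-tracked objects remain after repeated calls to func."""
    func()          # warm-up call so one-time caches do not count as leaks
    gc.collect()
    before = len(gc.get_objects())
    for _ in range(repetitions):
        func()
    gc.collect()
    return len(gc.get_objects()) - before


print("leaky growth:", tracked_object_growth(leaky))
print("clean growth:", tracked_object_growth(clean))
```

Growth that scales with the number of repetitions is the signal to look for; a small constant offset is usually noise.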

## 89.13 Use a Debug Build

Always test changes to internals under a debug build:

```bash id="tifjsh"
./configure --with-pydebug
```

Debug builds expose:

```text id="4v9pgo"
assertion failures
GC inconsistencies
negative refcounts
allocator misuse
invalid object state
```

Release builds may hide these problems temporarily.

## 89.14 Use Sanitizers for Memory Bugs

For suspicious memory behavior:

```bash id="zj9v7n"
./configure --with-pydebug \
  CFLAGS="-O1 -g -fsanitize=address,undefined" \
  LDFLAGS="-fsanitize=address,undefined"
```

Run:

```bash id="r8zzvl"
ASAN_OPTIONS=abort_on_error=1:symbolize=1 \
./python -m test -v test_name
```

Sanitizers catch:

```text id="gafgll"
use-after-free
buffer overflow
invalid memory access
undefined behavior
```

## 89.15 Add Documentation Changes

User-visible behavior changes require documentation updates.

Examples:

| Change | Documentation |
|---|---|
| New stdlib behavior | `Doc/library/` |
| New syntax | `Doc/reference/` |
| New C API | `Doc/c-api/` |
| Changed CLI flag | `Doc/using/cmdline.rst` |
| Important feature | `Doc/whatsnew/` |

Build documentation locally:

```bash id="hyzjyd"
make -C Doc html
make -C Doc check
```

Documentation is part of the patch, not a later cleanup step.

## 89.16 Add a News Entry

Most user-visible changes need a news entry.

The `blurb` tool creates the entry file under `Misc/NEWS.d/next/`:

```bash id="wx2u7q"
blurb add
```

Good entry:

```text id="lvhfcf"
Fix ``dict.update()`` incorrectly overwriting values when the source mapping mutates during iteration.
```

Weak entry:

```text id="ay5f7m"
Fix bug in dict.
```

The entry should describe the visible effect.

## 89.17 Keep Style Consistent

Follow surrounding style.

Examples:

```text id="fy7y4v"
indentation
brace placement
error handling patterns
goto cleanup conventions
naming
comment style
macro usage
```

Do not rewrite style in unrelated code.

CPython code favors consistency over personal preference.

## 89.18 Error Cleanup Patterns

Many CPython C functions use structured cleanup.

Example:

```c id="o22l8t"
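PyObject *x = NULL;
PyObject *y = NULL;

x = make_x();
if (x == NULL) {
    goto error;
}

y = make_y();
if (y == NULL) {
    goto error;
}

/* Success: release the temporary and hand y to the caller. */
Py_DECREF(x);
return y;

error:
/* Py_XDECREF tolerates NULL, so this is safe at any failure point. */
Py_XDECREF(x);
Py_XDECREF(y);
return NULL;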
```

This pattern centralizes cleanup and reduces leak risk.

Avoid duplicated cleanup logic spread across many returns.

## 89.19 Do Not Ignore Failure Paths

Nearly every allocation and most C API calls that return an object can fail.

Examples:

```c id="72otjc"
PyLong_FromLong
PyUnicode_FromString
PyObject_Call
PyList_New
PyDict_New
PyTuple_New
```

Always check:

```c id="phdsvf"
if (obj == NULL) {
    return NULL;
}
```

A patch that handles only the success path is incomplete.
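Failure paths deserve tests of their own. One common technique is to inject a failure from Python, for example with a key whose `__hash__` raises, and then check both that the error propagates and that the container is left unchanged. The classes and the `run_failure_path_tests` helper below are hypothetical examples of that technique.

```python
import unittest


class FailingHash:
    """Hypothetical key whose __hash__ always fails."""

    def __hash__(self):
        raise RuntimeError("hash failed")


class DictFailurePathTests(unittest.TestCase):
    def test_hash_error_propagates_on_insert(self):
        d = {}
        with self.assertRaises(RuntimeError):
            d[FailingHash()] = 1
        # The failed insert must leave the dict unchanged.
        self.assertEqual(d, {})

    def test_hash_error_propagates_on_lookup(self):
        d = {"a": 1}
        with self.assertRaises(RuntimeError):
            FailingHash() in d
        self.assertEqual(d, {"a": 1})


def run_failure_path_tests():
    """Run the failure-path tests and return the unittest result object."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(DictFailurePathTests)
    return unittest.TextTestRunner(verbosity=0).run(suite)
```

Running such tests with `-R` as well helps show that the error path releases every reference it acquired.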

## 89.20 Commit Messages

A good commit message is concise and descriptive.

Good:

```text id="rz9vyy"
Fix reference leak in dict merge error path
```

Weak:

```text id="80z2hq"
fix stuff
```

The commit message should describe the semantic change, not the editing activity. CPython pull request titles are also prefixed with the issue number in the form `gh-NNNNN:`.

## 89.21 Run Broader Validation Before Submission

Before opening a pull request:

```bash id="jlt5ub"
./python -m test -j0
```

If the patch touches sensitive runtime paths:

```text id="b8yktk"
imports
GC
frames
dicts
compiler
interpreter loop
memory allocators
threading
```

run the full suite under a debug build and repeat the leak checks with `-R` on the affected test modules.

A patch that passes only one focused test may still break unrelated behavior.

## 89.22 Read the Diff Carefully

Before submission:

```bash id="wjlwmn"
git diff
```

Check for:

```text id="d5kxb8"
debug prints
temporary instrumentation
commented-out code
accidental whitespace changes
unrelated formatting
forgotten test edits
generated files
```

A clean diff is easier to review.
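Part of this checklist can be automated with a small script that scans the added lines of a diff. The patterns below are illustrative assumptions rather than an official list, and `scan_added_lines` is a hypothetical helper.

```python
import re

# Illustrative patterns only; adjust them to the code being changed.
SUSPICIOUS_PATTERNS = {
    "debug print": re.compile(r"\b(printf|fprintf|print)\s*\("),
    "trailing whitespace": re.compile(r"[ \t]+$"),
    "leftover marker": re.compile(r"\b(FIXME|TODO)\b"),
}


def scan_added_lines(diff_text):
    """Return (diff_line_number, label, text) for suspicious added lines."""
    findings = []
    for number, line in enumerate(diff_text.splitlines(), start=1):
        # Added lines start with '+'. Lines starting with '+++' are treated as
        # file headers, a simplification that also skips added text starting with '++'.
        if not line.startswith("+") or line.startswith("+++"):
            continue
        added = line[1:]
        for label, pattern in SUSPICIOUS_PATTERNS.items():
            if pattern.search(added):
                findings.append((number, label, added))
    return findings
```

The function takes the diff as a string, so it can be fed the output of `git diff` captured with `subprocess.run(["git", "diff"], capture_output=True, text=True).stdout`. Some matches are legitimate (a real `fprintf` to `stderr`, for example), so treat the findings as prompts for a manual look.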

## 89.23 Open the Pull Request

A pull request should explain:

```text id="ak8d3u"
what the problem is
why the current behavior is wrong
what the patch changes
how it was tested
whether compatibility changes exist
```

Good PR descriptions reduce reviewer guesswork.

For regressions, include a minimal reproducer.

For performance patches, include benchmarks.

For semantic changes, include rationale.
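For the benchmark part of a performance PR, a micro-benchmark built on the standard `timeit` module is a reasonable starting point. The sketch below is a minimal example with hypothetical helper names (`benchmark`, `compare_dict_lookup`); for more rigorous measurements the `pyperf` package is commonly used.

```python
import timeit


def benchmark(stmt, setup="pass", repeat=5, number=100_000):
    """Return the best per-call time in nanoseconds over several repeats."""
    timings = timeit.repeat(stmt, setup=setup, repeat=repeat, number=number)
    # The minimum is the least noisy estimate; larger values reflect interference.
    return min(timings) / number * 1e9


def compare_dict_lookup():
    """Time two ways of reading a dict entry as an example comparison."""
    setup = "d = {i: i for i in range(1000)}"
    return {
        "subscript": benchmark("d[500]", setup),
        "get": benchmark("d.get(500)", setup),
    }


for name, nanoseconds in compare_dict_lookup().items():
    print(f"{name}: {nanoseconds:.1f} ns per call")
```

Run the same script with the interpreter built before and after the patch, using an optimized build rather than a debug build, and report both sets of numbers in the PR.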

## 89.24 Responding to Review

Code review is part of development, not a separate obstacle.

Common review requests:

```text id="r6bh4k"
add regression test
simplify logic
handle failure path
improve comments
clarify ownership
update documentation
rename variable
reduce scope
```

Respond technically and precisely.

Good response:

```text id="xjlwm5"
This path can fail because PyObject_Call may trigger arbitrary Python code. I added cleanup for x before returning NULL.
```

Weak response:

```text id="lbr6tm"
I think it should work now.
```

## 89.25 Backports

Bug fixes may need backports to maintenance branches.

Typical flow:

```text id="v3ktfa"
merge into main
backport to supported branches if appropriate
```

Compatibility matters during backporting.

A patch safe for the development branch may be too risky for a maintenance release.

## 89.26 Common Patch Mistakes

| Mistake | Better approach |
|---|---|
| Large unrelated refactor | Small focused patch |
| No regression test | Add minimal reproducer test |
| Ignoring error cleanup | Audit all exits |
| Missing docs | Update docs with behavior |
| Style rewrite in unrelated code | Preserve local style |
| Only testing success path | Test failures too |
| Assuming allocations succeed | Check all API returns |
| Using system Python accidentally | Use local build |
| No leak testing | Run `-R` for runtime changes |

## 89.27 Example Patch Workflow

Example end-to-end workflow:

```text id="ygnol0"
1. Reproduce bug.
2. Reduce to minimal script.
3. Locate implementation.
4. Read existing tests.
5. Add failing regression test.
6. Build debug CPython.
7. Confirm failure.
8. Implement minimal fix.
9. Rebuild.
10. Run focused tests.
11. Run leak tests.
12. Run related tests.
13. Update docs if needed.
14. Add news entry.
15. Read final diff.
16. Open PR.
17. Respond to review.
```

This workflow scales from small fixes to major runtime work.

## 89.28 Core Principle

A CPython patch is a change to a living runtime system.

The code, tests, documentation, memory invariants, and public contracts evolve together. A correct patch fixes the problem, preserves surrounding invariants, explains the behavior clearly, and leaves the interpreter easier to trust than before.
