89. Making a CPython Patch

GitHub workflow: forking, branching, opening a PR, responding to review, and the CLA requirement.

A CPython patch is a coordinated change to the interpreter, runtime, standard library, tests, documentation, and development metadata. Even a small fix may affect object lifetime, bytecode behavior, platform compatibility, import state, or public APIs.

Making a good CPython patch requires more than changing code. The patch must preserve runtime invariants, include regression tests, follow repository conventions, build correctly across platforms, and explain user-visible behavior clearly.

89.1 Understand the Existing Behavior First

Before changing code, reproduce and understand the current behavior.

For a bug:

reproduce the failure
reduce to a minimal case
identify expected behavior
find where the incorrect behavior begins

For a feature:

understand current semantics
read related implementation paths
inspect tests
inspect documentation
identify compatibility constraints

A patch written before you understand the surrounding subsystem often introduces secondary bugs.

89.2 Build CPython Locally

Always work from a local build.

Typical setup:

git clone https://github.com/python/cpython.git
cd cpython

./configure --with-pydebug
make -j8

Run the freshly built interpreter (on macOS the binary is named ./python.exe):

./python

Do not use the system Python while modifying CPython internals.

The local interpreter ensures:

correct runtime
correct stdlib
correct extension modules
correct ABI
correct bytecode
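
To confirm that a script is really running under the local build, a minimal sketch like the following can help. The function name describe_interpreter is hypothetical; sys.gettotalrefcount and the Py_DEBUG configuration variable are only present or set in debug builds.

```python
import sys
import sysconfig


def describe_interpreter():
    """Report which interpreter is running and whether it is a debug build."""
    return {
        "executable": sys.executable,
        "version": sys.version.split()[0],
        # sys.gettotalrefcount exists only in debug (--with-pydebug) builds.
        "debug_build": hasattr(sys, "gettotalrefcount"),
        # Py_DEBUG is the build-time variable behind that flag; it may be
        # missing on some platforms, so it is coerced to a bool here.
        "py_debug_config": bool(sysconfig.get_config_var("Py_DEBUG")),
    }


if __name__ == "__main__":
    for key, value in describe_interpreter().items():
        print(f"{key}: {value}")
```

If the executable path points outside the checkout, or debug_build is False after configuring with --with-pydebug, the wrong interpreter is probably being used.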

89.3 Create a Branch

Use a dedicated branch.

Example:

git checkout -b fix-dict-resize

Good branch names are short and descriptive.

Examples:

fix-gc-cycle-crash
optimize-vectorcall-path
improve-import-error-message
add-test-for-recursion-limit

Avoid generic names such as:

patch
changes
work
update

89.4 Find the Relevant Code

CPython is large. Start by locating the relevant subsystem.

Useful directories:

| Directory | Purpose |
| --- | --- |
| Objects/ | Built-in object implementations |
| Python/ | Interpreter and compiler |
| Parser/ | Parsing |
| Modules/ | Built-in and extension modules |
| Lib/ | Standard library |
| Include/ | Public and internal headers |
| Doc/ | Documentation |
| Lib/test/ | Tests |

Examples:

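| Problem | Likely area |
| --- | --- |
| Dict behavior | Objects/dictobject.c |
| List behavior | Objects/listobject.c |
| Bytecode execution | Python/ceval.c (instruction definitions are in Python/bytecodes.c in recent versions) |
| Compiler behavior | Python/compile.c |
| Import logic | Python/import.c, Lib/importlib/ |
| GC behavior | Modules/gcmodule.c (the collector itself is in Python/gc.c in recent versions) |
| Unicode internals | Objects/unicodeobject.c |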

Use:

git grep keyword

Example:

git grep PyObject_GC_Track

89.5 Read Existing Tests First

Before writing new code, inspect existing tests.

Example:

grep -n "def test_" Lib/test/test_dict.py

Existing tests show:

expected behavior
edge cases
historical regressions
platform assumptions
testing conventions

Often the correct patch location becomes clearer after reading tests.

89.6 Write the Test Before the Fix

For regressions, write the failing test first.

Example workflow:

reproduce bug
write minimal failing test
confirm failure
implement fix
confirm success

A regression test should fail before the patch and pass after it.

Example:

def test_resize_preserves_items(self):
    d = {}

    for i in range(1000):
        d[i] = i

    for i in range(1000):
        self.assertEqual(d[i], i)

Keep the test minimal. It should isolate the broken behavior.
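
The "fails before, passes after" property can be checked mechanically. The sketch below is an illustration only: chunk_buggy and chunk_fixed are hypothetical stand-ins for the code before and after a patch, and the same regression test is run against both.

```python
import unittest


def chunk_buggy(seq, size):
    """Hypothetical pre-patch helper: silently drops the final partial chunk."""
    return [seq[i:i + size] for i in range(0, len(seq) - size + 1, size)]


def chunk_fixed(seq, size):
    """Hypothetical post-patch helper: keeps the final partial chunk."""
    return [seq[i:i + size] for i in range(0, len(seq), size)]


def make_case(chunk):
    """Build the regression test bound to one implementation."""
    class ChunkRegressionTest(unittest.TestCase):
        def test_keeps_final_partial_chunk(self):
            self.assertEqual(chunk([1, 2, 3, 4, 5], 2), [[1, 2], [3, 4], [5]])
    return ChunkRegressionTest


def passes(chunk):
    """Run the regression test and report whether it succeeded."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(make_case(chunk))
    result = unittest.TestResult()
    suite.run(result)
    return result.wasSuccessful()


if __name__ == "__main__":
    print("before patch:", passes(chunk_buggy))
    print("after patch: ", passes(chunk_fixed))
```

A test that also passes against the unpatched code does not pin down the bug and needs to be tightened.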

89.7 Make Small Changes First

Prefer the smallest correct change.

Bad approach:

rewrite large subsystem
refactor unrelated code
rename many symbols
change formatting everywhere
fix the bug in the same commit

Better:

small targeted fix
minimal supporting cleanup
focused regression test

Small patches are easier to:

review
debug
backport
bisect
revert
reason about

Large unrelated cleanup should usually be separate.

89.8 Preserve Existing Invariants

CPython internals depend on many invariants.

Examples:

valid reference counts
correct GC tracking state
exception set on NULL return
borrowed references remain valid
matching alloc/free domains
stable frame state during execution

When modifying runtime code, explicitly ask:

Who owns this reference?
Can this object be collected here?
Can this API fail?
What happens on error cleanup?
Is this object tracked by GC?
Can another thread observe this state?

Many CPython runtime bugs come from violating these implicit assumptions.
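
Two of these questions ("Is this object tracked by GC?" and "Can this object be collected here?") can be explored from Python before reading the C code. The following is a small sketch that relies on CPython's reference-counting behavior; Node and lifetime_demo are hypothetical names.

```python
import gc
import weakref


class Node:
    """Tiny container used to build a reference cycle."""
    def __init__(self):
        self.other = None


def lifetime_demo():
    results = {
        # Atomic objects such as ints are not tracked; containers such as lists are.
        "int_tracked": gc.is_tracked(42),
        "list_tracked": gc.is_tracked([]),
    }

    # Without a cycle, CPython frees the object when the last reference goes away.
    obj = Node()
    ref = weakref.ref(obj)
    del obj
    results["freed_by_refcount"] = ref() is None

    # With a cycle, the objects survive until the cyclic collector runs.
    was_enabled = gc.isenabled()
    gc.disable()  # keep an automatic collection from interfering with the demo
    try:
        a, b = Node(), Node()
        a.other, b.other = b, a
        cycle_ref = weakref.ref(a)
        del a, b
        results["cycle_alive_before_collect"] = cycle_ref() is not None
        gc.collect()
        results["cycle_freed_by_gc"] = cycle_ref() is None
    finally:
        if was_enabled:
            gc.enable()
    return results


if __name__ == "__main__":
    for name, value in lifetime_demo().items():
        print(f"{name}: {value}")
```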

89.9 Rebuild Frequently

After changing C code:

make -j8

Then run focused tests immediately.

Do not make many unrelated changes before rebuilding. Early failures are easier to diagnose.

89.10 Run Focused Tests First

After a small change:

./python -m test -v test_dict

or:

./python -m test -v test_gc

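Use -G (--failfast) to stop on the first failure:

./python -m test -v -G test_gc

In the regression test runner, -x means --exclude (run everything except the named tests), so it is not a fail-fast switch.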

Fast iteration matters more than full-suite execution during early development.

89.11 Run Related Tests

After focused tests pass, run nearby tests.

Example:

./python -m test -v test_dict test_set test_collections

Subsystem interactions matter.

A dict change may affect:

keyword arguments
class namespaces
globals
attribute dictionaries
import machinery
dataclasses
JSON behavior
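
The reach of a dict change is easy to see from Python itself. This short sketch (the helper names are hypothetical) checks that several of the structures listed above are plain dict instances.

```python
import json


def kwargs_object(**kwargs):
    """Return the object that receives keyword arguments."""
    return kwargs


class Example:
    pass


def dict_backed_structures():
    """Map a description to whether the underlying object is a dict."""
    instance = Example()
    instance.x = 1
    candidates = {
        "keyword arguments": kwargs_object(a=1),
        "class body namespace": type.__prepare__("Example", ()),
        "module globals": globals(),
        "instance attributes": vars(instance),
        "JSON objects": json.loads('{"a": 1}'),
    }
    return {name: isinstance(obj, dict) for name, obj in candidates.items()}


if __name__ == "__main__":
    for name, is_dict in dict_backed_structures().items():
        print(f"{name}: {is_dict}")
```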

89.12 Run Reference Leak Tests

If the patch touches object lifetime or C code, run the affected tests in reference-leak mode (this requires a debug build):

./python -m test -R 3:3 test_name

Typical leak causes:

missing Py_DECREF
incorrect error cleanup
cached references
reference cycles
forgotten decref after ownership transfer

A patch that introduces leaks is incomplete.
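
The idea behind leak hunting (run the same code repeatedly and watch for steady growth) can be illustrated in pure Python. This is only a rough analogy that counts GC-tracked objects, not the mechanism the test runner uses; leaky, clean, and tracked_growth are hypothetical names.

```python
import gc

_leaked = []  # module-level list standing in for an accidental cache


def leaky():
    """Keep one new list alive on every call (a simulated leak)."""
    _leaked.append([])


def clean():
    """Allocate temporaries that are all released before returning."""
    return len([[] for _ in range(10)])


def tracked_growth(func, warmups=3, runs=3):
    """Return the change in GC-tracked object count for each measured run."""
    for _ in range(warmups):
        func()  # let caches and lazy initialization settle first
    deltas = []
    for _ in range(runs):
        gc.collect()
        before = len(gc.get_objects())
        func()
        gc.collect()
        deltas.append(len(gc.get_objects()) - before)
    return deltas


if __name__ == "__main__":
    print("clean:", tracked_growth(clean))
    print("leaky:", tracked_growth(leaky))
```

Growth that repeats on every run after the warm-up calls is the signal to look for; a one-time increase is usually just initialization.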

89.13 Use a Debug Build

Always test changes to interpreter internals under a debug build:

./configure --with-pydebug

Debug builds expose:

assertion failures
GC inconsistencies
negative refcounts
allocator misuse
invalid object state

Release builds may hide these problems temporarily.

89.14 Use Sanitizers for Memory Bugs

For suspicious memory behavior:

./configure --with-pydebug \
  CFLAGS="-O1 -g -fsanitize=address,undefined" \
  LDFLAGS="-fsanitize=address,undefined"

Run:

ASAN_OPTIONS=abort_on_error=1:symbolize=1 \
./python -m test -v test_name

Sanitizers catch:

use-after-free
buffer overflow
invalid memory access
undefined behavior

89.15 Add Documentation Changes

User-visible behavior changes require documentation updates.

Examples:

| Change | Documentation |
| --- | --- |
| New stdlib behavior | Doc/library/ |
| New syntax | Doc/reference/ |
| New C API | Doc/c-api/ |
| Changed CLI flag | Doc/using/cmdline.rst |
| Important feature | Doc/whatsnew/ |

Build documentation locally:

make -C Doc html
make -C Doc check

Documentation is part of the patch, not a later cleanup step.

89.16 Add a News Entry

Most user-visible changes need a news entry.

Typical command:

blurb add

Good entry:

Fix ``dict.update()`` incorrectly overwriting values when the source mapping mutates during iteration.

Weak entry:

Fix bug in dict.

The entry should describe the visible effect. blurb writes it to a file under Misc/NEWS.d/next/, which is committed together with the patch.

89.17 Keep Style Consistent

Follow surrounding style.

Examples:

indentation
brace placement
error handling patterns
goto cleanup conventions
naming
comment style
macro usage

Do not rewrite style in unrelated code.

CPython code favors consistency over personal preference.

89.18 Error Cleanup Patterns

Many CPython C functions use structured cleanup.

Example:

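PyObject *x = NULL;
PyObject *y = NULL;

x = make_x();
if (x == NULL) {
    goto error;
}

y = make_y();
if (y == NULL) {
    goto error;
}

/* x is only a temporary here, so release it on the success path too. */
Py_DECREF(x);
return y;

error:
/* Py_XDECREF is NULL-safe, so one label covers every failure point. */
Py_XDECREF(x);
Py_XDECREF(y);
return NULL;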

This pattern centralizes cleanup and reduces leak risk.

Avoid duplicated cleanup logic spread across many returns.

89.19 Do Not Ignore Failure Paths

Almost every allocation, and most C API calls that return an object, can fail.

Examples:

PyLong_FromLong
PyUnicode_FromString
PyObject_Call
PyList_New
PyDict_New
PyTuple_New

Always check:

if (obj == NULL) {
    return NULL;
}

A patch that handles only the success path is incomplete.
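
The same discipline applies on the test side: a failure path that is never exercised is never verified. The sketch below is an assumed example, not an existing CPython test. It checks both the success path of dict.update() and a source that raises partway through iteration.

```python
import unittest


def failing_pairs():
    """Yield one valid pair, then fail the way a broken source might."""
    yield ("b", 1)
    raise ValueError("source failed mid-iteration")


class UpdateFailurePathTest(unittest.TestCase):
    def test_success_path(self):
        d = {"a": 0}
        d.update([("b", 1)])
        self.assertEqual(d, {"a": 0, "b": 1})

    def test_error_propagates_and_dict_stays_usable(self):
        d = {"a": 0}
        with self.assertRaises(ValueError):
            d.update(failing_pairs())
        # Existing entries must be intact and the dict must still accept writes.
        self.assertEqual(d["a"], 0)
        d["c"] = 2
        self.assertEqual(d["c"], 2)


def run_failure_path_tests():
    """Run the test case and return the unittest result object."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(UpdateFailurePathTest)
    result = unittest.TestResult()
    suite.run(result)
    return result


if __name__ == "__main__":
    unittest.main()
```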

89.20 Commit Messages

A good commit message is concise and descriptive.

Good:

Fix reference leak in dict merge error path

Weak:

fix stuff

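The commit message should describe the semantic change, not the editing activity. CPython pull request titles are also prefixed with the related issue number in the form gh-NNNNN:, where NNNNN stands for the issue number.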

89.21 Run Broader Validation Before Submission

Before opening a pull request:

./python -m test -j0

If the patch touches sensitive runtime paths:

imports
GC
frames
dicts
compiler
interpreter loop
memory allocators
threading

run broader validation than usual.

A patch that passes only one focused test may still break unrelated behavior.

89.22 Read the Diff Carefully

Before submission:

git diff

Check for:

debug prints
temporary instrumentation
commented-out code
accidental whitespace changes
unrelated formatting
forgotten test edits
generated files

A clean diff is easier to review.
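
Part of this checklist can be automated. The following is a heuristic sketch, assuming unified diff text as produced by git diff; the CHECKS patterns and the scan_added_lines name are illustrative choices and will produce false positives (for example, a legitimate printf call).

```python
import re

# Heuristic patterns for leftovers that commonly slip into a patch.
CHECKS = {
    "debug print": re.compile(r"\b(?:printf|fprintf|print)\s*\("),
    "breakpoint": re.compile(r"\bbreakpoint\s*\("),
    "trailing whitespace": re.compile(r"[ \t]+$"),
}


def scan_added_lines(diff_text):
    """Return (diff line number, label, text) for suspicious added lines."""
    findings = []
    for number, line in enumerate(diff_text.splitlines(), start=1):
        # Only added lines matter; "+++" marks the file header, not content.
        if not line.startswith("+") or line.startswith("+++"):
            continue
        added = line[1:]
        for label, pattern in CHECKS.items():
            if pattern.search(added):
                findings.append((number, label, added))
    return findings


SAMPLE_DIFF = "\n".join([
    "--- a/Objects/dictobject.c",
    "+++ b/Objects/dictobject.c",
    "@@ -1,3 +1,5 @@",
    " static int",
    '+    printf("resize called");',
    "+    Py_DECREF(tmp);   ",
    "-    old_line();",
])

if __name__ == "__main__":
    for number, label, text in scan_added_lines(SAMPLE_DIFF):
        print(f"diff line {number}: {label}: {text!r}")
```

A script like this supplements reading the diff; it does not replace it.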

89.23 Open the Pull Request

A pull request should explain:

what the problem is
why the current behavior is wrong
what the patch changes
how it was tested
whether compatibility changes exist

Good PR descriptions reduce reviewer guesswork.

For regressions, include a minimal reproducer.

For performance patches, include benchmarks.

For semantic changes, include rationale.
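
For a performance patch, even a simple, repeatable measurement is better than an impression. The sketch below uses the standard timeit module; best_time is a hypothetical helper, and the statements being timed are placeholders. More rigorous measurements are commonly made with the separate pyperf package.

```python
import timeit


def best_time(stmt, setup="pass", repeat=5, number=10_000):
    """Return the best per-call time in seconds over several repeats.

    The minimum is used because slower runs mostly reflect system noise.
    """
    timings = timeit.repeat(stmt, setup=setup, repeat=repeat, number=number)
    return min(timings) / number


if __name__ == "__main__":
    build = best_time("{i: i for i in range(1000)}", number=1_000)
    lookup = best_time("d[500]", setup="d = {i: i for i in range(1000)}")
    print(f"build 1000-item dict: {build * 1e6:.2f} us")
    print(f"single lookup:        {lookup * 1e9:.1f} ns")
```

Run the same script with the interpreter built before and after the patch, on the same machine and build configuration, and report both numbers in the PR. A debug build is not representative for timing.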

89.24 Responding to Review

Code review is part of development, not a separate obstacle.

Common review requests:

add regression test
simplify logic
handle failure path
improve comments
clarify ownership
update documentation
rename variable
reduce scope

Respond technically and precisely.

Good response:

This path can fail because PyObject_Call may trigger arbitrary Python code. I added cleanup for x before returning NULL.

Weak response:

I think it should work now.

89.25 Backports

Bug fixes may need backports to maintenance branches.

Typical flow:

merge into main
backport to supported branches if appropriate

Compatibility matters during backporting.

A patch safe for the development branch may be too risky for a maintenance release.

89.26 Common Patch Mistakes

| Mistake | Better approach |
| --- | --- |
| Large unrelated refactor | Small focused patch |
| No regression test | Add minimal reproducer test |
| Ignoring error cleanup | Audit all exits |
| Missing docs | Update docs with behavior |
| Style rewrite in unrelated code | Preserve local style |
| Only testing success path | Test failures too |
| Assuming allocations succeed | Check all API returns |
| Using system Python accidentally | Use local build |
| No leak testing | Run -R for runtime changes |

89.27 Example Patch Workflow

Example end-to-end workflow:

1. Reproduce bug.
2. Reduce to minimal script.
3. Locate implementation.
4. Read existing tests.
5. Add failing regression test.
6. Build debug CPython.
7. Confirm failure.
8. Implement minimal fix.
9. Rebuild.
10. Run focused tests.
11. Run leak tests.
12. Run related tests.
13. Update docs if needed.
14. Add news entry.
15. Read final diff.
16. Open PR.
17. Respond to review.

This workflow scales from small fixes to major runtime work.

89.28 Core Principle

A CPython patch is a change to a living runtime system.

The code, tests, documentation, memory invariants, and public contracts evolve together. A correct patch fixes the problem, preserves surrounding invariants, explains the behavior clearly, and leaves the interpreter easier to trust than before.