
3. Repository Layout

Tour of the CPython source tree: Modules/, Objects/, Python/, Parser/, Include/, and the build system.

The CPython repository is organized around the major subsystems of the interpreter: object implementations, runtime machinery, compiler pipeline, parser, built-in modules, standard library, tests, documentation, and platform build files.

A good first pass is to treat the source tree as a map of responsibilities.

cpython/
    Include/
    Objects/
    Python/
    Parser/
    Modules/
    Lib/
    Programs/
    Tools/
    Doc/
    Grammar/
    PC/
    PCbuild/
    Mac/

Each directory has a different role. Some contain core runtime code. Some contain generated code. Some contain test fixtures. Some exist mainly for platform-specific builds.

3.1 Top-Level Structure

Directory   Main role
Include/    Public, internal, and private C headers
Objects/    Implementations of core object types
Python/     Runtime, compiler, interpreter loop, initialization
Parser/     Tokenizer and parser support code
Grammar/    Grammar input files
Modules/    Built-in and extension modules written in C
Lib/        Python standard library
Lib/test/   CPython regression test suite
Programs/   Executable entry points
Tools/      Developer and build tools
Doc/        Documentation source
PC/         Windows-specific source and config files
PCbuild/    Windows build system
Mac/        macOS-specific support

The most important directories for internals reading are:

Include/
Objects/
Python/
Parser/
Modules/
Lib/test/

Those directories cover the object model, execution engine, compiler, parser, built-in types, C API, and tests.
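The directory roles above can be turned into a small lookup, which is handy when a stack trace or a diff mentions an unfamiliar path. This is a minimal sketch: the ROLES mapping is copied from the table in this section, and role_for is a hypothetical helper name, not part of any CPython tool.

```python
from pathlib import PurePosixPath

# Roles copied from the top-level structure table in this section.
ROLES = {
    "Include": "Public, internal, and private C headers",
    "Objects": "Implementations of core object types",
    "Python": "Runtime, compiler, interpreter loop, initialization",
    "Parser": "Tokenizer and parser support code",
    "Grammar": "Grammar input files",
    "Modules": "Built-in and extension modules written in C",
    "Lib": "Python standard library",
    "Lib/test": "CPython regression test suite",
    "Programs": "Executable entry points",
    "Tools": "Developer and build tools",
    "Doc": "Documentation source",
    "PC": "Windows-specific source and config files",
    "PCbuild": "Windows build system",
    "Mac": "macOS-specific support",
}


def role_for(path):
    """Return the role of the longest known directory prefix of path."""
    parts = PurePosixPath(path).parts
    # Try the longest prefix first so that Lib/test wins over Lib.
    for depth in range(len(parts), 0, -1):
        prefix = "/".join(parts[:depth])
        if prefix in ROLES:
            return ROLES[prefix]
    return "unknown"


for example in ("Objects/listobject.c", "Lib/test/test_dict.py", "Lib/os.py"):
    print(example, "->", role_for(example))
```

Checking the longest prefix first matters because Lib/test/ is nested inside Lib/ but has a different role.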

3.2 Include/: C Header Files

Include/ contains the header files used by CPython itself, extension modules, and embedders.

A simplified layout:

Include/
    Python.h
    object.h
    unicodeobject.h
    listobject.h
    dictobject.h
    cpython/
    internal/

The most important file is:

#include <Python.h>

Python.h is the umbrella public header for extension modules. It includes many other public headers and exposes the C API most extension authors use.

Header categories

Header area         Audience                 Stability
Include/*.h         Public C API users       Relatively stable
Include/cpython/    CPython-specific API     Less portable
Include/internal/   CPython internals only   Can change freely

This distinction matters. Code inside CPython can include internal headers. Third-party extensions generally should not.

For example:

#include "Python.h"

is normal for extension modules.

But:

#include "internal/pycore_runtime.h"

is for CPython core code. It exposes internal runtime structures that are not part of the stable public API, and the internal headers refuse to compile unless Py_BUILD_CORE is defined.
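The same three header areas usually show up in an installed Python as well as in a source checkout. The sketch below asks sysconfig where the running interpreter expects its headers and reports which areas are present. header_tiers is a hypothetical helper, and on systems where the development headers are not installed every entry will simply be False.

```python
import sysconfig
from pathlib import Path

# Directory where this interpreter expects Python.h to live.
include_dir = Path(sysconfig.get_path("include"))


def header_tiers(root):
    """Report which of the three header areas exist under root."""
    return {
        "public (Python.h and friends)": (root / "Python.h").is_file(),
        "CPython-specific (cpython/)": (root / "cpython").is_dir(),
        "internal (internal/)": (root / "internal").is_dir(),
    }


print(include_dir)
for tier, present in header_tiers(include_dir).items():
    print(f"{tier}: {'present' if present else 'missing'}")
```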

3.3 Objects/: Built-In Object Implementations

Objects/ contains the C implementations of many core Python object types.

Examples:

File                Implements
object.c            Base object operations
typeobject.c        Type objects, classes, MRO, slots
longobject.c        Python integers
floatobject.c       Python floats
unicodeobject.c     Python strings
bytesobject.c       bytes
bytearrayobject.c   bytearray
listobject.c        list
tupleobject.c       tuple
dictobject.c        dict
setobject.c         set and frozenset
funcobject.c        Function objects
methodobject.c      Built-in method objects
moduleobject.c      Module objects
genobject.c         Generators and coroutines
frameobject.c       Frame object support
codeobject.c        Code objects
cellobject.c        Closure cells
descrobject.c       Descriptors

This directory is the best place to study how Python values are represented and operated on.

For example, list behavior lives mainly in:

Objects/listobject.c

Dictionary behavior lives mainly in:

Objects/dictobject.c

String behavior lives mainly in:

Objects/unicodeobject.c

When Python runs:

items = []
items.append(1)

the underlying list allocation, resizing, method lookup, and append operation eventually involve code in Objects/listobject.c and type machinery in Objects/typeobject.c.
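The method lookup step can be observed from Python without reading any C. In this sketch, append is found as a descriptor in the list type's dictionary, and binding it to an instance produces a built-in method object. Those two kinds of object correspond to the descrobject.c and methodobject.c rows in the table above. The type names in the comments are what CPython reports; other implementations may differ.

```python
items = []
items.append(1)

# The attribute is stored on the type, not on the instance.
descriptor = type(items).__dict__["append"]

# Accessing it through an instance binds it to that instance.
bound = items.append

print(type(descriptor).__name__)  # method_descriptor on CPython
print(type(bound).__name__)       # builtin_function_or_method on CPython
print(bound.__self__ is items)    # the bound method remembers its list
```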

3.4 Python/: Runtime, Compiler, and Interpreter Core

Python/ contains much of CPython’s central machinery.

Important files include:

File               Role
ceval.c            Bytecode evaluation loop
bytecodes.c        Bytecode instruction definitions in modern CPython
compile.c          AST to code object compiler
symtable.c         Symbol table analysis
ast.c              AST support
pythonrun.c        High-level execution entry points
pylifecycle.c      Runtime initialization and finalization
import.c           Import support
errors.c           Exception state and error APIs
traceback.c        Traceback support
sysmodule.c        Implementation of sys
bltinmodule.c      Built-in functions and builtins module
marshal.c          Internal serialization format for code objects
thread.c           Thread abstraction layer
context.c          Context variable support
bootstrap_hash.c   Hash secret initialization

The file names map closely onto runtime subsystems, so each file is a natural starting point for the subsystem it names.

Interpreter execution

The bytecode interpreter is centered on the evaluation loop, which has historically lived in ceval.c. Since CPython 3.12, instruction definitions live in Python/bytecodes.c, and the interpreter cases that ceval.c includes are generated from that file.

Conceptual role:

frame enters evaluation
bytecode instruction fetched
instruction dispatch
object operation
stack and frame state updated
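The instructions that the evaluation loop fetches and dispatches can be listed with the standard dis module. The sketch below disassembles a small hypothetical function, add_one. The exact opcode names depend on the CPython version, so no expected output is shown.

```python
import dis


def add_one(x):
    return x + 1


# Each Instruction is one step the evaluation loop would fetch and dispatch.
instructions = list(dis.get_instructions(add_one))
for ins in instructions:
    print(ins.offset, ins.opname, ins.argrepr)
```

Running this on two different CPython versions is a quick way to see how much the instruction set changes between releases.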

Compiler pipeline

The compiler code lowers AST nodes into code objects.

A simplified path (the PEG parser used since CPython 3.9 builds the AST directly, so there is no separate parse tree stage):

source text
tokens
AST
symbol table
compiler
code object

The key files are usually:

Parser/
Python/ast.c
Python/symtable.c
Python/compile.c
Objects/codeobject.c
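Each stage of this pipeline has a Python-level window in the standard library, so the path can be walked without a debugger. The sketch below is an approximation: the tokenize, ast, and symtable modules expose views of the C stages rather than the C code itself, and the "<example>" file name is a placeholder.

```python
import ast
import io
import symtable
import tokenize

source = "x = 1 + 2\n"

# Stage 1: source text to tokens.
tokens = list(tokenize.generate_tokens(io.StringIO(source).readline))
print([(tokenize.tok_name[tok.type], tok.string) for tok in tokens])

# Stage 2: source text to AST.
tree = ast.parse(source, filename="<example>")
print(ast.dump(tree))

# Stage 3: symbol table analysis of the same source.
table = symtable.symtable(source, "<example>", "exec")
print(table.get_identifiers())

# Stage 4: AST to code object, which the interpreter can then execute.
code = compile(tree, "<example>", "exec")
namespace = {}
exec(code, namespace)
print(namespace["x"])
```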

3.5 Parser/: Tokenizer and Parser Support

Parser/ contains the tokenizer, parser generator support, parser implementation files, and generated parser-related code.

Important areas include:

Area                     Role
tokenizer                Converts source text into tokens
parser                   Builds syntax structures from tokens
PEG machinery            Supports Python’s PEG parser
generated parser files   Produced from grammar definitions

The parser’s job is to decide whether source text is valid Python syntax and, if it is, to build the AST that the later compiler stages consume.

Example:

x = 1 + 2

The parser needs to recognize:

assignment statement
target name x
expression 1 + 2
integer literal 1
integer literal 2
binary addition operator

Parsing precedes semantic analysis. The parser knows syntax. The symbol table pass later decides whether names are local, global, free, or cell variables.
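Both halves of this split can be checked from Python. The first part of the sketch below confirms the structure the parser recognizes for x = 1 + 2. The second part uses a hypothetical closure (outer, inner, counter, and limit are made-up names) to show the symbol table classifying names, which the parser alone does not do.

```python
import ast
import symtable

# Syntax: the parser sees an assignment whose value is a binary addition.
assign = ast.parse("x = 1 + 2").body[0]
target = assign.targets[0]
value = assign.value
print(type(assign).__name__, target.id, type(value.op).__name__)
print(value.left.value, value.right.value)

# Semantics: the symbol table decides how each name is scoped.
closure_src = (
    "def outer():\n"
    "    counter = 0\n"
    "    def inner():\n"
    "        return counter + limit\n"
    "    return inner\n"
)
top = symtable.symtable(closure_src, "<example>", "exec")
outer = top.get_children()[0]
inner = outer.get_children()[0]

print(outer.lookup("counter").is_local())  # assigned in outer
print(inner.lookup("counter").is_free())   # captured from outer
print(inner.lookup("limit").is_global())   # never assigned in any function
```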

3.6 Grammar/: Grammar Definitions

Grammar/ contains grammar input files used to generate parser-related code.

The grammar defines Python syntax in a form consumed by CPython’s parser tooling.

For internals work, grammar changes are high impact. A syntax change can affect:

parser generation
AST generation
compiler behavior
error messages
tests
documentation
tools that parse Python

A grammar-level change often requires regeneration and targeted tests.

The usual workflow is:

edit grammar input
regenerate parser files
rebuild CPython
run parser, AST, compiler, and syntax tests
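After a rebuild, a quick smoke check is to ask the freshly built interpreter whether it accepts or rejects a few snippets before running the full test suite. The helper below is a sketch, not part of CPython's tooling: is_valid_syntax is a hypothetical name, and it only reports what the interpreter running the script accepts.

```python
import ast


def is_valid_syntax(source):
    """Return True if the running interpreter's parser accepts source."""
    try:
        ast.parse(source)
    except SyntaxError:
        return False
    return True


for snippet in ("x = 1 + 2", "x = = 1"):
    print(repr(snippet), is_valid_syntax(snippet))
```

A check like this complements the parser and syntax tests in the workflow above. It does not replace them.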

3.7 Modules/: Built-In and Extension Modules

Modules/ contains many C modules shipped with CPython.

Examples:

File or directory     Module
_io/                  I/O implementation
_decimal/             Decimal accelerator
_sqlite/              SQLite module
_ssl.c                SSL support
_hashopenssl.c        Hashing with OpenSSL
_ctypes/              ctypes
arraymodule.c         array
mathmodule.c          math
itertoolsmodule.c     itertools
_functoolsmodule.c    _functools
posixmodule.c         os platform operations
timemodule.c          time

Some standard library modules are written in Python and use C accelerators from Modules/.

For example, a public Python module may live in Lib/, while a private performance-critical helper lives in Modules/.

This pattern gives CPython a clean public API while preserving fast C implementations for hot paths.
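The json package is one place where this pattern is visible from Python. Lib/json/decoder.py keeps a pure-Python py_scanstring, tries to import a C version as c_scanstring, and exposes whichever is available as scanstring. The sketch below inspects that arrangement. The sample text is made up, and c_scanstring is None on builds without the _json accelerator.

```python
import json.decoder as decoder

py_impl = decoder.py_scanstring   # pure-Python implementation in Lib/
c_impl = decoder.c_scanstring     # C accelerator, or None if unavailable
active = decoder.scanstring       # what the decoder actually uses

text = '"abc"'
# Both implementations take the text and the index just after the opening quote.
print(py_impl(text, 1))
if c_impl is not None:
    print(c_impl(text, 1))
print("accelerated" if active is c_impl else "pure Python")
```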

3.8 Lib/: Standard Library

Lib/ contains the Python standard library.

Examples:

Path                 Role
Lib/os.py            OS interface layer
Lib/pathlib/         Object-oriented paths
Lib/importlib/       Import system implementation
Lib/asyncio/         Async I/O framework
Lib/collections/     Collection utilities
Lib/dataclasses.py   Dataclass support
Lib/typing.py        Typing support
Lib/unittest/        Unit testing framework
Lib/json/            JSON implementation
Lib/concurrent/      Futures and process/thread pools

Many CPython internals are easier to understand by reading the Python-level standard library first.

For example, importlib contains much of the import system in Python code. CPython bootstraps it specially, but large parts remain readable as Python.
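importlib can also report where a module comes from, which helps when deciding whether to open a file under Lib/ or look for C code. In the sketch below, source_location is a hypothetical helper. The origin string is a file path for ordinary modules and a label such as built-in for modules compiled into the interpreter.

```python
import importlib.util


def source_location(module_name):
    """Return the origin recorded in the module's import spec, or None."""
    spec = importlib.util.find_spec(module_name)
    if spec is None:
        return None
    return spec.origin


print(source_location("json"))  # a path inside the standard library
print(source_location("sys"))   # compiled into the interpreter
```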

3.9 Lib/test/: Regression Tests

Lib/test/ contains CPython’s test suite.

This directory is essential for internals work.

Examples:

Test file           Focus
test_dict.py        Dictionary behavior
test_list.py        List behavior
test_gc.py          Garbage collector
test_sys.py         sys module
test_dis.py         Bytecode disassembly
test_compile.py     Compiler behavior
test_ast.py         AST behavior
test_importlib/     Import system
test_capi/          C API behavior
test_threading.py   Threading behavior

When studying a subsystem, pair the implementation file with its tests.

Implementation         Tests
Objects/dictobject.c   Lib/test/test_dict.py
Objects/listobject.c   Lib/test/test_list.py
Python/compile.c       Lib/test/test_compile.py
Python/symtable.c      Lib/test/test_symtable.py
Python/sysmodule.c     Lib/test/test_sys.py
Modules/mathmodule.c   Lib/test/test_math.py

This habit prevents reading code in isolation. CPython behavior is defined by code plus tests plus documentation plus compatibility expectations.

3.10 Programs/: Executable Entry Points

Programs/ contains source files for CPython executable programs.

Typical files include:

Programs/python.c
Programs/_testembed.c

Programs/python.c is the normal command-line interpreter entry point.

A simplified startup path looks like:

main()
initialize runtime
configure interpreter
run command, script, module, stdin, or REPL
finalize runtime

This directory is useful when studying startup, embedding, command-line options, and interpreter initialization.
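The "run command, script, module, stdin, or REPL" step in the startup path above can be exercised from the outside by launching the interpreter in different modes. This is a sketch using the interpreter that runs the script. run_python is a hypothetical helper, and the platform module is used only as a convenient example for -m.

```python
import subprocess
import sys


def run_python(*args, stdin_text=None):
    """Start a fresh interpreter with args and return its trimmed stdout."""
    result = subprocess.run(
        [sys.executable, *args],
        input=stdin_text,
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.strip()


print(run_python("-c", "print(1 + 1)"))              # run a command
print(run_python("-m", "platform"))                  # run a module
print(run_python("-", stdin_text="print(2 + 2)\n"))  # run a program from stdin
```

Each call goes through the full startup and finalization sequence, so timing these calls gives a rough feel for interpreter startup cost.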

3.11 Tools/: Developer Tools

Tools/ contains helper programs for CPython development.

Examples include tools for:

generated file maintenance
bytecode and opcode metadata
C API inspection
test support
build support
freeze tooling
scripts used by maintainers

The exact contents change over time. The important rule is that many generated files in CPython have source inputs and regeneration tools. Tools/ is often where those tools live.

When changing grammar, Argument Clinic blocks, opcode definitions, or generated metadata, check the relevant tool workflow before editing generated output.

3.12 Doc/: Documentation Source

Doc/ contains CPython’s documentation source.

The documentation covers:

language reference
library reference
C API reference
extending and embedding
how-to guides
tutorial
installing and using Python

For internals work, the documentation matters because changes to behavior often require documentation updates.

A CPython change may require edits in several places:

implementation code
tests
documentation
news entry
C API docs
library docs

Documentation source uses reStructuredText rather than Markdown.

3.13 Platform Directories

CPython supports many platforms. Some directories exist mainly for platform-specific configuration and builds.

Directory                                  Platform role
PC/                                        Windows-specific source/config
PCbuild/                                   Visual Studio build files
Mac/                                       macOS support
platform files in Python/ and Modules/     OS-specific implementations

Platform support appears throughout the tree through conditional compilation.

Example pattern:

#ifdef MS_WINDOWS
    /* Windows-specific code */
#else
    /* POSIX-like code */
#endif

Internals reading often requires distinguishing portable runtime logic from platform-specific branches.
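The same split is visible from Python. Lib/os.py picks its platform backend by checking which low-level module was compiled into the interpreter, so os.name names a built-in module. The sketch below only inspects the running interpreter. On CPython the backend is expected to be posix or nt, but that is an observation about common builds, not a guarantee for every platform.

```python
import os
import sys

# os.name is the name of the platform backend module that os.py selected.
backend = os.name
print("backend module:", backend)
print("sys.platform:", sys.platform)

# The backend is compiled into the interpreter, not loaded from Lib/.
print("built in:", backend in sys.builtin_module_names)
```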

3.14 Generated Code and Source Inputs

CPython contains both hand-written files and generated files.

Common generated areas include:

parser output from grammar
AST-related generated files
opcode metadata and bytecode tables
Argument Clinic output
frozen importlib modules
configuration files

A generated file usually has a comment near the top explaining how it was produced.

Before editing a suspiciously mechanical block, look for markers such as:

generated by
do not edit
clinic start generated code
autogenerated

Manual edits to generated regions are usually lost during regeneration.
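A small scanner for these markers can serve as a guard before editing an unfamiliar file. The sketch below searches text for the marker phrases listed above, ignoring case. find_generated_markers and the sample text are illustrative, not taken from a real CPython file.

```python
# Marker phrases from the list above, lowercased for comparison.
MARKERS = (
    "generated by",
    "do not edit",
    "clinic start generated code",
    "autogenerated",
)


def find_generated_markers(text):
    """Return (line number, marker) pairs for every marker found in text."""
    hits = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        lowered = line.lower()
        for marker in MARKERS:
            if marker in lowered:
                hits.append((lineno, marker))
    return hits


sample = "/*[clinic start generated code]*/\nstatic int x;\n// Do not edit\n"
print(find_generated_markers(sample))
```

An empty result does not prove a file is hand-written, so it is still worth checking the relevant tool workflow.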

3.15 A Reading Path Through the Repository

A good first reading path is:

Programs/python.c
Python/pylifecycle.c
Python/pythonrun.c
Python/compile.c
Python/ceval.c
Objects/object.c
Objects/typeobject.c
Objects/dictobject.c
Objects/listobject.c

This path follows execution from process startup to runtime behavior.

A second reading path for object internals:

Include/object.h
Include/cpython/object.h
Objects/object.c
Objects/typeobject.c
Objects/longobject.c
Objects/unicodeobject.c
Objects/listobject.c
Objects/dictobject.c

A third path for source-to-bytecode:

Grammar/
Parser/
Python/ast.c
Python/symtable.c
Python/compile.c
Objects/codeobject.c
Python/ceval.c

3.16 How to Find Code for a Python Feature

Start from the Python feature, then map it to a subsystem.

Feature       First files to inspect
list.append   Objects/listobject.c
dict[key]     Objects/dictobject.c
x.y           Objects/object.c, Objects/typeobject.c
class C:      Objects/typeobject.c, Python/compile.c
try/except    Python/compile.c, Python/ceval.c
import x      Lib/importlib/, Python/import.c
async def     Python/compile.c, Objects/genobject.c
with          Python/compile.c, Python/ceval.c
len(x)        Python/bltinmodule.c, type slots
print(x)      Python/bltinmodule.c, file I/O modules

Use tests to confirm behavior:

./python -m test test_dict
./python -m test test_descr
./python -m test test_importlib

3.17 Repository Layout as Architecture

The directory structure reflects CPython’s architecture.

Include/    exposes C interfaces
Objects/    defines runtime values
Python/     executes and manages programs
Parser/     understands syntax
Modules/    provides C-backed modules
Lib/        provides Python-level standard library
Lib/test/   protects behavior
Programs/   starts the executable
Tools/      maintains generated and developer workflows
Doc/        explains public behavior

This layout is not perfect. Older code, platform constraints, backward compatibility, and generated files create exceptions. Still, the structure is coherent enough to guide source reading.

3.18 Chapter Summary

The CPython repository is a working system, not a collection of isolated files. Objects/ defines what Python values are. Python/ defines how programs compile and execute. Parser/ and Grammar/ define syntax handling. Modules/ and Lib/ provide the standard library. Include/ exposes C interfaces. Lib/test/ defines much of the regression safety net.

A productive reader moves between implementation, tests, and documentation. CPython internals become easier once each source file has a clear place in the runtime architecture.