Tour of the CPython source tree: Modules/, Objects/, Python/, Parser/, Include/, and the build system.
The CPython repository is organized around the major subsystems of the interpreter: object implementations, runtime machinery, compiler pipeline, parser, built-in modules, standard library, tests, documentation, and platform build files.
A good first pass is to treat the source tree as a map of responsibilities.
cpython/
Include/
Objects/
Python/
Parser/
Modules/
Lib/
Programs/
Tools/
Doc/
Grammar/
PC/
PCbuild/
Mac/
Each directory has a different role. Some contain core runtime code. Some contain generated code. Some contain test fixtures. Some exist mainly for platform-specific builds.
3.1 Top-Level Structure
| Directory | Main role |
|---|---|
| Include/ | Public, internal, and private C headers |
| Objects/ | Implementations of core object types |
| Python/ | Runtime, compiler, interpreter loop, initialization |
| Parser/ | Tokenizer and parser support code |
| Grammar/ | Grammar input files |
| Modules/ | Built-in and extension modules written in C |
| Lib/ | Python standard library |
| Lib/test/ | CPython regression test suite |
| Programs/ | Executable entry points |
| Tools/ | Developer and build tools |
| Doc/ | Documentation source |
| PC/ | Windows-specific source and config files |
| PCbuild/ | Windows build system |
| Mac/ | macOS-specific support |
The most important directories for internals reading are:
Include/
Objects/
Python/
Parser/
Modules/
Lib/test/
Those directories cover the object model, execution engine, compiler, parser, built-in types, C API, and tests.
3.2 Include/: C Header Files
Include/ contains the header files used by CPython itself, extension modules, and embedders.
A simplified layout:
Include/
Python.h
object.h
unicodeobject.h
listobject.h
dictobject.h
cpython/
internal/
The most important file is:
#include <Python.h>
Python.h is the umbrella public header for extension modules. It includes many other public headers and exposes the C API most extension authors use.
Header categories
| Header area | Audience | Stability |
|---|---|---|
| Include/*.h | Public C API users | Relatively stable |
| Include/cpython/ | CPython-specific API | Less portable |
| Include/internal/ | CPython internals only | Can change freely |
This distinction matters. Code inside CPython can include internal headers. Third-party extensions generally should not.
For example:
#include "Python.h"
is normal for extension modules.
But:
#include "internal/pycore_runtime.h"
is for CPython core code. It exposes internal runtime structures that are not part of the stable public API.
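To see where these headers end up for an installed interpreter, the standard `sysconfig` module can report the include directory. This is a minimal sketch; whether `Python.h` is actually present depends on whether development headers are installed, so the code only reports what it finds.

```python
import os
import sysconfig

# Directory where the running interpreter expects its public C headers.
include_dir = sysconfig.get_path("include")

# Python.h may be absent if development headers are not installed,
# so this reports the result instead of assuming success.
python_h = os.path.join(include_dir, "Python.h")
has_python_h = os.path.isfile(python_h)

print("include directory:", include_dir)
print("Python.h present:", has_python_h)
```

In a source checkout, the same role is played by the `Include/` directory at the repository root.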
3.3 Objects/: Built-In Object Implementations
Objects/ contains the C implementations of many core Python object types.
Examples:
| File | Implements |
|---|---|
| object.c | Base object operations |
| typeobject.c | Type objects, classes, MRO, slots |
| longobject.c | Python integers |
| floatobject.c | Python floats |
| unicodeobject.c | Python strings |
| bytesobject.c | bytes |
| bytearrayobject.c | bytearray |
| listobject.c | list |
| tupleobject.c | tuple |
| dictobject.c | dict |
| setobject.c | set and frozenset |
| funcobject.c | Function objects |
| methodobject.c | Built-in method objects |
| moduleobject.c | Module objects |
| genobject.c | Generators and coroutines |
| frameobject.c | Frame object support |
| codeobject.c | Code objects |
| cellobject.c | Closure cells |
| descrobject.c | Descriptors |
This directory is the best place to study how Python values are represented and operated on.
For example, list behavior lives mainly in:
Objects/listobject.c
Dictionary behavior lives mainly in:
Objects/dictobject.c
String behavior lives mainly in:
Objects/unicodeobject.c
When Python runs:
items = []
items.append(1)
the underlying list allocation, resizing, method lookup, and append operation eventually involve code in Objects/listobject.c and type machinery in Objects/typeobject.c.
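The split between the list implementation and the type machinery is partly visible from Python. The sketch below assumes CPython, where `append` is stored once on the `list` type as a descriptor and bound to a particular list at lookup time; the type names it prints are CPython implementation details.

```python
items = []
items.append(1)

# list.append is stored on the type, not on each list instance.
append_descriptor = type(items).__dict__["append"]
descriptor_kind = type(append_descriptor).__name__

# Attribute lookup on the instance binds the descriptor to that list.
bound = items.append
bound_kind = type(bound).__name__

print(descriptor_kind, bound_kind, bound.__self__ is items)
```

The descriptor object comes from the method table defined for the list type, and the binding step is part of generic attribute lookup.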
3.4 Python/: Runtime, Compiler, and Interpreter Core
Python/ contains much of CPython’s central machinery.
Important files include:
| File | Role |
|---|---|
| ceval.c | Bytecode evaluation loop |
| bytecodes.c | Bytecode instruction definitions in modern CPython |
| compile.c | AST to code object compiler |
| symtable.c | Symbol table analysis |
| ast.c | AST support |
| pythonrun.c | High-level execution entry points |
| pylifecycle.c | Runtime initialization and finalization |
| import.c | Import support |
| errors.c | Exception state and error APIs |
| traceback.c | Traceback support |
| sysmodule.c | Implementation of sys |
| bltinmodule.c | Built-in functions and builtins module |
| marshal.c | Internal serialization format for code objects |
| thread.c | Thread abstraction layer |
| context.c | Context variable support |
| bootstrap_hash.c | Hash secret initialization |
The file names are not arbitrary labels. Each one corresponds to a distinct runtime subsystem, so the table doubles as a map of the interpreter core.
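Several of these subsystems have a Python-level face. As one illustration, the `marshal` module exposes the serialization format listed in the table. This is a small sketch that round-trips a code object; the `"<demo>"` filename and the `result` variable are placeholders chosen for the example.

```python
import marshal

# Compile a tiny program into a code object.
code = compile("result = 6 * 7", "<demo>", "exec")

# Serialize the code object to bytes and load it back.
data = marshal.dumps(code)
restored = marshal.loads(data)

# The restored code object is executable like the original.
namespace = {}
exec(restored, namespace)
print(type(data).__name__, namespace["result"])
```

The marshal format is version-specific, so the bytes produced here should not be treated as a portable interchange format.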
Interpreter execution
The bytecode interpreter is centered around the evaluation loop. Historically this is associated with ceval.c. In newer CPython versions, opcode definitions and generated interpreter pieces may be split across additional files.
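The instructions that the evaluation loop dispatches on can be listed with the standard `dis` module. The sketch below uses a hypothetical `add` function; the exact opcode names differ between CPython versions, so it prints whatever the running interpreter produces and does not assume a fixed listing.

```python
import dis


def add(a, b):
    # A function small enough that its bytecode is easy to read.
    return a + b


# Each Instruction corresponds to one step the evaluation loop dispatches.
opnames = [instr.opname for instr in dis.get_instructions(add)]
print(opnames)
```

Comparing this output across interpreter versions is a quick way to see how much the instruction set changes between releases.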
Conceptual role:
frame enters evaluation
↓
bytecode instruction fetched
↓
instruction dispatch
↓
object operation
↓
stack and frame state updated
Compiler pipeline
The compiler code lowers AST nodes into code objects.
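The same lowering can be driven from Python with the built-in `ast.parse` and `compile` functions. This is a minimal sketch in which the source string, the `"<demo>"` filename, and the `total` variable are placeholders for the example.

```python
import ast
import types

source = "total = 1 + 2"

# Source text to AST.
tree = ast.parse(source, filename="<demo>", mode="exec")

# AST to code object. Symbol table analysis happens inside compile().
code = compile(tree, filename="<demo>", mode="exec")

# Executing the code object hands it to the evaluation loop.
namespace = {}
exec(code, namespace)
print(type(tree).__name__, isinstance(code, types.CodeType), namespace["total"])
```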
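A simplified conceptual path (since CPython 3.9 the PEG parser builds the AST directly, so the parse tree stage is implicit):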
source text
↓
tokens
↓
parse tree
↓
AST
↓
symbol table
↓
compiler
↓
code object
The key files are usually:
Parser/
Python/ast.c
Python/symtable.c
Python/compile.c
Objects/codeobject.c
3.5 Parser/: Tokenizer and Parser Support
Parser/ contains the tokenizer, parser generator support, parser implementation files, and generated parser-related code.
Important areas include:
| Area | Role |
|---|---|
| tokenizer | Converts source text into tokens |
| parser | Builds syntax structures from tokens |
| PEG machinery | Supports Python’s PEG parser |
| generated parser files | Produced from grammar definitions |
The parser’s job is to decide whether source text is valid Python syntax and to build the AST that later stages consume.
Example:
x = 1 + 2
The parser needs to recognize:
assignment statement
target name x
expression 1 + 2
integer literal 1
integer literal 2
binary addition operator
Parsing precedes semantic analysis. The parser knows syntax. The symbol table pass later decides whether names are local, global, free, or cell variables.
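Both stages can be observed with the standard `ast` and `symtable` modules. The sketch below first inspects the syntax of `x = 1 + 2`, then classifies names in a small nested function. The `outer`, `inner`, `c`, and `g` names are hypothetical and exist only for this example.

```python
import ast
import symtable

# Syntax only: the statement is an assignment whose value is a binary addition.
assign = ast.parse("x = 1 + 2").body[0]
expr = assign.value
syntax_view = (
    type(assign).__name__,
    assign.targets[0].id,
    type(expr).__name__,
    type(expr.op).__name__,
    expr.left.value,
    expr.right.value,
)
print(syntax_view)

# Semantics: the symbol table decides how each name is resolved.
nested = """
g = 0
def outer():
    c = 1
    def inner():
        return c + g
    return inner
"""
top = symtable.symtable(nested, "<demo>", "exec")
outer = top.lookup("outer").get_namespace()
inner = outer.lookup("inner").get_namespace()

c_local_in_outer = outer.lookup("c").is_local()
c_free_in_inner = inner.lookup("c").is_free()
g_global_in_inner = inner.lookup("g").is_global()
print(c_local_in_outer, c_free_in_inner, g_global_in_inner)
```

The parser output alone cannot tell `c` and `g` apart inside `inner`. That distinction only appears once scopes have been analyzed.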
3.6 Grammar/: Grammar Definitions
Grammar/ contains grammar input files used to generate parser-related code.
The grammar defines Python syntax in a form consumed by CPython’s parser tooling.
For internals work, grammar changes are high impact. A syntax change can affect:
parser generation
AST generation
compiler behavior
error messages
tests
documentation
tools that parse Python
A grammar-level change often requires regeneration and targeted tests.
The usual workflow is:
edit grammar input
regenerate parser files
rebuild CPython
run parser, AST, compiler, and syntax tests
3.7 Modules/: Built-In and Extension Modules
Modules/ contains many C modules shipped with CPython.
Examples:
| File or directory | Module |
|---|---|
| _io/ | I/O implementation |
| _decimal/ | Decimal accelerator |
| _sqlite/ | SQLite module |
| _ssl.c | SSL support |
| _hashopenssl.c | Hashing with OpenSSL |
| _ctypes/ | ctypes |
| arraymodule.c | array |
| mathmodule.c | math |
| itertoolsmodule.c | itertools |
| _functoolsmodule.c | _functools |
| posixmodule.c | os platform operations |
| timemodule.c | time |
Some standard library modules are written in Python and use C accelerators from Modules/.
For example, a public Python module may live in Lib/, while a private performance-critical helper lives in Modules/.
This pattern gives CPython a clean public API while preserving fast C implementations for hot paths.
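The accelerator pattern can be checked at runtime. This sketch assumes CPython, where `functools` imports `reduce` from the private `_functools` C module when that module is available. The guard around the import covers interpreters that do not ship it.

```python
import functools
import platform

try:
    # Private C accelerator. It may be missing on other implementations.
    import _functools
except ImportError:
    _functools = None

# True when the public name is the very object defined by the C module.
accelerated = _functools is not None and functools.reduce is _functools.reduce
print(platform.python_implementation(), accelerated)
```

Callers only ever import `functools`. Whether the C version is in use is an implementation detail they do not need to know about.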
3.8 Lib/: Standard Library
Lib/ contains the Python standard library.
Examples:
| Path | Role |
|---|---|
| Lib/os.py | OS interface layer |
| Lib/pathlib/ | Object-oriented paths |
| Lib/importlib/ | Import system implementation |
| Lib/asyncio/ | Async I/O framework |
| Lib/collections/ | Collection utilities |
| Lib/dataclasses.py | Dataclass support |
| Lib/typing.py | Typing support |
| Lib/unittest/ | Unit testing framework |
| Lib/json/ | JSON implementation |
| Lib/concurrent/ | Futures and process/thread pools |
Many CPython internals are easier to understand by reading the Python-level standard library first.
For example, importlib contains much of the import system in Python code. CPython bootstraps it specially, but large parts remain readable as Python.
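The special bootstrapping leaves a visible trace. The sketch below assumes CPython and relies on a private detail that may change: the frozen copy of the bootstrap code is registered as `_frozen_importlib`, and `importlib._bootstrap` refers to that same module object.

```python
import importlib
import platform
import sys

# The frozen bootstrap module is registered during interpreter startup.
frozen = sys.modules.get("_frozen_importlib")

# importlib._bootstrap is a private attribute, used here for illustration only.
same_module = frozen is not None and importlib._bootstrap is frozen
print(platform.python_implementation(), same_module)
```

The readable source for that module is Lib/importlib/_bootstrap.py, which is why reading the Python file is a reasonable way to study the import system.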
3.9 Lib/test/: Regression Tests
Lib/test/ contains CPython’s test suite.
This directory is essential for internals work.
Examples:
| Test file | Focus |
|---|---|
| test_dict.py | Dictionary behavior |
| test_list.py | List behavior |
| test_gc.py | Garbage collector |
| test_sys.py | sys module |
| test_dis.py | Bytecode disassembly |
| test_compile.py | Compiler behavior |
| test_ast.py | AST behavior |
| test_importlib/ | Import system |
| test_capi/ | C API behavior |
| test_threading.py | Threading behavior |
When studying a subsystem, pair the implementation file with its tests.
| Implementation | Tests |
|---|---|
| Objects/dictobject.c | Lib/test/test_dict.py |
| Objects/listobject.c | Lib/test/test_list.py |
| Python/compile.c | Lib/test/test_compile.py |
| Python/symtable.c | Lib/test/test_symtable.py |
| Python/sysmodule.c | Lib/test/test_sys.py |
| Modules/mathmodule.c | Lib/test/test_math.py |
This habit prevents reading code in isolation. CPython behavior is defined by code plus tests plus documentation plus compatibility expectations.
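The pairing can be kept as a small lookup table while reading. This is a hypothetical helper, not part of CPython: `IMPLEMENTATION_TESTS` and `regrtest_command` are names invented for this sketch, and `./python` assumes a locally built interpreter in the checkout root.

```python
# Hypothetical mapping, copied from the pairing table above.
IMPLEMENTATION_TESTS = {
    "Objects/dictobject.c": "test_dict",
    "Objects/listobject.c": "test_list",
    "Python/compile.c": "test_compile",
    "Python/symtable.c": "test_symtable",
    "Python/sysmodule.c": "test_sys",
    "Modules/mathmodule.c": "test_math",
}


def regrtest_command(source_path, python="./python"):
    """Build the argv that runs the test module paired with a source file."""
    return [python, "-m", "test", IMPLEMENTATION_TESTS[source_path]]


print(" ".join(regrtest_command("Objects/dictobject.c")))
```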
3.10 Programs/: Executable Entry Points
Programs/ contains source files for CPython executable programs.
Typical files include:
Programs/python.c
Programs/_testembed.c
Programs/python.c is the normal command-line interpreter entry point.
A simplified startup path looks like:
main()
↓
initialize runtime
↓
configure interpreter
↓
run command, script, module, stdin, or REPL
↓
finalize runtime
This directory is useful when studying startup, embedding, command-line options, and interpreter initialization.
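The run modes in the startup path can be exercised from the outside by launching the interpreter as a subprocess. This is a minimal sketch: `run_python` is a helper invented for the example, and it uses whichever interpreter `sys.executable` points at.

```python
import subprocess
import sys


def run_python(args, stdin_text=None):
    """Start a fresh interpreter process and return its stripped stdout."""
    completed = subprocess.run(
        [sys.executable, *args],
        input=stdin_text,
        capture_output=True,
        text=True,
        check=True,
    )
    return completed.stdout.strip()


# Each call goes through startup, one run mode, and finalization.
from_command = run_python(["-c", "print('command')"])
from_stdin = run_python(["-"], stdin_text="print('stdin')\n")
from_module = run_python(["-m", "platform"])
print(from_command, from_stdin, from_module)
```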
3.11 Tools/: Developer Tools
Tools/ contains helper programs for CPython development.
Examples include tools for:
generated file maintenance
bytecode and opcode metadata
C API inspection
test support
build support
freeze tooling
scripts used by maintainers
The exact contents change over time. The important rule is that many generated files in CPython have source inputs and regeneration tools. Tools/ is often where those tools live.
When changing grammar, Argument Clinic blocks, opcode definitions, or generated metadata, check the relevant tool workflow before editing generated output.
3.12 Doc/: Documentation Source
Doc/ contains CPython’s documentation source.
The documentation covers:
language reference
library reference
C API reference
extending and embedding
how-to guides
tutorial
installing and using Python
For internals work, the documentation matters because changes to behavior often require documentation updates.
A CPython change may require edits in several places:
implementation code
tests
documentation
news entry
C API docs
library docs
Documentation source uses reStructuredText rather than Markdown.
3.13 Platform Directories
CPython supports many platforms. Some directories exist mainly for platform-specific configuration and builds.
| Directory | Platform role |
|---|---|
| PC/ | Windows-specific source/config |
| PCbuild/ | Visual Studio build files |
| Mac/ | macOS support |
| platform files in Python/ and Modules/ | OS-specific implementations |
Platform support appears throughout the tree through conditional compilation.
Example pattern:
#ifdef MS_WINDOWS
/* Windows-specific code */
#else
/* POSIX-like code */
#endif
Internals reading often requires distinguishing portable runtime logic from platform-specific branches.
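The outcome of those compile-time branches shows up at runtime. As a sketch, `os.name` reports which platform backend the `os` module sits on. In CPython, Modules/posixmodule.c is built as the `posix` module on POSIX-like systems and as `nt` on Windows, and that backend is compiled into the interpreter.

```python
import os
import sys

# Name of the platform backend selected when the interpreter was built.
backend = os.name

# On CPython the backend module is compiled into the interpreter itself.
is_builtin = backend in sys.builtin_module_names
print(sys.platform, backend, is_builtin)
```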
3.14 Generated Code and Source Inputs
CPython contains both hand-written files and generated files.
Common generated areas include:
parser output from grammar
AST-related generated files
opcode metadata and bytecode tables
Argument Clinic output
frozen importlib modules
configuration files
A generated file usually has a comment near the top explaining how it was produced.
Before editing a suspiciously mechanical block, look for markers such as:
generated by
do not edit
clinic start generated code
autogenerated
Manual edits to generated regions are usually lost during regeneration.
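A quick check for those markers can be scripted. This is a hypothetical helper, not a CPython tool: `MARKERS` and `find_generated_markers` are names invented for this sketch, and the sample text is made up for the demonstration.

```python
# Marker phrases taken from the list above, compared case-insensitively.
MARKERS = (
    "generated by",
    "do not edit",
    "clinic start generated code",
    "autogenerated",
)


def find_generated_markers(text):
    """Return (line_number, marker) pairs for every marker found in text."""
    hits = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        lowered = line.lower()
        for marker in MARKERS:
            if marker in lowered:
                hits.append((lineno, marker))
    return hits


sample = (
    "/* Generated by a hypothetical tool. DO NOT EDIT. */\n"
    "int x = 0;\n"
    "/*[clinic start generated code]*/\n"
)
print(find_generated_markers(sample))
```

A hit is only a hint. The comment around the marker usually names the input file and the command that regenerates the output.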
3.15 A Reading Path Through the Repository
A good first reading path is:
Programs/python.c
↓
Python/pylifecycle.c
↓
Python/pythonrun.c
↓
Python/compile.c
↓
Python/ceval.c
↓
Objects/object.c
↓
Objects/typeobject.c
↓
Objects/dictobject.c
↓
Objects/listobject.c
This path follows execution from process startup to runtime behavior.
A second reading path for object internals:
Include/object.h
↓
Include/cpython/object.h
↓
Objects/object.c
↓
Objects/typeobject.c
↓
Objects/longobject.c
↓
Objects/unicodeobject.c
↓
Objects/listobject.c
↓
Objects/dictobject.c
A third path for source-to-bytecode:
Grammar/
↓
Parser/
↓
Python/ast.c
↓
Python/symtable.c
↓
Python/compile.c
↓
Objects/codeobject.c
↓
Python/ceval.c
3.16 How to Find Code for a Python Feature
Start from the Python feature, then map it to a subsystem.
| Feature | First files to inspect |
|---|---|
| list.append | Objects/listobject.c |
| dict[key] | Objects/dictobject.c |
| x.y | Objects/object.c, Objects/typeobject.c |
| class C: | Objects/typeobject.c, Python/compile.c |
| try/except | Python/compile.c, Python/ceval.c |
| import x | Lib/importlib/, Python/import.c |
| async def | Python/compile.c, Objects/genobject.c |
| with | Python/compile.c, Python/ceval.c |
| len(x) | Python/bltinmodule.c, type slots |
| print(x) | Python/bltinmodule.c, file I/O modules |
Use tests to confirm behavior:
./python -m test test_dict
./python -m test test_descr
./python -m test test_importlib
3.17 Repository Layout as Architecture
The directory structure reflects CPython’s architecture.
Include/ exposes C interfaces
Objects/ defines runtime values
Python/ executes and manages programs
Parser/ understands syntax
Modules/ provides C-backed modules
Lib/ provides Python-level standard library
Lib/test/ protects behavior
Programs/ starts the executable
Tools/ maintains generated and developer workflows
Doc/ explains public behavior
This layout is not perfect. Older code, platform constraints, backward compatibility, and generated files create exceptions. Still, the structure is coherent enough to guide source reading.
3.18 Chapter Summary
The CPython repository is a working system, not a collection of isolated files. Objects/ defines what Python values are. Python/ defines how programs compile and execute. Parser/ and Grammar/ define syntax handling. Modules/ and Lib/ provide the standard library. Include/ exposes C interfaces. Lib/test/ defines much of the regression safety net.
A productive reader moves between implementation, tests, and documentation. CPython internals become easier once each source file has a clear place in the runtime architecture.