The pymalloc small-object allocator, arenas, pools, blocks, and when CPython falls back to system malloc.
CPython allocates memory constantly. Every integer object, list object, frame, tuple, dict entry array, string buffer, code object, exception, module, and function needs memory. The allocator system exists to make these allocations fast, structured, debuggable, and portable across platforms.
CPython does not use only one allocator. It uses several allocator domains and layers. Small Python objects often go through CPython’s specialized small-object allocator, while larger buffers may go through the platform allocator.
11.1 Why CPython Has Its Own Allocator
A Python program creates many short-lived objects.
for i in range(1_000_000):
    x = (i, i + 1)
This loop allocates a new tuple on every iteration, plus the integer objects it references. If every small object allocation went directly to the system malloc, the overhead would be high.
CPython’s allocator system improves this by:
serving small object allocations quickly
grouping small allocations into arenas and pools
reducing calls into the platform allocator
supporting debug hooks
separating allocator domains
making object allocation behavior predictable enough for internals work
The allocator is a performance-critical subsystem. It sits below most of the runtime.
11.2 Allocator Domains
CPython separates memory allocation into domains.
The important domains are:
| Domain | Typical use |
|---|---|
| Raw memory | Low-level memory independent of Python object state |
| Memory | General-purpose Python runtime memory |
| Object memory | Python object allocation |
At the C API level, these appear as families of functions:
PyMem_RawMalloc
PyMem_RawCalloc
PyMem_RawRealloc
PyMem_RawFree
PyMem_Malloc
PyMem_Calloc
PyMem_Realloc
PyMem_Free
PyObject_Malloc
PyObject_Calloc
PyObject_Realloc
PyObject_Free
The distinction matters because each domain can have different hooks, constraints, and debug behavior.
A simple rule:
PyMem_Raw*
    use for memory that may be allocated without an initialized Python runtime
PyMem_*
    use for general Python memory
PyObject_*
    use for memory belonging to Python objects
Extension code should match allocation and free functions from the same family.
Correct:
char *p = PyMem_Malloc(128);
if (p == NULL) {
    return PyErr_NoMemory();
}
/* use p */
PyMem_Free(p);
Incorrect:
char *p = PyMem_Malloc(128);
free(p); /* wrong allocator family */
Mixing allocator families can corrupt memory.
11.3 Object Allocation vs Object Initialization
Allocation reserves memory. Initialization gives that memory a valid object state.
For Python objects, this distinction is important.
Object allocation:
reserve memory for object layout
set object header
set type pointer
set reference count
possibly track with GC
Object initialization:
fill fields
store references
validate arguments
establish invariants
For a user-defined class:
obj = MyClass(1, 2)
the rough process is:
call type machinery (type.__call__)
call __new__, which allocates memory for the instance and initializes the object header
call __init__ on the returned instance
return initialized object
For C extension types, allocation usually goes through the type object:
self = (MyObject *)type->tp_alloc(type, 0);
Then initialization fills fields.
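The same split between allocation and initialization can be observed from Python. The sketch below uses a hypothetical class, Point, and a module-level list named calls; both are illustrative names, not part of CPython.

```python
calls = []  # records the order in which the hooks run (illustrative only)


class Point:
    """Hypothetical class used to observe allocation vs initialization."""

    def __new__(cls, x, y):
        calls.append("__new__")
        # object.__new__ asks the type's allocator for memory and returns
        # an instance with a valid header but no attributes set yet.
        return super().__new__(cls)

    def __init__(self, x, y):
        calls.append("__init__")
        # Initialization fills fields and establishes invariants.
        self.x = x
        self.y = y


p = Point(1, 2)
print(calls[:2])          # ['__new__', '__init__']

# Allocation without initialization: the object exists but has no fields.
raw = Point.__new__(Point, 0, 0)
print(hasattr(raw, "x"))  # False
```

Calling Point.__new__ directly skips __init__, which is why the second object has no x attribute.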
11.4 tp_alloc and tp_free
Every type object can specify how instances are allocated and freed.
Important type slots:
tp_alloc
tp_free
tp_dealloc
tp_alloc reserves memory for a new object.
tp_free releases memory for the object.
tp_dealloc is the type-specific destructor. It usually releases fields and then calls tp_free.
Simplified shape:
static void
MyObject_dealloc(MyObject *self)
{
    Py_XDECREF(self->value);
    Py_TYPE(self)->tp_free((PyObject *)self);
}
The deallocator is responsible for object-specific cleanup. The free function releases the memory block.
This separation lets different object kinds use different allocation strategies while keeping type-specific cleanup explicit.
11.5 Small Object Allocator
CPython’s small object allocator is commonly called pymalloc.
It is optimized for small memory blocks used by Python objects.
Conceptually:
arena
    large region obtained from system allocator
pool
    fixed-size subdivision inside an arena
block
    one small allocation served to CPython
The hierarchy:
system allocator
↓
arenas
↓
pools
↓
blocks
Small allocations are rounded into size classes. A pool serves blocks of one size class.
This avoids asking the system allocator for every small object.
11.6 Arenas
An arena is a large memory region obtained from the underlying allocator.
Conceptually:
arena
    pool
    pool
    pool
    ...
Arenas let CPython manage many small object allocations in batches.
When CPython needs more memory for small objects, it requests an arena. That arena is divided into pools. Pools are used to serve blocks.
An arena can be returned to the system only when all pools inside it become free. This means memory may remain reserved by CPython even after many objects are destroyed.
That behavior can surprise users:
objects were freed
process RSS did not immediately shrink
This does not always indicate a leak. The memory may be held by the allocator for reuse.
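A toy model can make the arena rule concrete. The sketch below is not CPython's implementation: it assumes objects fill pools and arenas in order, uses made-up values for POOLS_PER_ARENA and objects_per_pool, and counts how many arenas still hold at least one surviving object after most objects are freed.

```python
import random

POOLS_PER_ARENA = 64  # assumed toy value, not CPython's real ratio


def arenas_still_held(num_objects, survivors, objects_per_pool=100, seed=0):
    """Count arenas that cannot be released after freeing most objects.

    Toy model: objects fill pools in order, pools fill arenas in order,
    and an arena is releasable only when none of its objects survive.
    Returns (pinned_arenas, total_arenas).
    """
    rng = random.Random(seed)
    live = set(rng.sample(range(num_objects), survivors))
    objects_per_arena = objects_per_pool * POOLS_PER_ARENA
    pinned = {index // objects_per_arena for index in live}
    total = -(-num_objects // objects_per_arena)  # ceiling division
    return len(pinned), total


held, total = arenas_still_held(num_objects=1_000_000, survivors=1_000)
print(f"{held} of {total} arenas still pinned by 0.1% of the objects")
```

In this model a few scattered survivors are enough to keep nearly every arena alive, which is one way RSS can stay high without a leak.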
11.7 Pools
A pool is a subdivision of an arena.
Each pool serves one block size class at a time.
For example:
pool A
    32-byte blocks
pool B
    64-byte blocks
pool C
    128-byte blocks
When a pool is assigned to a size class, all blocks in that pool have the same size. This makes allocation and free operations simple.
The allocator can maintain lists of pools with available blocks. Allocating a small object often means taking the next available block from a pool.
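The idea of taking the next available block from a pool can be sketched with a small Python class. ToyPool and its byte offsets are hypothetical; the real pool header and free-block links are C structures, and this only models the behavior described above.

```python
class ToyPool:
    """Toy model of a pool: equal-sized blocks handed out from a free list.

    Hypothetical sketch; the real pool header and layout live in C.
    """

    def __init__(self, pool_size, block_size):
        self.block_size = block_size
        # Offsets of every block that fits in the pool, all initially free.
        self.free_offsets = list(range(0, pool_size - block_size + 1, block_size))
        self.free_offsets.reverse()  # so pop() hands out offset 0 first

    def alloc(self):
        """Return the offset of a free block, or None if the pool is full."""
        return self.free_offsets.pop() if self.free_offsets else None

    def free(self, offset):
        """Return a block to the pool; it becomes the next one handed out."""
        self.free_offsets.append(offset)


pool = ToyPool(pool_size=256, block_size=64)
a = pool.alloc()
b = pool.alloc()
pool.free(a)
c = pool.alloc()  # reuses the block that was just freed
print(a, b, c)    # 0 64 0
```

Because every block in the pool has the same size, both operations are constant-time list operations with no searching.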
11.8 Blocks and Size Classes
A block is the memory returned for one allocation request.
Small allocation sizes are rounded up to size classes.
Example concept:
request 37 bytes
    rounded to 40 or 48 byte class depending on allocator rules
request 72 bytes
    rounded to matching size class
request too large
    bypass pymalloc and use larger allocator path
The exact size classes depend on CPython version and build configuration. In recent releases, requests larger than 512 bytes bypass pymalloc.
The important idea:
small requests use fixed-size pools
large requests use another allocator path
Fixed-size pools make allocation faster and reduce fragmentation inside small object workloads.
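The rounding rule can be written as a short function. This is a sketch under stated assumptions: the function name size_class is hypothetical, and the default alignment of 16 bytes and threshold of 512 bytes are illustrative values that vary with CPython version and build.

```python
def size_class(nbytes, alignment=16, threshold=512):
    """Return the block size a small request is rounded up to.

    Returns None for requests outside 1..threshold, which this sketch
    treats as taking the larger allocator path. The defaults are
    assumptions for illustration, not guaranteed CPython values.
    """
    if not 0 < nbytes <= threshold:
        return None
    return (nbytes + alignment - 1) // alignment * alignment


for request in (37, 72, 512, 513):
    # Compare 8-byte and 16-byte alignment for the same request.
    print(request, size_class(request, alignment=8), size_class(request))
```

With 8-byte alignment a 37-byte request lands in the 40-byte class; with 16-byte alignment it lands in the 48-byte class.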
11.9 Free Lists
Some object types use free lists in addition to the general allocator.
A free list caches recently destroyed objects of a specific type so they can be reused quickly.
Common examples across CPython history include frames, tuples of certain sizes, floats, lists, and other internal objects, though exact free-list usage changes by version.
Conceptual flow:
destroy object
    if type-specific free list has room:
        put object memory on free list
    else:
        return memory to allocator
create object
    if free list has cached object:
        reuse it
    else:
        allocate new memory
Free lists trade memory retention for speed.
They can make object allocation much faster in tight loops, but they also mean freed objects may not immediately return memory to the allocator.
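The flow above can be modeled in a few lines of Python. Node and NodeFreeList are hypothetical names, and max_cached is an assumed limit; CPython's real free lists are per-type C arrays or linked lists with their own bounds.

```python
class Node:
    """Hypothetical small object that is created and destroyed often."""

    __slots__ = ("value",)


class NodeFreeList:
    """Toy free list: caches up to max_cached released Node objects."""

    def __init__(self, max_cached=4):
        self.max_cached = max_cached
        self._cache = []

    def acquire(self, value):
        # Reuse a cached object if one is available, else allocate.
        node = self._cache.pop() if self._cache else Node()
        node.value = value
        return node

    def release(self, node):
        # Keep the object for reuse only while the free list has room.
        if len(self._cache) < self.max_cached:
            node.value = None
            self._cache.append(node)
        # Otherwise drop it and let the normal allocator reclaim it.


free_list = NodeFreeList()
first = free_list.acquire("a")
free_list.release(first)
second = free_list.acquire("b")
print(second is first)  # True: the cached object was reused
```

The cache bound is the memory-versus-speed tradeoff in miniature: a larger bound avoids more allocations but keeps more dead objects resident.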
11.10 Interning and Object Reuse
Some objects are reused intentionally.
Examples include:
None
True
False
small integers
some strings
empty tuple
interned identifiers
This reuse reduces allocation pressure and enables faster comparisons in some internal paths.
Example:
a = "name"
b = "name"
Depending on how strings are created, CPython may intern them. Interned strings are useful for identifiers, attribute names, and dictionary keys used internally.
Object reuse is an optimization. Python code should not rely on object identity except for documented singletons such as None, True, False, NotImplemented, and Ellipsis.
Correct:
if value is None:
    ...
Avoid:
if x is 1000:
    ...
The second relies on implementation-specific object reuse behavior. CPython 3.8 and later also emit a SyntaxWarning for identity comparisons against a literal.
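Explicit interning is the one case where identity of equal strings is guaranteed. The sketch below builds two equal strings at run time (the contents are arbitrary) and shows that sys.intern maps both to a single shared object.

```python
import sys

# Build two equal strings at run time so the compiler cannot merge them.
parts = ["allocator", "domain"]
a = "_".join(parts)
b = "_".join(parts)

print(a == b)                          # equal by value
print(sys.intern(a) is sys.intern(b))  # interning yields one shared object
```

Whether a and b are the same object before interning is an implementation detail, so the example deliberately does not test it.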
11.11 Immortal Objects and Allocation
CPython 3.12 and later use immortal objects (PEP 683) for selected runtime-owned objects.
An immortal object is treated as permanently alive. Reference count operations may avoid ordinary lifetime effects for it.
This affects allocation indirectly:
some fundamental objects are allocated once
their lifetime is the runtime lifetime
normal deallocation never frees them
Examples may include singleton-like objects and heavily reused internal constants.
For extension authors, the rule stays unchanged:
use Py_INCREF and Py_DECREF
do not manually inspect or change ob_refcnt
do not assume ordinary deallocation for every object
Correct code works whether an object is mortal or immortal.
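The effect can be observed, with caution, from Python. The sketch below assumes CPython and compares the reported reference count of None before and after creating many references to it; on interpreters where None is immortal the difference is expected to be zero, while older versions report the extra references.

```python
import sys

before = sys.getrefcount(None)
holders = [None] * 10_000   # create many additional references to None
after = sys.getrefcount(None)

# On interpreters with immortal objects the difference is expected to be 0;
# on older ones it reflects the new references.
print(sys.version_info[:2], after - before)
```

The absolute value returned for an immortal object is an implementation detail and should not be interpreted as a real count.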
11.12 Memory Fragmentation
Memory fragmentation happens when free memory exists but is split into pieces that cannot satisfy larger allocation requests or cannot be returned to the operating system cleanly.
CPython can experience fragmentation at several levels:
inside pymalloc pools
inside arenas
inside the system allocator
inside type-specific free lists
inside long-lived Python containers
Example pattern:
items = []
for i in range(1_000_000):
    items.append(bytearray(100))
del items
The Python objects may be destroyed, but memory behavior depends on object sizes, allocator paths, arena fullness, free lists, and system allocator behavior.
RSS may stay high because freed blocks remain in partially used arenas or in the system allocator's heap, where they are kept for reuse.
11.13 tracemalloc
tracemalloc traces Python memory allocations.
Example:
import tracemalloc
tracemalloc.start()
data = [str(i) for i in range(100_000)]
current, peak = tracemalloc.get_traced_memory()
print(current, peak)
tracemalloc.stop()
It can show where memory was allocated:
import tracemalloc
tracemalloc.start()
data = [bytes(1024) for _ in range(1000)]
snapshot = tracemalloc.take_snapshot()
stats = snapshot.statistics("lineno")
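for stat in stats[:10]:
    print(stat)
tracemalloc is useful for Python-level allocation debugging. It traces allocations made through Python's allocator domains, so memory that a C library obtains directly from the system allocator does not appear in its reports.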
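Comparing two snapshots is often more useful than reading a single one, because it isolates the growth caused by one piece of work. The sketch below uses Snapshot.compare_to; the workload and the variable name cache are illustrative only.

```python
import tracemalloc

tracemalloc.start()
before = tracemalloc.take_snapshot()

# Workload under investigation (illustrative only).
cache = [bytes(2048) for _ in range(500)]

after = tracemalloc.take_snapshot()
tracemalloc.stop()

# Each StatisticDiff reports how much a source line grew between snapshots.
diffs = after.compare_to(before, "lineno")
for diff in diffs[:3]:
    print(diff)
```

The entries are sorted by the size of the change, so the line that built the list should appear first.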
11.14 Debug Allocator Hooks
CPython supports debug memory hooks that help find allocator misuse.
Debug builds and debug allocator modes can detect problems such as:
writing before allocated memory
writing after allocated memory
using memory after free
freeing memory with wrong allocator family
double free
uninitialized memory patterns
These tools are important when developing CPython itself or writing C extensions.
Debug hooks often add padding bytes around allocations and fill memory with recognizable byte patterns.
This makes memory corruption easier to catch near the source.
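On a regular release build, the debug hooks can be switched on for a single run through the PYTHONMALLOC environment variable. The sketch below starts a child interpreter with PYTHONMALLOC=debug; the child code is a placeholder workload, and a real session would run the extension's test suite instead.

```python
import os
import subprocess
import sys

# Run a child interpreter with the debug hooks installed on the allocators.
env = dict(os.environ, PYTHONMALLOC="debug")
child_code = "data = [bytes(64) for _ in range(1000)]; print('ok')"
result = subprocess.run(
    [sys.executable, "-c", child_code],
    env=env,
    capture_output=True,
    text=True,
)
print(result.returncode, result.stdout.strip())
```

Well-behaved code runs normally under the hooks, only slower; a buffer overrun or mismatched free in C code would instead abort the child with a diagnostic on stderr.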
11.15 Allocator Family Discipline
Allocator family discipline is strict.
Correct pairs:
| Allocate | Free |
|---|---|
| PyMem_RawMalloc | PyMem_RawFree |
| PyMem_Malloc | PyMem_Free |
| PyObject_Malloc | PyObject_Free |
| malloc | free |
Incorrect pairings are bugs:
void *p = PyObject_Malloc(64);
PyMem_Free(p); /* wrong */
Also wrong:
void *p = malloc(64);
PyObject_Free(p); /* wrong */
The allocator that creates the memory must be the allocator that frees it.
11.16 Allocating Buffers in Extension Code
For non-object buffers, prefer the appropriate Python allocator family.
Example:
typedef struct {
    PyObject_HEAD
    char *data;
    Py_ssize_t size;
} BufferObject;
Allocation:
self->data = PyMem_Malloc(size);
if (self->data == NULL) {
    PyErr_NoMemory();
    return -1;
}
self->size = size;
Deallocation:
static void
Buffer_dealloc(BufferObject *self)
{
    PyMem_Free(self->data);
    Py_TYPE(self)->tp_free((PyObject *)self);
}
The buffer uses PyMem_*. The object itself uses the type’s tp_alloc and tp_free.
Keep these lifetimes separate.
11.17 Object Memory and Contained References
Allocating object memory does not automatically manage contained Python references.
Example:
typedef struct {
    PyObject_HEAD
    PyObject *value;
} BoxObject;
Allocation reserves space for the value pointer, but the field does not hold a valid owned reference yet.
Initialization must set it safely:
self->value = NULL;
Then assign owned references with Py_INCREF or by receiving a stolen/new reference according to the API contract.
Deallocation must release the owned reference:
Py_XDECREF(self->value);
Memory allocation and reference ownership are related but separate systems.
11.18 Allocation Failure
Allocators can fail.
C extension code must check allocation results.
void *p = PyMem_Malloc(size);
if (p == NULL) {
    PyErr_NoMemory();
    return NULL;
}
PyMem_Malloc does not set an exception on failure, so the caller must set one. Object creation functions such as PyLong_FromLong set an exception before returning NULL, so the caller only has to propagate it.
Correct pattern:
PyObject *obj = PyLong_FromLong(42);
if (obj == NULL) {
    return NULL;
}
Never assume allocation succeeds. Python code can run under memory pressure, embedded environments, constrained containers, or fuzzing tests.
11.19 Reallocation
PyMem_Realloc changes the size of a memory block.
Pattern:
char *new_data = PyMem_Realloc(self->data, new_size);
if (new_data == NULL) {
    PyErr_NoMemory();
    return -1;
}
self->data = new_data;
self->size = new_size;
Do not overwrite the original pointer before checking success.
Incorrect:
self->data = PyMem_Realloc(self->data, new_size);
if (self->data == NULL) {
    return -1; /* old pointer lost */
}
If reallocation fails, the original allocation remains valid. Losing that pointer leaks memory.
11.20 Over-Allocation
Some containers over-allocate to avoid reallocating on every append.
Lists are the standard example.
xs = []
for i in range(100):
    xs.append(i)
The list does not allocate exactly one new slot for each append. It grows capacity in larger steps.
Conceptually:
length = number of used entries
allocated = number of available slots
When length reaches allocated capacity, CPython grows the item array.
This gives amortized efficient append behavior.
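The growth steps can be observed indirectly through sys.getsizeof, which includes the list's item array. The sketch below records the lengths at which the reported size changes; the exact step pattern is an implementation detail and differs between versions.

```python
import sys

xs = []
last_size = sys.getsizeof(xs)
growth_points = []  # lengths at which the item array was reallocated

for i in range(100):
    xs.append(i)
    size = sys.getsizeof(xs)
    if size != last_size:
        growth_points.append(len(xs))
        last_size = size

print(growth_points)
print(f"{len(growth_points)} reallocations for {len(xs)} appends")
```

The gaps between growth points widen as the list gets longer, which is what keeps the average cost per append constant.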
Tradeoff:
fewer reallocations
some unused spare capacity
11.21 Shrinking Containers
Containers may not immediately return memory when they shrink.
Example:
xs = list(range(1_000_000))
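del xs[:900_000]
The logical length decreases, but the item array is not necessarily resized on every deletion. In current CPython, a list reallocates its item array on shrinking only once the length falls below half of the allocated capacity.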
This avoids costly reallocations when a list shrinks and grows repeatedly.
For memory-sensitive code, creating a new compact container can sometimes help:
xs = xs[:]
or:
xs = list(xs)
But measure first. Copying can be expensive.
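One way to measure is to compare sys.getsizeof for the original list and a copy. The sketch below assumes CPython, where a slice copy is allocated for exactly its current length while a list grown by appends may carry spare slots.

```python
import sys

xs = []
for i in range(1000):
    xs.append(i)          # grown by appends, so it may carry spare slots

compact = xs[:]           # a slice copy is sized for its current length

print(sys.getsizeof(xs), sys.getsizeof(compact))
```

If the two numbers are close, copying buys little and costs a full pass over the data.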
11.22 Memory Views and Buffers
Some objects expose memory through the buffer protocol.
Examples:
bytes
bytearray
array.array
memoryview
mmap objects
NumPy arrays
some extension objects
A buffer exporter may expose raw memory to another object. That creates lifetime constraints.
Example:
b = bytearray(b"hello")
v = memoryview(b)
While a memoryview exists, operations that would resize the underlying bytearray raise BufferError.
At the C level, buffer exporters must ensure memory stays valid while consumers hold buffer views.
This is allocator-related because memory cannot be freed or moved while an external view depends on it.
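The constraint is easy to reproduce from Python. The sketch below tries to grow a bytearray while a memoryview is exported, then again after the view is released; the variable names are illustrative.

```python
b = bytearray(b"hello")
view = memoryview(b)

try:
    b.extend(b" world")       # would need to resize (and maybe move) the buffer
    resized_while_exported = True
except BufferError:
    resized_while_exported = False

view.release()                # drop the export
b.extend(b" world")           # resizing is allowed again
print(resized_while_exported, bytes(b))
```

The bytearray refuses the resize rather than risk leaving the view pointing at freed or moved memory.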
11.23 Non-Moving Allocator Consequences
CPython’s object pointers are stable. Objects are generally not moved by a compacting garbage collector.
Consequences:
PyObject * pointers remain valid while references are owned
C extensions can store object pointers
id(obj) can be address-like in CPython
memory cannot be compacted by moving live objects
fragmentation can accumulate
The non-moving design is central to C API compatibility.
It also explains why CPython uses arenas, pools, free lists, and careful allocator layering rather than a compacting heap.
11.24 Allocator Customization
CPython allows embedders and specialized environments to customize allocators.
This is useful for:
embedding Python in another application
sandboxing
memory accounting
debugging
custom allocation strategies
instrumentation
constrained runtimes
Allocator customization must happen carefully and usually early in runtime initialization.
The replacement allocator must obey CPython’s expectations for each allocator domain.
Bad allocator hooks can corrupt the interpreter.
11.25 Memory Accounting Is Hard
Understanding Python memory usage is difficult because several layers interact.
A single Python object may involve:
object header
object payload
auxiliary arrays
referenced objects
allocator padding
pool overhead
arena overhead
free-list retention
system allocator metadata
native library allocations
Example:
xs = ["abc" for _ in range(1000)]
Memory includes:
list object
list item array
1000 references in the array
string objects
string character data
allocator overhead
possibly interned or reused objects
sys.getsizeof(xs) reports only the size of the list object and its immediate storage, not the full transitive graph.
11.26 sys.getsizeof
sys.getsizeof returns the size of one object as reported by that object.
import sys
xs = [1, 2, 3]
print(sys.getsizeof(xs))
It does not recursively include objects referenced by the object.
Example:
import sys
xs = [[1], [2], [3]]
print(sys.getsizeof(xs))
This includes the outer list’s storage, not the inner lists and their contents.
A recursive size function must traverse references carefully and avoid double-counting shared objects.
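A minimal sketch of such a function follows. The name deep_sizeof is hypothetical, and the traversal is deliberately limited to the built-in containers dict, list, tuple, set, and frozenset; it ignores instance attributes, allocator overhead, and memory held by C extensions.

```python
import sys


def deep_sizeof(obj, _seen=None):
    """Sum sys.getsizeof over obj and the built-in containers it holds.

    Sketch only: follows dict, list, tuple, set, and frozenset contents,
    and counts each object once by tracking id() values.
    """
    seen = set() if _seen is None else _seen
    if id(obj) in seen:
        return 0                      # shared object already counted
    seen.add(id(obj))

    total = sys.getsizeof(obj)
    if isinstance(obj, dict):
        children = list(obj.keys()) + list(obj.values())
    elif isinstance(obj, (list, tuple, set, frozenset)):
        children = list(obj)
    else:
        children = []
    return total + sum(deep_sizeof(child, seen) for child in children)


shared = [0] * 100
print(sys.getsizeof([shared, shared]), deep_sizeof([shared, shared]))
```

Because the inner list is referenced twice but counted once, the deep size stays well below what two independent copies would report.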
11.27 Common Allocation Patterns
Common CPython allocation patterns include:
| Pattern | Example | Allocation behavior |
|---|---|---|
| Many small tuples | parser, AST work, loops | small-object allocator and free lists matter |
| Large bytes objects | I/O, serialization | may bypass small-object allocator |
| Growing lists | append-heavy code | over-allocation matters |
| Large dicts | indexing, JSON, globals | hash table growth matters |
| Frames | function calls | frame allocation and reuse matter |
| Exceptions | error-heavy paths | traceback and frame retention matter |
| Strings | identifiers, parsing | Unicode layout and interning matter |
Performance work often starts by finding which pattern dominates.
11.28 C Extension Allocation Rules
Practical rules for extension authors:
| Situation | Rule |
|---|---|
| Allocating Python object instance | Use type allocation machinery |
| Freeing Python object instance | Use tp_free from deallocator |
| Allocating auxiliary runtime memory | Use PyMem_* or documented family |
| Allocating object memory manually | Use PyObject_* only when appropriate |
| Returning Python object | Return owned reference |
| Storing Python object field | Own a reference |
| Reallocating memory | Keep old pointer until success |
| Handling allocation failure | Set or propagate MemoryError |
| Mixing allocators | Do not do it |
Allocator bugs are often severe. They can appear as crashes far away from the actual mistake.
11.29 Mental Model
Use this model:
Python object allocation
    type object chooses allocation path
    object memory contains common header
    object-specific fields are initialized
    references are owned explicitly
    deallocator releases references
    tp_free releases memory
Small-object allocation
    arenas contain pools
    pools contain fixed-size blocks
    small requests are served quickly
    memory may be retained for reuse
Memory management in CPython is a layered system:
reference counting decides when an object dies
cyclic GC finds unreachable cycles
deallocator releases object-owned resources
allocator reuses or frees memory blocks
system allocator manages process heap pages
operating system manages virtual memory
Each layer answers a different question.
11.30 Summary
CPython’s memory allocator system supports fast allocation for the object-heavy workload of Python programs. Small objects are commonly served through pymalloc, which organizes memory into arenas, pools, and blocks. Type-specific free lists and object reuse further reduce allocation cost for common objects.
For C extension authors, the critical rules are allocator family discipline, correct handling of allocation failure, safe reallocation, and clear separation between memory ownership and reference ownership.
Memory freed at the Python object level may stay reserved inside CPython or the platform allocator. High RSS after object deletion does not automatically mean a leak. CPython often keeps memory available for reuse.