
13. Built-in Object Implementations

A survey of the C-level implementations behind Python's built-in types, from numbers and containers to functions, frames, and exceptions.

Built-in objects are the concrete data structures behind Python’s core types. They are ordinary Python objects in the sense that they have identities, types, reference counts, attributes where supported, and behavior defined by type slots. They are special because their storage and operations are implemented directly in C.

This chapter gives a broad map. Later chapters go deeper into strings, lists, tuples, dictionaries, sets, numbers, functions, modules, and frames.

13.1 Built-ins Are Type Objects

A built-in type such as list, dict, or int is itself a Python object.

print(type(list))     # <class 'type'>
print(type(dict))     # <class 'type'>
print(type(int))      # <class 'type'>

An instance points to its type object.

xs = [1, 2, 3]
print(type(xs))       # <class 'list'>

At the C level:

xs ---> PyListObject
          ob_refcnt
          ob_type ----> PyList_Type
          ob_size
          ob_item
          allocated

The instance stores state. The type object stores behavior.

13.2 Why Built-ins Are Implemented in C

Built-in types sit on the hottest paths of Python execution.

Common operations include:

integer arithmetic
string hashing
attribute lookup
dictionary lookup
list append
tuple creation
function calls
iteration
exception creation

If these operations were implemented as ordinary Python code, the interpreter would have to execute more bytecode to perform its own basic operations. CPython avoids this by implementing core types in C.

For example, a Python dictionary is used for:

module globals
class namespaces
object attributes
keyword arguments
import caches
annotations
many user data structures

A slow dictionary would make the whole interpreter slow.

13.3 Common Built-in Object Pattern

Most built-in object implementations follow this shape:

typedef struct {
    PyObject_HEAD
    /* type-specific fields */
} SomeObject;

or, for variable-size objects:

typedef struct {
    PyObject_VAR_HEAD
    /* type-specific fields */
} SomeVarObject;

The type object then provides slots:

PyTypeObject Some_Type = {
    .tp_name = "...",
    .tp_basicsize = sizeof(SomeObject),
    .tp_dealloc = ...,
    .tp_repr = ...,
    .tp_as_number = ...,
    .tp_as_sequence = ...,
    .tp_as_mapping = ...,
    .tp_methods = ...,
};

This pattern repeats across CPython.
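The fixed-size versus variable-size split can be observed from Python, as a rough sketch, through the `__basicsize__` and `__itemsize__` attributes of a type, which mirror `tp_basicsize` and `tp_itemsize`. The exact byte counts depend on the build, so the example only checks whether the item size is zero.

```python
# __itemsize__ is 0 when instances have a fixed-size layout and positive
# when each instance carries a variable number of inline items.
for tp in (float, complex, int, tuple):
    kind = "variable-size" if tp.__itemsize__ else "fixed-size"
    print(f"{tp.__name__:8} basicsize={tp.__basicsize__:3} "
          f"itemsize={tp.__itemsize__} ({kind})")
```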

13.4 Object Families

The main built-in object families are:

| Family | Examples | Main role |
| --- | --- | --- |
| Numeric objects | int, float, complex, bool | Arithmetic and numeric protocols |
| Text and binary objects | str, bytes, bytearray, memoryview | Text, binary data, buffers |
| Sequence objects | list, tuple, range | Ordered collections |
| Mapping objects | dict, mappingproxy | Key-value storage |
| Set objects | set, frozenset | Hash-based membership |
| Callable objects | function, method, builtin function | Invocation |
| Runtime objects | module, frame, code, traceback | Execution machinery |
| Descriptor objects | property, getset, member, method descriptor | Attribute behavior |
| Iterator objects | list iterator, dict iterator, generator | Iteration |
| Exception objects | BaseException and subclasses | Error propagation |

Each family uses the same object model but optimizes for different operations.

13.5 Integer Objects

Python int values are arbitrary precision.

x = 10**100
print(x)

CPython stores integers as PyLongObject, a variable-size object. It does not use one fixed machine integer for all Python integers.

Conceptually:

PyLongObject
    PyVarObject header
        ob_size = number of internal digits, with sign encoded
    digits[]

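Small integers use few internal digits. Large integers use many digits. The sketch above shows the classic layout; CPython 3.12 and later keep the digit count and sign in a separate tag field (lv_tag) instead of ob_size, but the idea is the same.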

This explains why Python integers do not overflow like C long in ordinary arithmetic.

x = 2**1000
print(x * x)

The cost grows with integer size. Operations on small integers are fast. Operations on very large integers require multi-precision arithmetic.
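A quick way to see the variable-size layout is to watch `sys.getsizeof` as the magnitude grows. This is only a sketch: the exact byte counts are build-specific, so the comments avoid quoting them.

```python
import sys

values = (1, 2**30, 2**60, 2**1000)

# Object size grows as the value needs more internal digits.
sizes = [sys.getsizeof(v) for v in values]
for v, size in zip(values, sizes):
    print(f"{v.bit_length():5} bits -> {size} bytes")
```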

13.6 Boolean Objects

bool is a subclass of int.

print(isinstance(True, int))    # True
print(True + True)              # 2

There are exactly two boolean singleton objects:

True
False

At the C level, these are runtime-owned objects, and bool() always returns one of them rather than creating a new instance. Python code should normally test the truth of a value directly instead of comparing it against True or False.

if condition:
    ...

C code usually returns booleans with macros such as:

Py_RETURN_TRUE;
Py_RETURN_FALSE;
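From Python, the singleton behavior can be checked directly. The class name `MyBool` below is purely illustrative.

```python
# bool() never allocates: it hands back one of the two singletons.
print(bool(1) is True)        # True
print(bool([]) is False)      # True
print(bool.__mro__ == (bool, int, object))   # True

# The type cannot be subclassed, so no third boolean can appear.
subclass_blocked = False
try:
    class MyBool(bool):
        pass
except TypeError as exc:
    subclass_blocked = True
    print("TypeError:", exc)
```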

13.7 Floating-Point Objects

Python float is usually implemented as a C double.

Conceptually:

PyFloatObject
    PyObject header
    double value

A float object is fixed-size.

x = 1.5
y = 2.25
print(x + y)

Float arithmetic follows the platform’s floating-point behavior, generally IEEE 754 double precision on common systems.

A float stores a binary approximation of a real number. Most decimal fractions, such as 0.1, cannot be represented exactly; only values whose denominator is a power of two, such as 0.5, can.

print(0.1 + 0.2)

The surprising result comes from binary floating-point representation, not from Python-specific arithmetic.
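The standard library can make the stored value visible. The sketch below uses `decimal.Decimal` to print the exact binary value behind the literal 0.1, and `math.isclose` as one common way to compare floats with a tolerance.

```python
import math
from decimal import Decimal

# Decimal(float) shows the exact binary value that the float stores.
print(Decimal(0.1))
print(Decimal(0.1) == Decimal("0.1"))    # False

print(0.1 + 0.2 == 0.3)                  # False
print(math.isclose(0.1 + 0.2, 0.3))      # True

# Fractions whose denominator is a power of two are stored exactly.
print((0.5).as_integer_ratio())          # (1, 2)
```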

13.8 Complex Objects

Python complex stores two floating-point values:

PyComplexObject
    PyObject header
    real double
    imag double

Example:

z = 1.5 + 2.0j
print(z.real)
print(z.imag)

Complex numbers participate in numeric slots. They support arithmetic, but they do not support ordering comparisons such as <.

1 + 2j < 3 + 4j     # TypeError

13.9 String Objects

Python str stores Unicode text.

A string is immutable.

s = "hello"
t = s.upper()

upper() creates another string. It does not mutate s.

CPython’s Unicode implementation is optimized for compact storage. The internal representation can use different element widths depending on the largest code point in the string.

Conceptually:

PyUnicodeObject
    object header
    length
    hash cache
    kind
    compact/ascii flags
    character data

Important string optimizations include:

cached hash value
compact layout
ASCII fast path
interning for selected strings
specialized Unicode operations

Strings are central because they are used for identifiers, attribute names, dictionary keys, source code, file paths, protocol data, and user text.
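The variable element width can be measured, at least on CPython, by comparing the sizes of two strings of the same kind so that the header cancels out. The helper name `bytes_per_char` is made up for this sketch.

```python
import sys

def bytes_per_char(ch, n=1000):
    # Compare two lengths so the fixed header size cancels out.
    return (sys.getsizeof(ch * (2 * n)) - sys.getsizeof(ch * n)) / n

ascii_width = bytes_per_char("a")            # ASCII text
bmp_width = bytes_per_char("\u0394")         # Greek capital delta
astral_width = bytes_per_char("\U0001F600")  # code point above U+FFFF

print(ascii_width, bmp_width, astral_width)  # 1.0 2.0 4.0 on CPython
```

A single wide character forces the whole string into the wider representation, which is why the storage width follows the largest code point.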

13.10 Bytes and Bytearray

bytes is immutable binary data.

b = b"hello"

bytearray is mutable binary data.

buf = bytearray(b"hello")
buf[0] = ord("H")

Conceptual difference:

| Type | Mutable | Main use |
| --- | --- | --- |
| bytes | No | Immutable binary data |
| bytearray | Yes | Mutable binary buffer |

Both are sequence-like objects over integers in the range 0 to 255.

b = b"abc"
print(b[0])      # 97

13.11 List Objects

A list is a mutable sequence.

xs = [1, 2, 3]
xs.append(4)

A CPython list object stores a pointer to a separately allocated array of object references.

Conceptually:

PyListObject
    PyVarObject header
        ob_size = logical length
    ob_item ----> array of PyObject *
    allocated = capacity

The array stores references, not inline object data.

list
    ob_item[0] ---> int object 1
    ob_item[1] ---> int object 2
    ob_item[2] ---> int object 3

Lists over-allocate when growing. This makes repeated append efficient on average.
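Over-allocation can be observed indirectly: `sys.getsizeof` changes only when the backing array is reallocated, so counting size changes during a run of appends gives a rough picture. The growth pattern itself is an implementation detail, so the sketch only counts reallocations.

```python
import sys

xs = []
last = sys.getsizeof(xs)
resizes = 0
for i in range(1000):
    xs.append(i)
    size = sys.getsizeof(xs)
    if size != last:      # the backing array was reallocated
        resizes += 1
        last = size

print(f"1000 appends, {resizes} reallocations")
```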

13.12 Tuple Objects

A tuple is an immutable sequence.

t = (1, 2, 3)

A tuple stores item references inline in the tuple allocation.

Conceptually:

PyTupleObject
    PyVarObject header
        ob_size = length
    ob_item[0]
    ob_item[1]
    ob_item[2]

A tuple cannot change length after creation. This makes inline storage practical.

Tuple immutability refers to the tuple’s references, not necessarily the deep mutability of contained objects.

t = ([],)
t[0].append(1)
print(t)         # ([1],)

The tuple still points to the same list. The list changed.

13.13 Dict Objects

A dictionary is a hash table mapping keys to values.

d = {"name": "Ada", "age": 36}

Dictionaries are used throughout CPython, not only in user code.

They store:

module globals
class namespaces
instance attributes
keyword arguments
import caches

A dict lookup roughly needs:

hash the key
find a matching table slot
compare keys if needed
return associated value
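The lookup steps above can be sketched with a toy open-addressing table. This is not CPython's algorithm: `ToyDict` is a hypothetical class that uses plain linear probing, does not preserve insertion order, and does not support deletion. It only illustrates hashing, slot search, key comparison, and resizing.

```python
class ToyDict:
    """Open-addressing table with linear probing (illustrative only)."""

    def __init__(self, capacity=8):
        self.slots = [None] * capacity   # each slot: None or (hash, key, value)
        self.used = 0

    def _probe(self, key, h):
        i = h % len(self.slots)
        while True:
            slot = self.slots[i]
            # Stop at an empty slot or at a slot holding an equal key.
            if slot is None or (slot[0] == h and (slot[1] is key or slot[1] == key)):
                return i
            i = (i + 1) % len(self.slots)

    def _resize(self):
        old = [s for s in self.slots if s is not None]
        self.slots = [None] * (len(self.slots) * 2)
        for h, key, value in old:
            self.slots[self._probe(key, h)] = (h, key, value)

    def __setitem__(self, key, value):
        if (self.used + 1) * 3 > len(self.slots) * 2:   # keep load below 2/3
            self._resize()
        h = hash(key)
        i = self._probe(key, h)
        if self.slots[i] is None:
            self.used += 1
        self.slots[i] = (h, key, value)

    def __getitem__(self, key):
        h = hash(key)
        slot = self.slots[self._probe(key, h)]
        if slot is None:
            raise KeyError(key)
        return slot[2]


d = ToyDict()
for i in range(20):
    d[f"k{i}"] = i
d["k7"] = 700          # overwriting reuses the existing slot
print(d["k7"], d["k19"], d.used)   # 700 19 20
```

Comparing the stored hash before comparing keys avoids most expensive equality checks, which is the same idea a real hash table relies on.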

Important properties:

average O(1) lookup
insertion order preservation
hash-based key storage
resize when table becomes too full
specialized layouts for object attributes

Dictionaries are among the most performance-sensitive objects in CPython.

13.14 Set and Frozenset Objects

A set is a hash table of keys without values.

seen = set()
seen.add("x")

A frozenset is immutable.

s = frozenset(["a", "b"])

Sets are optimized for membership tests:

if item in seen:
    ...

The internal structure is similar in spirit to dict, but stores only elements.

Set operations include:

union
intersection
difference
symmetric difference
subset testing
membership testing

A frozenset is hashable, since its elements must themselves be hashable, so it can be used as a dictionary key or set element.
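A short illustration of the difference in hashability, using made-up permission names:

```python
# A frozenset works as a dictionary key; element order does not matter.
groups = {frozenset({"read", "write"}): "editor"}
role = groups[frozenset({"write", "read"})]
print(role)                      # editor

# A mutable set is unhashable and is rejected as a key.
set_rejected = False
try:
    {set(): "x"}
except TypeError as exc:
    set_rejected = True
    print("TypeError:", exc)
```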

13.15 Range Objects

A range represents an arithmetic progression without storing every element.

r = range(0, 1_000_000, 2)

Conceptually:

range object
    start
    stop
    step
    length

The object is compact even for huge ranges.

import sys

print(sys.getsizeof(range(10)))
print(sys.getsizeof(range(10**12)))

Both are small, and the same size, because elements are computed on demand from start, stop, and step rather than stored.
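Because a range knows only its start, stop, and step, operations such as length, indexing, and integer membership can be answered arithmetically. A small sketch:

```python
r = range(0, 1_000_000, 2)

print(len(r))          # 500000
print(r[10])           # 20
print(r[-1])           # 999998
print(999_998 in r)    # True
print(7 in r)          # False
```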

13.16 Function Objects

A Python function object wraps executable code and runtime context.

def add(a, b):
    return a + b

A function object contains references to:

code object
globals dictionary
defaults
keyword defaults
closure cells
annotations
qualified name
module name

Conceptually:

PyFunctionObject
    code
    globals
    defaults
    kwdefaults
    closure
    annotations
    name
    qualname

The code object contains bytecode. The function object provides the environment needed to execute that bytecode.
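Most of these references are exposed as function attributes. The function names `make_scaler` and `scale` below are invented for the sketch.

```python
def make_scaler(factor):
    def scale(x, offset=0, *, clamp=None):
        y = x * factor + offset
        return y if clamp is None else min(y, clamp)
    return scale

double = make_scaler(2)

print(double.__code__.co_name)                # scale
print(double.__defaults__)                    # (0,)
print(double.__kwdefaults__)                  # {'clamp': None}
print(double.__closure__[0].cell_contents)    # 2
print(double.__globals__ is globals())        # True
```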

13.17 Code Objects

A code object is compiled executable metadata.

It contains:

bytecode
constants
names
local variable names
free variables
cell variables
stack size
flags
line table
exception table
filename
function name

Example:

def f(x):
    return x + 1

code = f.__code__
print(code.co_consts)
print(code.co_varnames)

Code objects are immutable. They can be shared by multiple function objects.
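Sharing is easy to observe with a closure factory: every call produces a new function object, but all of them point at the same code object. The names `make_adder` and `add` are illustrative.

```python
def make_adder(n):
    def add(x):
        return x + n
    return add

add1, add2 = make_adder(1), make_adder(2)

print(add1 is add2)                      # False: two function objects
print(add1.__code__ is add2.__code__)    # True: one shared code object

# Code object fields are read-only.
readonly = False
try:
    add1.__code__.co_consts = ()
except AttributeError as exc:
    readonly = True
    print("AttributeError:", exc)
```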

13.18 Module Objects

A module object represents an imported module.

import math
print(math)

A module mostly contains a namespace dictionary.

Conceptually:

module object
    name
    dict
    spec
    loader
    package
    file

The module dictionary stores global variables defined by the module.

import math
print(math.__dict__["sqrt"])

Importing a module creates or retrieves a module object and stores it in sys.modules.
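The relationship between the module object, its dictionary, and sys.modules can be checked in a few lines. The alias `math_again` is only there to show that a second import returns the cached object.

```python
import sys
import math

print(sys.modules["math"] is math)          # True
print(vars(math) is math.__dict__)          # True
print(math.sqrt is math.__dict__["sqrt"])   # True

import math as math_again                   # served from sys.modules
print(math_again is math)                   # True
```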

13.19 Class and Instance Objects

Classes are type objects. Instances are objects whose type pointer points to the class.

class User:
    pass

u = User()

Conceptually:

User
    type object
    attributes and methods
    base classes
    MRO

u
    instance object
    ob_type ---> User
    instance dictionary or slots

Ordinary instances usually store attributes in a dictionary.

u.name = "Ada"

With __slots__, instances can store selected fields in fixed offsets instead of a dictionary.

class Point:
    __slots__ = ("x", "y")

This reduces memory per instance and can speed some attribute access patterns.
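The difference shows up in whether an instance has a `__dict__` at all. In this sketch the `Point` class gets an `__init__` that the original example does not have, purely so the slots hold values.

```python
class User:
    pass

class Point:
    __slots__ = ("x", "y")

    def __init__(self, x, y):
        self.x, self.y = x, y

u, p = User(), Point(1, 2)
u.name = "Ada"
print(u.__dict__)                 # {'name': 'Ada'}
print(hasattr(p, "__dict__"))     # False

# Without a per-instance dictionary, unknown attributes are rejected.
rejected = False
try:
    p.z = 3
except AttributeError as exc:
    rejected = True
    print("AttributeError:", exc)
```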

13.20 Method Objects

When a function is accessed through an instance, Python creates a bound method object.

class C:
    def f(self):
        return 1

c = C()
m = c.f

The bound method stores:

function
self object

Conceptually:

bound method
    __func__ ---> C.f
    __self__ ---> c

Calling m() passes c as the first argument.

This is descriptor behavior. The function type implements __get__ (the tp_descr_get slot at the C level), and that slot performs the binding.
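The binding step can be reproduced by hand by calling `__get__` on the plain function stored in the class dictionary. The variable `manual` is just a name for the hand-built bound method.

```python
class C:
    def f(self):
        return 1

c = C()
m = c.f

print(m.__func__ is C.__dict__["f"])   # True
print(m.__self__ is c)                 # True

# Attribute access is equivalent to calling the function's __get__.
manual = C.__dict__["f"].__get__(c, C)
print(manual == m)                     # True
print(c.f is c.f)                      # False: each access binds a new object
print(manual())                        # 1
```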

13.21 Built-in Function and Method Objects

Some callables are implemented directly in C.

Examples:

len
print
dict.get
list.append

These are built-in function or method objects. They wrap C function pointers and metadata.

They are faster than equivalent Python-level functions because they avoid executing Python bytecode for the operation itself.

A built-in method such as list.append still receives Python objects and follows reference ownership rules internally.
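The `types` module gives names to these object kinds, which makes the distinction from Python-level functions easy to check. The variable `bound` is illustrative.

```python
import types

items = []
bound = items.append          # built-in method bound to a specific list

print(isinstance(len, types.BuiltinFunctionType))     # True
print(isinstance(bound, types.BuiltinMethodType))     # True
print(bound.__self__ is items)                        # True
print(hasattr(len, "__code__"))                       # False: no bytecode
```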

13.22 Iterator Objects

Iterator objects implement __iter__ and __next__.

it = iter([1, 2, 3])
print(next(it))

A list iterator stores:

reference to list
current index

A dict iterator stores:

reference to dict
iteration position
version or mutation state

Generators are also iterators, but they are more complex because they contain suspended execution frames.
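Both kinds of stored state can be observed from Python. In this sketch, a dict iterator notices that the dictionary changed size, and a list iterator reports how many items remain based on its stored index.

```python
d = {"a": 1, "b": 2}
it = iter(d)
next(it)
d["c"] = 3                      # mutate while an iterator is live
message = None
try:
    next(it)
except RuntimeError as exc:
    message = str(exc)
print(message)                  # dictionary changed size during iteration

xs = [1, 2, 3]
list_it = iter(xs)
next(list_it)
remaining = list_it.__length_hint__()
print(remaining)                # 2
```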

13.23 Generator Objects

A generator object represents a suspended function execution.

def count():
    yield 1
    yield 2

Calling count() does not run the function body immediately. It creates a generator object.

g = count()

The generator stores execution state:

code or frame state
instruction position
locals
evaluation stack
exception state
closed/running state

Each next(g) resumes execution until the next yield or return.

Generators connect the object model with the interpreter frame model.
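The `inspect` module exposes the suspended state. The generator below is a slightly extended version of `count` with a local variable named `total`, added here only so there is something to inspect.

```python
import inspect

def count():
    total = 0
    for i in (1, 2):
        total += i
        yield i

g = count()
states = [inspect.getgeneratorstate(g)]        # before the body has started
first = next(g)
states.append(inspect.getgeneratorstate(g))    # paused at the first yield
saved_locals = dict(inspect.getgeneratorlocals(g))
rest = list(g)
states.append(inspect.getgeneratorstate(g))    # body has finished

print(states)         # ['GEN_CREATED', 'GEN_SUSPENDED', 'GEN_CLOSED']
print(saved_locals)   # {'total': 1, 'i': 1}
print(first, rest)    # 1 [2]
```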

13.24 Frame Objects

A frame object represents an executing or suspended block of code.

Frames contain:

code object
globals
builtins
locals
value stack
instruction pointer
exception state
previous frame link where exposed

Frames are created for function calls, module execution, class body execution, generators, and coroutines. Tracebacks do not create frames; they hold references to existing ones.

Frame objects are important for:

debuggers
profilers
trace functions
exceptions
inspect module
generators and coroutines

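Materializing a full frame object is comparatively expensive, so CPython 3.11 and later run most calls on lightweight internal frames and create the Python-visible frame object only when something asks for it, such as a debugger, a traceback, or sys._getframe().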
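A frame object can be requested explicitly with `sys._getframe()`, a CPython-specific function. The sketch below reads the code object and the link to the caller; the function names `inner` and `outer` are illustrative.

```python
import sys

def inner():
    frame = sys._getframe()          # frame object for the current call
    caller = frame.f_back            # frame of whoever called inner()
    return frame.f_code.co_name, caller.f_code.co_name

def outer():
    return inner()

names = outer()
print(names)    # ('inner', 'outer')
```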

13.25 Traceback Objects

A traceback object records where an exception propagated.

try:
    1 / 0
except ZeroDivisionError as exc:
    tb = exc.__traceback__

A traceback links to:

frame
line number or instruction position
next traceback

Tracebacks can retain frames. Frames can retain locals. This means exceptions can keep large object graphs alive.

This is a common memory retention pattern in long-running programs.
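The retention chain can be demonstrated with a weak reference. This sketch assumes CPython's reference counting for the prompt release at the end; `Payload`, `probe`, and `fail` are invented names, and `traceback.clear_frames` is one standard way to drop the locals held by a stored traceback.

```python
import traceback
import weakref

class Payload:
    pass

probe = []

def fail():
    payload = Payload()                    # local that lives only in this frame
    probe.append(weakref.ref(payload))
    raise RuntimeError("boom")

try:
    fail()
except RuntimeError as exc:
    saved = exc                            # keep the exception past the handler

# saved -> traceback -> frame of fail() -> payload
alive_before = probe[0]() is not None
traceback.clear_frames(saved.__traceback__)   # drop locals held by the frames
alive_after = probe[0]() is not None

print(alive_before, alive_after)           # True False on CPython
```

Setting `saved.__traceback__ = None`, or simply not storing the exception, has a similar effect.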

13.26 Exception Objects

Exceptions are ordinary objects derived from BaseException.

try:
    raise ValueError("bad")
except ValueError as exc:
    print(exc.args)

An exception object can store:

args
message data
__cause__
__context__
__traceback__
__notes__ (added through add_note(), Python 3.11 and later)
custom attributes

Exception classes are normal classes, but exception propagation is deeply integrated into the interpreter.
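The chaining attributes are filled in by the interpreter when one exception is raised while another is being handled. A minimal sketch, with `caught` as an illustrative variable name:

```python
try:
    try:
        {}["missing"]
    except KeyError as inner:
        raise ValueError("bad") from inner
except ValueError as exc:
    caught = exc

print(caught.args)                               # ('bad',)
print(type(caught.__cause__).__name__)           # KeyError
print(caught.__context__ is caught.__cause__)    # True
print(caught.__traceback__ is not None)          # True
```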

13.27 Descriptor Objects

Descriptors control attribute access.

Built-in descriptor object kinds include:

function descriptors
method descriptors
member descriptors
getset descriptors
wrapper descriptors
property objects
classmethod objects
staticmethod objects

A descriptor defines one or more of:

__get__
__set__
__delete__

Descriptors implement:

methods
properties
slots
C-level members
C-level computed attributes
special method wrappers

Without descriptors, Python’s method binding and attribute model would be much less flexible.
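Several of these descriptor kinds can be found by looking at what is stored on a type. The class `P` and the dictionary `kinds` are made up for this sketch; the type names printed are the ones CPython reports.

```python
import types

class P:
    __slots__ = ("x",)

    @property
    def double(self):
        return self.x * 2

kinds = {
    "list.append": list.append,                      # C method
    "int.__add__": int.__add__,                      # special method wrapper
    "P.x": P.x,                                      # slot storage
    "function.__code__": types.FunctionType.__code__,  # C computed attribute
    "P.double": P.__dict__["double"],                # Python-level property
}

for label, obj in kinds.items():
    print(f"{label:18} {type(obj).__name__:20} has __get__: {hasattr(obj, '__get__')}")
```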

13.28 Memoryview Objects

A memoryview exposes another object’s buffer without copying.

b = bytearray(b"hello")
v = memoryview(b)

The memoryview keeps the exported buffer alive and lets code read or write memory depending on mutability.

This is essential for zero-copy operations across bytes-like objects and extension modules.

The memoryview object participates in buffer lifetime rules. The exporter must not free or resize memory in a way that invalidates active views.
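The lifetime rule is enforced at runtime for bytearray: while a view is active, operations that would resize the buffer fail. A short sketch:

```python
b = bytearray(b"hello")
v = memoryview(b)

v[0] = ord("H")          # writes through to the bytearray, no copy
print(b)                 # bytearray(b'Hello')

resize_blocked = False
try:
    b.append(33)         # would need to resize the exported buffer
except BufferError as exc:
    resize_blocked = True
    print("BufferError:", exc)

v.release()              # end the export
b.append(33)
print(b)                 # bytearray(b'Hello!')
```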

13.29 Capsule Objects

A capsule wraps a C pointer for safe exchange through Python APIs.

C extensions use capsules to expose native pointers without making them normal Python objects.

Conceptually:

capsule
    void *pointer
    name
    destructor
    context

Capsules are useful for C extension interoperability. They allow one extension module to publish a C API that another extension can import.

13.30 Object Implementation Tradeoffs

Built-in object implementations balance several pressures:

| Pressure | Effect |
| --- | --- |
| Speed | Specialized C paths for hot operations |
| Memory use | Compact layouts, sharing, interning, free lists |
| Compatibility | Stable Python semantics and C API behavior |
| Debuggability | Runtime checks, debug builds, introspection hooks |
| Portability | Avoid assumptions that break supported platforms |
| Extensibility | Slots, protocols, subclassing support |
| Safety | Reference counting, GC traversal, error handling |

Many CPython implementation details come from these tradeoffs.

For example, list over-allocation improves append speed but may retain extra memory. Preserving dict insertion order ties the implementation to an ordered entry layout but gives useful language behavior. Reference counting gives prompt destruction but requires cycle GC and careful C API ownership rules.

13.31 Mental Model

Use this model:

built-in type
    C struct for instance layout
    PyTypeObject for behavior
    slots for protocols
    methods for public operations
    deallocator for owned references
    optional GC traversal for cycles

When reading a built-in type implementation, ask:

What does the object store?
Does it own Python references?
Is it fixed-size or variable-size?
Does it use auxiliary memory?
Does it participate in cyclic GC?
What slots does its type object fill?
What operations are hot paths?
What invariants must always hold?
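Some of these questions can be answered from Python before opening the C source. For instance, `gc.is_tracked` reports whether the cyclic collector currently tracks an object, which is a rough indicator of whether its type can participate in reference cycles.

```python
import gc

print(gc.is_tracked(42))        # False: ints hold no references
print(gc.is_tracked("text"))    # False
print(gc.is_tracked(1.5))       # False
print(gc.is_tracked([]))        # True: lists can contain themselves
```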

13.32 Summary

Built-in objects are specialized C implementations of Python’s core runtime values. They all follow the same object model: a common header, a type pointer, type-specific storage, reference ownership rules, and behavior defined by type slots.

Their implementations are optimized because they sit beneath almost every Python program. Lists, tuples, dicts, strings, functions, modules, frames, and exceptions are not just library conveniences. They are the working parts of the interpreter.