# 13. Built-in Object Implementations

Built-in objects are the concrete data structures behind Python’s core types. They are ordinary Python objects in the sense that they have identities, types, reference counts, attributes where supported, and behavior defined by type slots. They are special because their storage and operations are implemented directly in C.

This chapter gives a broad map. Later chapters go deeper into strings, lists, tuples, dictionaries, sets, numbers, functions, modules, and frames.

## 13.1 Built-ins Are Type Objects

A built-in type such as `list`, `dict`, or `int` is itself a Python object.

```python
print(type(list))     # <class 'type'>
print(type(dict))     # <class 'type'>
print(type(int))      # <class 'type'>
```

An instance points to its type object.

```python
xs = [1, 2, 3]
print(type(xs))       # <class 'list'>
```

At the C level:

```text
xs ---> PyListObject
          ob_refcnt
          ob_type ----> PyList_Type
          ob_size
          ob_item
          allocated
```

The instance stores state. The type object stores behavior.
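The split between state and behavior can be observed from Python. The sketch below uses a hypothetical `Box` class (not part of CPython) and relies on the rule that implicit special-method lookup goes through the type, so an instance attribute named `__len__` does not change what `len()` does.

```python
class Box:
    """Hypothetical container used only for this illustration."""

    def __init__(self, items):
        self.items = list(items)      # per-instance state

    def __len__(self):                # behavior, stored on the type
        return len(self.items)


box = Box([1, 2, 3])
box.__len__ = lambda: 99              # lands in the instance dict; len() ignores it

length_via_builtin = len(box)         # resolved through type(box)
length_via_attribute = box.__len__()  # ordinary attribute lookup finds the instance entry

print(length_via_builtin, length_via_attribute)   # 3 99
```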

## 13.2 Why Built-ins Are Implemented in C

Built-in types sit on the hottest paths of Python execution.

Common operations include:

```text
integer arithmetic
string hashing
attribute lookup
dictionary lookup
list append
tuple creation
function calls
iteration
exception creation
```

If these operations were implemented as ordinary Python code, the interpreter would have to execute more bytecode to perform its own basic operations. CPython avoids this by implementing core types in C.

For example, a Python dictionary is used for:

```text
module globals
class namespaces
object attributes
keyword arguments
import caches
annotations
many user data structures
```

A slow dictionary would make the whole interpreter slow.
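A few of these uses are visible from ordinary Python code. This is a small illustration with hypothetical names (`collect`, `Config`); it only shows that the namespaces involved are dictionaries or dictionary-backed views.

```python
def collect(**kwargs):
    # Keyword arguments arrive packed in a real dict.
    return kwargs


class Config:
    debug = False              # stored in the class namespace


cfg = Config()
cfg.name = "demo"              # stored in the instance namespace

module_globals = globals()     # the current global namespace
kwargs = collect(a=1, b=2)

print(type(module_globals).__name__)   # dict
print(cfg.__dict__)                    # {'name': 'demo'}
print(kwargs)                          # {'a': 1, 'b': 2}
print("debug" in vars(Config))         # True
```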

## 13.3 Common Built-in Object Pattern

Most built-in object implementations follow this shape:

```c
typedef struct {
    PyObject_HEAD
    /* type-specific fields */
} SomeObject;
```

or, for variable-size objects:

```c
typedef struct {
    PyObject_VAR_HEAD
    /* type-specific fields */
} SomeVarObject;
```

The type object then provides slots:

```c
PyTypeObject Some_Type = {
    .tp_name = "...",
    .tp_basicsize = sizeof(SomeObject),
    .tp_dealloc = ...,
    .tp_repr = ...,
    .tp_as_number = ...,
    .tp_as_sequence = ...,
    .tp_as_mapping = ...,
    .tp_methods = ...,
};
```

This pattern repeats across CPython.

## 13.4 Object Families

The main built-in object families are:

| Family                  | Examples                                    | Main role                        |
| ----------------------- | ------------------------------------------- | -------------------------------- |
| Numeric objects         | `int`, `float`, `complex`, `bool`           | Arithmetic and numeric protocols |
| Text and binary objects | `str`, `bytes`, `bytearray`, `memoryview`   | Text, binary data, buffers       |
| Sequence objects        | `list`, `tuple`, `range`                    | Ordered collections              |
| Mapping objects         | `dict`, `mappingproxy`                      | Key-value storage                |
| Set objects             | `set`, `frozenset`                          | Hash-based membership            |
| Callable objects        | function, method, builtin function          | Invocation                       |
| Runtime objects         | module, frame, code, traceback              | Execution machinery              |
| Descriptor objects      | property, getset, member, method descriptor | Attribute behavior               |
| Iterator objects        | list iterator, dict iterator, generator     | Iteration                        |
| Exception objects       | `BaseException` and subclasses              | Error propagation                |

Each family uses the same object model but optimizes for different operations.

## 13.5 Integer Objects

Python `int` values are arbitrary precision.

```python
x = 10**100
print(x)
```

CPython stores integers as `PyLongObject`, a variable-size object. It does not use one fixed machine integer for all Python integers.

Conceptually:

```text
PyLongObject
    PyVarObject header
        ob_size = number of internal digits, with sign encoded
    digits[]
```

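Small integers use few internal digits; large integers use many. The exact header encoding is version-specific: CPython 3.12 and later pack the digit count and sign into a separate tag field instead of `ob_size`.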

This explains why Python integers do not overflow like C `long` in ordinary arithmetic.

```python
x = 2**1000
print(x * x)
```

The cost grows with integer size. Operations on small integers are fast. Operations on very large integers require multi-precision arithmetic.
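The variable-size layout can be observed with `sys.getsizeof`. The exact byte counts depend on the CPython version and platform, so this sketch prints them without assuming specific values; only the growth trend matters.

```python
import sys

# Sizes are CPython- and platform-specific; only the trend is meaningful.
values = [1, 2**30, 2**60, 10**100, 2**1000]
sizes = {v.bit_length(): sys.getsizeof(v) for v in values}

for bits, size in sizes.items():
    print(f"{bits:5d} bits -> {size} bytes")
```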

## 13.6 Boolean Objects

`bool` is a subclass of `int`.

```python
print(isinstance(True, int))    # True
print(True + True)              # 2
```

There are exactly two boolean singleton objects:

```python
True
False
```

At the C level, these are runtime-owned objects. Python code should test truth directly rather than compare against `True` or `False` or construct new `bool` values.

```python
if condition:
    ...
```

C code usually returns booleans with macros such as:

```c
Py_RETURN_TRUE;
Py_RETURN_FALSE;
```
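From the Python side, the singleton property can be checked with identity tests. A minimal sketch:

```python
# bool() never builds a new object; it returns one of the two singletons.
results = [bool(1), bool("text"), 3 > 2, not []]

all_same_object = all(r is True for r in results)
print(all_same_object)             # True

# Because bool subclasses int, the singletons also behave as 1 and 0.
true_count = sum([True, False, True])
print(True == 1, False == 0)       # True True
print(true_count)                  # 2
```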

## 13.7 Floating-Point Objects

Python `float` is usually implemented as a C double.

Conceptually:

```text
PyFloatObject
    PyObject header
    double value
```

A float object is fixed-size.

```python
x = 1.5
y = 2.25
print(x + y)
```

Float arithmetic follows the platform’s floating-point behavior, generally IEEE 754 double precision on common systems.

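A float stores a binary approximation of a real number. Most decimal fractions, such as `0.1`, have no exact binary representation, while fractions whose denominator is a power of two, such as `0.5`, are stored exactly.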

```python
print(0.1 + 0.2)
```

The surprising result comes from binary floating-point representation, not from Python-specific arithmetic.
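The effect can be made visible with the standard `decimal` and `math` modules. This sketch assumes IEEE 754 doubles, which is the case on common platforms.

```python
import math
from decimal import Decimal

total = 0.1 + 0.2

print(total == 0.3)                # False
print(math.isclose(total, 0.3))    # True: compare with a tolerance instead

# Decimal(float) shows the exact binary value the float stores.
stored_tenth = Decimal(0.1)
print(stored_tenth)

# Sums of power-of-two fractions are exact.
exact_sum = 0.5 + 0.25
print(exact_sum == 0.75)           # True
```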

## 13.8 Complex Objects

Python `complex` stores two floating-point values:

```text
PyComplexObject
    PyObject header
    real double
    imag double
```

Example:

```python
z = 1.5 + 2.0j
print(z.real)
print(z.imag)
```

Complex numbers participate in numeric slots. They support arithmetic, but they do not support ordering comparisons such as `<`.

```python
1 + 2j < 3 + 4j     # TypeError
```

## 13.9 String Objects

Python `str` stores Unicode text.

A string is immutable.

```python
s = "hello"
t = s.upper()
```

`upper()` creates another string. It does not mutate `s`.

CPython’s Unicode implementation is optimized for compact storage. The internal representation can use different element widths depending on the largest code point in the string.

Conceptually:

```text
PyUnicodeObject
    object header
    length
    hash cache
    kind
    compact/ascii flags
    character data
```

Important string optimizations include:

```text
cached hash value
compact layout
ASCII fast path
interning for selected strings
specialized Unicode operations
```

Strings are central because they are used for identifiers, attribute names, dictionary keys, source code, file paths, protocol data, and user text.
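The variable element width can be estimated from Python by comparing the sizes of two strings that differ only in length. This is a rough measurement sketch, not a description of the internal API; `bytes_per_char` is a hypothetical helper, and the numbers are CPython-specific.

```python
import sys


def bytes_per_char(ch, n=1000):
    """Estimate storage per character from two lengths of the same text."""
    return (sys.getsizeof(ch * (2 * n)) - sys.getsizeof(ch * n)) / n


samples = {
    "ascii": "a",              # U+0061
    "latin-1": "\u00e9",       # U+00E9
    "bmp": "\u20ac",           # U+20AC
    "astral": "\U0001F600",    # U+1F600
}
widths = {name: bytes_per_char(ch) for name, ch in samples.items()}
print(widths)
```

On CPython, the per-character cost increases with the largest code point in the string, which is why a single wide character makes the whole string wider.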

## 13.10 Bytes and Bytearray

`bytes` is immutable binary data.

```python
b = b"hello"
```

`bytearray` is mutable binary data.

```python
buf = bytearray(b"hello")
buf[0] = ord("H")
```

Conceptual difference:

| Type        | Mutable | Main use              |
| ----------- | ------: | --------------------- |
| `bytes`     |      No | Immutable binary data |
| `bytearray` |     Yes | Mutable binary buffer |

Both are sequence-like objects over integers in the range 0 to 255.

```python
b = b"abc"
print(b[0])      # 97
```
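The mutability difference shows up as soon as code tries item assignment. A short sketch:

```python
data = b"hello"
try:
    data[0] = ord("H")         # bytes does not support item assignment
except TypeError as exc:
    bytes_error = type(exc).__name__

buf = bytearray(data)          # copies the bytes into a mutable buffer
buf[0] = ord("H")
buf += b"!"

print(bytes_error)             # TypeError
print(data, bytes(buf))        # b'hello' b'Hello!'
print(list(b"abc"))            # [97, 98, 99]
```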

## 13.11 List Objects

A list is a mutable sequence.

```python
xs = [1, 2, 3]
xs.append(4)
```

A CPython list object stores a pointer to a separately allocated array of object references.

Conceptually:

```text
PyListObject
    PyVarObject header
        ob_size = logical length
    ob_item ----> array of PyObject *
    allocated = capacity
```

The array stores references, not inline object data.

```text
list
    ob_item[0] ---> int object 1
    ob_item[1] ---> int object 2
    ob_item[2] ---> int object 3
```

Lists over-allocate when growing. This makes repeated `append` efficient on average.
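Over-allocation can be observed indirectly: `sys.getsizeof` reports the allocated capacity, so it changes only when the list actually resizes. The growth points printed below depend on the CPython version, so the sketch does not assume particular values.

```python
import sys

xs = []
last_size = sys.getsizeof(xs)
growth_points = []             # lengths at which the allocation changed

for i in range(64):
    xs.append(i)
    size = sys.getsizeof(xs)
    if size != last_size:
        growth_points.append(len(xs))
        last_size = size

print(growth_points)
print(f"{len(growth_points)} resizes for {len(xs)} appends")
```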

## 13.12 Tuple Objects

A tuple is an immutable sequence.

```python
t = (1, 2, 3)
```

A tuple stores item references inline in the tuple allocation.

Conceptually:

```text
PyTupleObject
    PyVarObject header
        ob_size = length
    ob_item[0]
    ob_item[1]
    ob_item[2]
```

A tuple cannot change length after creation. This makes inline storage practical.

Tuple immutability refers to the tuple’s references, not necessarily the deep mutability of contained objects.

```python
t = ([],)
t[0].append(1)
print(t)         # ([1],)
```

The tuple still points to the same list. The list changed.

## 13.13 Dict Objects

A dictionary is a hash table mapping keys to values.

```python
d = {"name": "Ada", "age": 36}
```

Dictionaries are used throughout CPython, not only in user code.

They store:

```text
module globals
class namespaces
instance attributes
keyword arguments
import caches
```

A dict lookup roughly needs:

```text
hash the key
find a matching table slot
compare keys if needed
return associated value
```

Important properties:

```text
average O(1) lookup
insertion order preservation
hash-based key storage
resize when table becomes too full
specialized layouts for object attributes
```

Dictionaries are among the most performance-sensitive objects in CPython.
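The lookup steps listed above can be traced with a key type that records which protocol methods run. `Key` is a hypothetical class written for this illustration; the identity shortcut in the second lookup reflects how CPython compares a candidate entry before calling `__eq__`.

```python
class Key:
    """Hypothetical key type that logs hashing and comparison."""

    events = []

    def __init__(self, name):
        self.name = name

    def __hash__(self):
        Key.events.append("hash")
        return hash(self.name)

    def __eq__(self, other):
        Key.events.append("eq")
        return isinstance(other, Key) and self.name == other.name


stored = Key("ada")
table = {stored: 36}

Key.events.clear()
value_equal = table[Key("ada")]    # equal but distinct key object
equal_key_events = list(Key.events)

Key.events.clear()
value_same = table[stored]         # the very same key object
same_key_events = list(Key.events)

print(equal_key_events)            # ['hash', 'eq']
print(same_key_events)             # ['hash']
```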

## 13.14 Set and Frozenset Objects

A set is a hash table of keys without values.

```python
seen = set()
seen.add("x")
```

A frozenset is immutable.

```python
s = frozenset(["a", "b"])
```

Sets are optimized for membership tests:

```python
if item in seen:
    ...
```

The internal structure is similar in spirit to dict, but stores only elements.

Set operations include:

```text
union
intersection
difference
symmetric difference
subset testing
membership testing
```

Every element of a set or frozenset must already be hashable. Because a `frozenset` is also immutable, it is itself hashable, so it can be used as a dictionary key or as an element of another set.
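A short sketch of the operations above, plus a frozenset used as a dictionary key. The `routes` dictionary is a made-up example.

```python
a = {1, 2, 3}
b = {3, 4}

union = a | b
intersection = a & b
difference = a - b
symmetric = a ^ b
print(union, intersection, difference, symmetric)
print({1, 2} <= a)                       # True: subset test

# A frozenset can serve as a dictionary key; a mutable set cannot.
routes = {frozenset({"a", "b"}): 7}
print(routes[frozenset({"b", "a"})])     # 7: element order is irrelevant

try:
    routes[{"a", "b"}] = 8
except TypeError:
    set_key_rejected = True
```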

## 13.15 Range Objects

A `range` represents an arithmetic progression without storing every element.

```python
r = range(0, 1_000_000, 2)
```

Conceptually:

```text
range object
    start
    stop
    step
    length
```

The object is compact even for huge ranges.

```python
import sys

print(sys.getsizeof(range(10)))
print(sys.getsizeof(range(10**12)))
```

Both are small because the sequence is computed lazily.
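Because a range is defined by `start`, `stop`, and `step`, length, indexing, slicing, and integer membership can all be computed arithmetically. A brief sketch:

```python
r = range(0, 1_000_000, 2)

print(len(r))           # 500000
print(r[10])            # 20
print(r[-1])            # 999998
print(999_998 in r)     # True
print(7 in r)           # False: odd numbers are never produced
print(r[1:4])           # range(2, 8, 2): slicing yields another range
```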

## 13.16 Function Objects

A Python function object wraps executable code and runtime context.

```python
def add(a, b):
    return a + b
```

A function object contains references to:

```text
code object
globals dictionary
defaults
keyword defaults
closure cells
annotations
qualified name
module name
```

Conceptually:

```text
PyFunctionObject
    code
    globals
    defaults
    kwdefaults
    closure
    annotations
    name
    qualname
```

The code object contains bytecode. The function object provides the environment needed to execute that bytecode.
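Most of these references are exposed as attributes on the function object. The sketch below uses a hypothetical `make_adder` factory so that the closure field is populated.

```python
def make_adder(n):
    def add(a, b=0, *, scale=1):
        return (a + b + n) * scale
    return add


add = make_adder(10)

print(add.__code__.co_name)                 # add
print(add.__globals__ is globals())         # True
print(add.__defaults__)                     # (0,)
print(add.__kwdefaults__)                   # {'scale': 1}
print(add.__closure__[0].cell_contents)     # 10
print(add.__qualname__)                     # make_adder.<locals>.add
```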

## 13.17 Code Objects

A code object holds compiled bytecode together with the metadata needed to run it.

It contains:

```text
bytecode
constants
names
local variable names
free variables
cell variables
stack size
flags
line table
exception table
filename
function name
```

Example:

```python
def f(x):
    return x + 1

code = f.__code__
print(code.co_consts)
print(code.co_varnames)
```

Code objects are immutable. They can be shared by multiple function objects.
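Sharing is easy to observe with closures: each call to a factory creates a new function object around the same compiled code. `make_counter` is a hypothetical example; `code.replace()` is available on Python 3.8 and later.

```python
def make_counter(start):
    def counter():
        return start
    return counter


first = make_counter(1)
second = make_counter(2)

print(first is second)                        # False: two function objects
print(first.__code__ is second.__code__)      # True: one shared code object
print(first(), second())                      # 1 2

# replace() builds a new code object instead of modifying the original.
renamed = first.__code__.replace(co_name="renamed")
print(first.__code__.co_name, renamed.co_name)    # counter renamed
```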

## 13.18 Module Objects

A module object represents an imported module.

```python
import math
print(math)
```

A module mostly contains a namespace dictionary.

Conceptually:

```text
module object
    name
    dict
    spec
    loader
    package
    file
```

The module dictionary stores global variables defined by the module.

```python
import math
print(math.__dict__["sqrt"])
```

Importing a module creates or retrieves a module object and stores it in `sys.modules`.
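The caching behavior can be checked directly. A small sketch using `math`:

```python
import importlib
import math
import sys

cached = sys.modules["math"]
reimported = importlib.import_module("math")

print(cached is math)                        # True
print(reimported is math)                    # True: served from sys.modules
print(math.__dict__["sqrt"] is math.sqrt)    # True: attributes live in the module dict
print(math.__name__, math.__spec__.name)     # math math
```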

## 13.19 Class and Instance Objects

Classes are type objects. Instances are objects whose type pointer points to the class.

```python
class User:
    pass

u = User()
```

Conceptually:

```text
User
    type object
    attributes and methods
    base classes
    MRO

u
    instance object
    ob_type ---> User
    instance dictionary or slots
```

Ordinary instances usually store attributes in a dictionary.

```python
u.name = "Ada"
```

With `__slots__`, instances can store selected fields in fixed offsets instead of a dictionary.

```python
class Point:
    __slots__ = ("x", "y")
```

This reduces memory per instance and can speed some attribute access patterns.
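The difference between the two storage strategies is visible from Python. The sketch adds an `__init__` to `Point` for convenience.

```python
class User:
    pass


class Point:
    __slots__ = ("x", "y")

    def __init__(self, x, y):
        self.x = x
        self.y = y


u = User()
u.name = "Ada"
print(u.__dict__)                  # {'name': 'Ada'}

p = Point(1, 2)
print(hasattr(p, "__dict__"))      # False

try:
    p.z = 3                        # no slot named "z" and no instance dict
except AttributeError:
    extra_attribute_rejected = True

# Each slot is exposed on the class as a member descriptor.
print(type(Point.x).__name__)      # member_descriptor
```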

## 13.20 Method Objects

When a function is accessed through an instance, Python creates a bound method object.

```python
class C:
    def f(self):
        return 1

c = C()
m = c.f
```

The bound method stores:

```text
function
self object
```

Conceptually:

```text
bound method
    __func__ ---> C.f
    __self__ ---> c
```

Calling `m()` passes `c` as the first argument.

This is descriptor behavior. Functions are descriptors: the function type's `__get__` method (the `tp_descr_get` slot at the C level) performs the binding.
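The binding step can be reproduced by hand. In this sketch, calling `__get__` on the plain function stored in the class dictionary produces a bound method equal to the one created by attribute access.

```python
class C:
    def f(self):
        return 1


c = C()
m = c.f

print(m.__func__ is C.f)       # True
print(m.__self__ is c)         # True

# Attribute access on the instance is roughly equivalent to this call.
plain_function = C.__dict__["f"]
manual = plain_function.__get__(c, C)

print(manual == m)             # True: same function, same self
print(manual())                # 1
```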

## 13.21 Built-in Function and Method Objects

Some callables are implemented directly in C.

Examples:

```python
len
print
dict.get
list.append
```

These are built-in function or method objects. They wrap C function pointers and metadata.

They are faster than equivalent Python-level functions because they avoid executing Python bytecode for the operation itself.

A built-in method such as `list.append` still receives Python objects and follows reference ownership rules internally.
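The `types` module names the kinds of objects involved. The type names printed below are what CPython reports; other implementations may differ.

```python
import types

print(type(len).__name__)            # builtin_function_or_method
print(type([].append).__name__)      # builtin_function_or_method (bound to a list)
print(type(list.append).__name__)    # method_descriptor (unbound, on the type)
print(type(dict.get).__name__)       # method_descriptor

is_builtin = isinstance(len, types.BuiltinFunctionType)
is_descriptor = isinstance(list.append, types.MethodDescriptorType)
has_bytecode = hasattr(len, "__code__")

print(is_builtin, is_descriptor, has_bytecode)    # True True False
```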

## 13.22 Iterator Objects

Iterator objects implement `__iter__` and `__next__`.

```python
it = iter([1, 2, 3])
print(next(it))
```

A list iterator stores:

```text
reference to list
current index
```

A dict iterator stores:

```text
reference to dict
iteration position
version or mutation state
```

Generators are also iterators, but they are more complex because they contain suspended execution frames.
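The state kept by list and dict iterators explains how they react to mutation. A list iterator only tracks an index into the live list, while a dict iterator detects a size change and refuses to continue. A minimal sketch:

```python
xs = [1, 2]
it = iter(xs)
first = next(it)               # 1
xs.append(3)                   # the iterator holds the list, not a copy
rest = list(it)
print(first, rest)             # 1 [2, 3]

d = {"a": 1}
dict_it = iter(d)
d["b"] = 2                     # size changes after the iterator was created
try:
    next(dict_it)
except RuntimeError as exc:
    dict_error = str(exc)

print(dict_error)              # dictionary changed size during iteration
```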

## 13.23 Generator Objects

A generator object represents a suspended function execution.

```python
def count():
    yield 1
    yield 2
```

Calling `count()` does not run the function body immediately. It creates a generator object.

```python
g = count()
```

The generator stores execution state:

```text
code or frame state
instruction position
locals
evaluation stack
exception state
closed/running state
```

Each `next(g)` resumes execution until the next `yield` or return.

Generators connect the object model with the interpreter frame model.
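The suspended state can be inspected with `inspect.getgeneratorstate` and the generator's `gi_frame` attribute. This sketch extends `count()` with a local variable so there is something to observe.

```python
import inspect


def count():
    total = 0
    yield 1
    total += 1
    yield 2


g = count()
states = [inspect.getgeneratorstate(g)]        # body has not started

first = next(g)
states.append(inspect.getgeneratorstate(g))    # paused at the first yield
locals_at_first_yield = dict(g.gi_frame.f_locals)

remaining = list(g)                            # run the body to completion
states.append(inspect.getgeneratorstate(g))

print(states)                   # ['GEN_CREATED', 'GEN_SUSPENDED', 'GEN_CLOSED']
print(locals_at_first_yield)    # {'total': 0}
print(g.gi_frame)               # None once the generator has finished
```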

## 13.24 Frame Objects

A frame object represents an executing or suspended block of code.

Frames contain:

```text
code object
globals
builtins
locals
value stack
instruction pointer
exception state
previous frame link where exposed
```

Frames are created for function calls, module execution, class body execution, generators, and coroutines. Tracebacks do not create frames; they hold references to existing ones.

Frame objects are important for:

```text
debuggers
profilers
trace functions
exceptions
inspect module
generators and coroutines
```

Full frame objects are expensive enough that recent CPython versions avoid materializing a Python-visible frame object unless something, such as a debugger or a traceback, asks for one.
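Introspection tools reach frames through `inspect.currentframe()` and the `f_back` link. The sketch below uses hypothetical `inner` and `outer` functions and assumes CPython, since `inspect.currentframe()` may return `None` on implementations without frame support.

```python
import inspect


def inner():
    frame = inspect.currentframe()     # frame of this call
    caller = frame.f_back              # link to the calling frame
    return (
        frame.f_code.co_name,
        caller.f_code.co_name,
        caller.f_locals["marker"],     # the caller's locals are reachable
    )


def outer():
    marker = "visible to inner"
    return inner()


result = outer()
print(result)    # ('inner', 'outer', 'visible to inner')
```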

## 13.25 Traceback Objects

A traceback object records where an exception propagated.

```python
try:
    1 / 0
except ZeroDivisionError as exc:
    tb = exc.__traceback__
```

A traceback links to:

```text
frame
line number or instruction position
next traceback
```

Tracebacks can retain frames. Frames can retain locals. This means exceptions can keep large object graphs alive.

This is a common memory retention pattern in long-running programs.
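The retention chain can be demonstrated with a weak reference. In this sketch, `Payload` and `fail` are hypothetical names, and the immediate release after `traceback.clear_frames` relies on CPython's reference counting.

```python
import traceback
import weakref


class Payload:
    """Stand-in for a large object held in a local variable."""


refs = []


def fail():
    payload = Payload()
    refs.append(weakref.ref(payload))
    raise ValueError("bad")


saved = None
try:
    fail()
except ValueError as exc:
    saved = exc                    # keeping the exception keeps its traceback

# exception -> traceback -> frame of fail() -> local "payload"
alive_while_saved = refs[0]() is not None

traceback.clear_frames(saved.__traceback__)    # drop locals of finished frames
alive_after_clear = refs[0]() is not None

print(alive_while_saved, alive_after_clear)    # True False
```

Long-running programs that store exceptions, for example in a log buffer, can use the same call or drop the traceback reference to release the retained frames.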

## 13.26 Exception Objects

Exceptions are ordinary objects derived from `BaseException`.

```python
try:
    raise ValueError("bad")
except ValueError as exc:
    print(exc.args)
```

An exception object can store:

```text
args
message data
__cause__
__context__
__traceback__
notes
custom attributes
```

Exception classes are normal classes, but exception propagation is deeply integrated into the interpreter.
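Several of these fields are filled in by the interpreter during propagation. A short sketch using explicit chaining:

```python
try:
    try:
        int("not a number")
    except ValueError as inner:
        raise RuntimeError("could not load config") from inner
except RuntimeError as exc:
    caught = exc

print(caught.args)                               # ('could not load config',)
print(type(caught.__cause__).__name__)           # ValueError
print(caught.__context__ is caught.__cause__)    # True
print(caught.__traceback__ is not None)          # True

caught.detail = "custom attribute"               # exceptions accept attributes
print(caught.detail)
```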

## 13.27 Descriptor Objects

Descriptors control attribute access.

Built-in descriptor object kinds include:

```text
function descriptors
method descriptors
member descriptors
getset descriptors
wrapper descriptors
property objects
classmethod objects
staticmethod objects
```

A descriptor defines one or more of:

```python
__get__
__set__
__delete__
```

Descriptors implement:

```text
methods
properties
slots
C-level members
C-level computed attributes
special method wrappers
```

Without descriptors, Python’s method binding and attribute model would be much less flexible.
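A descriptor written in Python uses the same protocol as the built-in kinds. The following sketch defines a hypothetical `Positive` data descriptor and an `Order` class that uses it; neither is part of CPython.

```python
class Positive:
    """Hypothetical data descriptor that rejects non-positive values."""

    def __set_name__(self, owner, name):
        self.storage = "_" + name          # where the value is kept per instance

    def __get__(self, obj, objtype=None):
        if obj is None:
            return self                    # accessed on the class itself
        return getattr(obj, self.storage)

    def __set__(self, obj, value):
        if value <= 0:
            raise ValueError("value must be positive")
        setattr(obj, self.storage, value)


class Order:
    quantity = Positive()

    def __init__(self, quantity):
        self.quantity = quantity           # routed through Positive.__set__


order = Order(3)
print(order.quantity)                      # 3

try:
    order.quantity = 0
except ValueError:
    rejected = True

print(order.quantity)                      # 3: the failed assignment changed nothing
```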

## 13.28 Memoryview Objects

A `memoryview` exposes another object’s buffer without copying.

```python
b = bytearray(b"hello")
v = memoryview(b)
```

The memoryview keeps the exported buffer alive and lets code read or write memory depending on mutability.

This is essential for zero-copy operations across bytes-like objects and extension modules.

The memoryview object participates in buffer lifetime rules. The exporter must not free or resize memory in a way that invalidates active views.
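Both the zero-copy behavior and the lifetime rule can be seen with a `bytearray` exporter. A minimal sketch:

```python
b = bytearray(b"hello")
v = memoryview(b)

v[0] = ord("H")                # writes through to the bytearray
print(b)                       # bytearray(b'Hello')

tail = v[1:]                   # slicing a view creates another view, not a copy
print(bytes(tail))             # b'ello'

try:
    b.extend(b"!")             # resizing is refused while views are exported
except BufferError:
    resize_blocked = True

tail.release()
v.release()
b.extend(b"!")                 # allowed again once the views are released
print(b)                       # bytearray(b'Hello!')
```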

## 13.29 Capsule Objects

A capsule wraps a C pointer for safe exchange through Python APIs.

C extensions use capsules to expose native pointers without making them normal Python objects.

Conceptually:

```text
capsule
    void *pointer
    name
    destructor
    context
```

Capsules are useful for C extension interoperability. They allow one extension module to publish a C API that another extension can import.
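Capsules are normally created and consumed in C, but some are visible from Python. As an illustration, standard CPython builds expose the C API of the accelerated `datetime` module as a capsule named `datetime_CAPI`; the sketch treats its presence as an assumption and falls back to `None` otherwise.

```python
import datetime

# Assumption: the C-accelerated datetime module is in use (typical for CPython).
capsule = getattr(datetime, "datetime_CAPI", None)

if capsule is not None:
    print(type(capsule).__name__)    # PyCapsule
    print(callable(capsule))         # False: opaque from Python code
else:
    print("no capsule exposed by this datetime implementation")
```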

## 13.30 Object Implementation Tradeoffs

Built-in object implementations balance several pressures:

| Pressure      | Effect                                            |
| ------------- | ------------------------------------------------- |
| Speed         | Specialized C paths for hot operations            |
| Memory use    | Compact layouts, sharing, interning, free lists   |
| Compatibility | Stable Python semantics and C API behavior        |
| Debuggability | Runtime checks, debug builds, introspection hooks |
| Portability   | Avoid assumptions that break supported platforms  |
| Extensibility | Slots, protocols, subclassing support             |
| Safety        | Reference counting, GC traversal, error handling  |

Many CPython implementation details come from these tradeoffs.

For example, list over-allocation improves append speed but may retain extra memory. Preserving dict insertion order constrains the table layout, since entries must be kept in an ordered array alongside the hash index, but gives useful language behavior. Reference counting gives prompt destruction but requires a cycle collector and careful C API ownership rules.

## 13.31 Mental Model

Use this model:

```text
built-in type
    C struct for instance layout
    PyTypeObject for behavior
    slots for protocols
    methods for public operations
    deallocator for owned references
    optional GC traversal for cycles
```

When reading a built-in type implementation, ask:

```text
What does the object store?
Does it own Python references?
Is it fixed-size or variable-size?
Does it use auxiliary memory?
Does it participate in cyclic GC?
What slots does its type object fill?
What operations are hot paths?
What invariants must always hold?
```
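Some of these questions can be answered from Python before opening the C source. The sketch below uses `__basicsize__`, `__itemsize__`, and `gc.is_tracked`; the sizes printed are CPython- and platform-specific.

```python
import gc

for tp in (float, int, tuple, list):
    # A nonzero __itemsize__ means items are stored inline (variable-size object).
    print(f"{tp.__name__:6} basicsize={tp.__basicsize__} itemsize={tp.__itemsize__}")

# list has itemsize 0: its items live in a separately allocated array.
list_items_inline = list.__itemsize__ > 0
tuple_items_inline = tuple.__itemsize__ > 0

# Atomic values cannot form cycles; containers can.
tracked = {
    "int": gc.is_tracked(42),
    "str": gc.is_tracked("text"),
    "list": gc.is_tracked([1, 2]),
}
print(list_items_inline, tuple_items_inline)    # False True
print(tracked)    # {'int': False, 'str': False, 'list': True}
```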

## 13.32 Summary

Built-in objects are specialized C implementations of Python’s core runtime values. They all follow the same object model: a common header, a type pointer, type-specific storage, reference ownership rules, and behavior defined by type slots.

Their implementations are optimized because they sit beneath almost every Python program. Lists, tuples, dicts, strings, functions, modules, frames, and exceptions are not just library conveniences. They are the working parts of the interpreter.

