# 8. `PyObject` and `PyVarObject`

`PyObject` and `PyVarObject` are the base layouts behind CPython objects. They are not Python classes. They are C-level struct conventions that allow the runtime to treat many different object implementations through a common pointer type.

At runtime, most object references in CPython are represented as:

```c
PyObject *
```

This means “pointer to some Python object.” The pointed-to memory may actually be an integer object, list object, dict object, function object, type object, module object, or user-defined instance. Because all of these begin with the same object header, generic code can handle them safely through one pointer type.

## 8.1 The Common Object Header

A simplified `PyObject` looks like this:

```c
typedef struct {
    Py_ssize_t ob_refcnt;
    PyTypeObject *ob_type;
} PyObject;
```

The real definition uses macros and build-dependent fields: debug and tracing builds can add fields, and free-threaded builds use a different reference-counting layout. But the essential idea is stable:

```text
PyObject
    reference count
    type pointer
```

The reference count tracks ownership.

The type pointer tells CPython how the object behaves.

Every normal CPython object begins with this common header. Because of that, the runtime can receive a `PyObject *` and inspect its type without knowing the full concrete struct at compile time.

## 8.2 Why Every Object Starts the Same Way

Consider this Python code:

```python
x = 42
y = "hello"
z = [1, 2, 3]
```

At the C level, these objects have different layouts.

```text
PyLongObject
    object header
    integer digit data

PyUnicodeObject
    object header
    string metadata
    character storage

PyListObject
    object header
    length
    allocated capacity
    pointer to item array
```

But each one starts with the same header:

```text
+--------------------+
| ob_refcnt          |
+--------------------+
| ob_type            |
+--------------------+
| type-specific data |
+--------------------+
```

This allows generic runtime code to work with all objects.

For example, `Py_INCREF(obj)` only needs the reference count field. It does not need to know whether `obj` is a list, string, dict, or function.

Likewise, `Py_TYPE(obj)` only needs the type pointer field.
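
The shared header can be observed from Python with `ctypes`. The following is a minimal sketch, not a supported technique: it assumes CPython, where `id(obj)` is the object's address, `object.__basicsize__` equals `sizeof(PyObject)`, and the type pointer is the last field of the header. The helper name `read_type_pointer` is made up for this illustration.

```python
import ctypes

# Assumption: CPython, where id(obj) is the object's address and
# object.__basicsize__ equals sizeof(PyObject).
POINTER_SIZE = ctypes.sizeof(ctypes.c_void_p)
# Assumption: ob_type is the last field of the header, as sketched above.
OB_TYPE_OFFSET = object.__basicsize__ - POINTER_SIZE


def read_type_pointer(obj):
    """Read the raw ob_type pointer stored in obj's header."""
    address = id(obj) + OB_TYPE_OFFSET
    return ctypes.c_void_p.from_address(address).value


for value in (42, "hello", [1, 2, 3]):
    # The same offset is used for every object, whatever its concrete struct.
    print(type(value).__name__, read_type_pointer(value) == id(type(value)))
```

The same offset is used for the integer, the string, and the list, and in each case the pointer read from the header should match the address of the object's type. That is the property generic runtime code relies on.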

## 8.3 `ob_refcnt`

`ob_refcnt` stores the object’s reference count.

Reference counting is CPython’s primary object lifetime mechanism. When code creates, stores, returns, or releases object references, CPython updates this count.

Simplified:

```c
#define Py_INCREF(op) ((op)->ob_refcnt++)
#define Py_DECREF(op)                         \
    do {                                      \
        if (--(op)->ob_refcnt == 0) {         \
            deallocate_object(op);            \
        }                                     \
    } while (0)
```

The real implementation is more complex. It handles immortal objects, debug hooks, tracing, free-threaded builds, and deallocation details.

The conceptual rule is:

```text
new strong reference acquired
    increment reference count

strong reference released
    decrement reference count

reference count reaches zero
    destroy object
```

Example:

```python
x = []
y = x
del x
del y
```

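The list object survives while at least one strong reference remains. After `del x` the name `y` still refers to it; after `del y` the count reaches zero and the list is destroyed.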
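
The lifetime rule can be checked with a weak reference, which observes an object without keeping it alive. This sketch assumes CPython's immediate, reference-counted destruction; `TrackedList` is a hypothetical subclass introduced only because plain `list` instances do not accept weak references.

```python
import weakref


class TrackedList(list):
    """list subclass; unlike plain list, instances accept weak references."""


events = []

x = TrackedList()
# The callback runs when the object is destroyed; a weak reference
# does not count as a strong reference.
watcher = weakref.ref(x, lambda ref: events.append("destroyed"))

y = x          # second strong reference
del x          # one strong reference remains
alive_after_first_del = watcher() is not None

del y          # last strong reference released
alive_after_second_del = watcher() is not None

print(alive_after_first_del, alive_after_second_del, events)
```

On CPython the object is still reachable after the first `del` and gone immediately after the second. Other Python implementations may delay destruction.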

## 8.4 `ob_type`

`ob_type` points to the object’s type object.

For this Python code:

```python
x = []
```

the list object’s `ob_type` points to the `list` type object.

Conceptually:

```text
x  --->  PyListObject
            ob_refcnt
            ob_type  ---->  PyList_Type
            ob_size
            ob_item
            allocated
```

The type object describes behavior:

```text
object name
object size
base classes
method table
attribute lookup behavior
call behavior
deallocation behavior
number operations
sequence operations
mapping operations
```

This is how CPython dispatches operations.

When code evaluates:

```python
len(x)
```

CPython looks up the length slot on the object's type (`sq_length` for sequences, `mp_length` for mappings).

When code evaluates:

```python
x[0]
```

CPython consults the type's subscript slots: `mp_subscript` for mapping-style access, then `sq_item` for sequence-style access.

When code evaluates:

```python
x + y
```

CPython tries the numeric `nb_add` slot of the operand types first and falls back to the sequence `sq_concat` slot.
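
Dispatch through the type, rather than through the instance, is visible from pure Python. In this small sketch the class `Box` is invented for illustration: an attribute named `__len__` is placed on the instance, and `len()` ignores it.

```python
class Box:
    def __len__(self):
        return 3


box = Box()
# Shadowing __len__ on the instance does not affect len(): the built-in
# goes through the type, not the instance dictionary.
box.__len__ = lambda: 99

print(len(box))                # dispatched through type(box)
print(box.__len__())           # ordinary attribute lookup finds the instance attribute
print(type(box).__len__(box))  # explicit lookup on the type
```

`len(box)` returns 3 while `box.__len__()` returns 99, because the built-in consults the type's slot and never looks at the instance attribute.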

## 8.5 `PyObject_HEAD`

Extension types do not usually write the fields manually. They use macros.

A fixed-size extension object often starts like this:

```c
typedef struct {
    PyObject_HEAD
    long value;
} CounterObject;
```

`PyObject_HEAD` expands to the fields required for the object header.

Conceptually:

```c
typedef struct {
    Py_ssize_t ob_refcnt;
    PyTypeObject *ob_type;
    long value;
} CounterObject;
```

The macro exists because CPython may change header details depending on build configuration. Extension code should use the macro rather than assuming the exact fields.

## 8.6 Fixed-Size Objects

A fixed-size object has the same C struct size for every instance of that type.

Example shape:

```c
typedef struct {
    PyObject_HEAD
    double value;
} FloatLikeObject;
```

Every instance has room for exactly one `double`.

Many objects are fixed-size at the object struct level:

```text
float
module
function
method
cell
weakref
many iterator objects
many descriptor objects
```

The object may still refer to external or separately allocated data. Fixed-size means the object struct itself has a fixed size, not that the complete logical object has no auxiliary storage.

A function object is fixed-size as a struct, but it points to other objects such as its code object, globals dict, defaults tuple, closure tuple, and annotations.
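
A rough way to see this from Python is `sys.getsizeof`, which, as assumed here, reports the size of the object's own struct (plus any GC header) and not the objects it points to. The functions `tiny` and `bigger` are placeholders for this sketch.

```python
import sys


def tiny():
    return None


def bigger(a, b=tuple(range(1000)), *, c="x" * 1000):
    return [a, b, c]


# Assumption: sys.getsizeof reports the object's own struct (plus any GC
# header), not the objects it points to.
print(sys.getsizeof(0.0), sys.getsizeof(1e308))
print(sys.getsizeof(tiny), sys.getsizeof(bigger))
print(float.__itemsize__, type(tiny).__itemsize__)
```

Both floats report the same size, and so do both functions, even though `bigger` refers to a large defaults tuple and a long string. That auxiliary data lives in separate objects.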

## 8.7 `PyVarObject`

Variable-size objects use `PyVarObject`.

A simplified layout:

```c
typedef struct {
    PyObject ob_base;
    Py_ssize_t ob_size;
} PyVarObject;
```

It extends `PyObject` with an additional field:

```text
ob_size
```

`ob_size` usually stores the logical size of the object.

Examples:

```text
tuple length
bytes length
integer digit count (CPython versions before 3.12)
some internal variable-sized arrays
```

The common shape:

```text
PyVarObject
    ob_refcnt
    ob_type
    ob_size
    variable object payload
```

## 8.8 `PyObject_VAR_HEAD`

Variable-size extension types use:

```c
typedef struct {
    PyObject_VAR_HEAD
    PyObject *items[1];
} SmallArrayObject;
```

Conceptually:

```c
typedef struct {
    Py_ssize_t ob_refcnt;
    PyTypeObject *ob_type;
    Py_ssize_t ob_size;
    PyObject *items[1];
} SmallArrayObject;
```

The final array is declared with one element but serves as a variable-length payload. CPython allocates enough memory for the header plus the desired number of elements.

This pattern suits objects whose size does not change after allocation.

Tuples are the canonical example. A tuple’s length does not change after creation, so CPython can allocate one block containing the object header and item references.

## 8.9 What `ob_size` Means

`ob_size` does not mean “number of bytes occupied by this object.”

It means a type-specific size value.

For a tuple, it is the number of elements.

For bytes, it is the number of bytes.

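For integers in CPython versions before 3.12, its magnitude was the number of internal digits and its sign was the sign of the integer. Since 3.12, `int` keeps that information in a separate tag field and no longer uses `ob_size`.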

For a custom type, the meaning depends on the type’s implementation.

This is important. `ob_size` is interpreted by the object’s type-specific code.

```text
same field
different meaning per type
```

The type object knows how to interpret its own instances.
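
The raw field can be read with `ctypes` to confirm that one slot carries different meanings. This is a sketch under stated assumptions: CPython, with `ob_size` placed directly after the `PyObject` header and `object.__basicsize__` equal to `sizeof(PyObject)`. The helper `read_ob_size` is hypothetical and must only be given variable-size objects; on anything else it reads unrelated memory.

```python
import ctypes

# Assumption: CPython, where PyVarObject places ob_size directly after the
# PyObject header and object.__basicsize__ equals sizeof(PyObject).
OB_SIZE_OFFSET = object.__basicsize__


def read_ob_size(obj):
    """Read the raw ob_size field; only valid for variable-size objects."""
    return ctypes.c_ssize_t.from_address(id(obj) + OB_SIZE_OFFSET).value


items = ("a", "b", "c")
payload = b"hello"

print(read_ob_size(items), len(items))      # element count for a tuple
print(read_ob_size(payload), len(payload))  # byte count for bytes
```

For the tuple the field holds the number of elements, and for the bytes object it holds the number of bytes. `int` is deliberately left out because its use of the field depends on the CPython version.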

## 8.10 Tuple Layout

A tuple is a good example of a variable-size object.

Conceptually:

```c
typedef struct {
    PyObject_VAR_HEAD
    PyObject *ob_item[1];
} PyTupleObject;
```

A tuple of length 3 is allocated as one object with space for three item references:

```text
PyTupleObject
    ob_refcnt
    ob_type  ---> tuple type
    ob_size = 3
    ob_item[0] ---> object A
    ob_item[1] ---> object B
    ob_item[2] ---> object C
```

The tuple owns references to its items. When the tuple is destroyed, it decrements the reference count of each contained object.

The tuple does not own the objects exclusively. It owns references.

```python
a = []
t = (a,)
```

The tuple stores a reference to the list. The name `a` also stores a reference to the same list.
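
The reference held by the tuple shows up in the list's reference count. This sketch uses `sys.getrefcount`, whose absolute values are implementation details, so only the differences are compared.

```python
import sys

a = []
before = sys.getrefcount(a)

t = (a,)                      # the tuple now holds its own reference to the list
with_tuple = sys.getrefcount(a)
same_object = t[0] is a       # the slot refers to the list itself, not a copy

del t                         # destroying the tuple releases that reference
after = sys.getrefcount(a)

print(with_tuple - before, after - before, same_object)
```

Creating the tuple raises the count by one, and destroying it brings the count back to where it started.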

## 8.11 List Layout

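A list is also variable-length at the Python level, but its implementation differs from that of a tuple.

A list object has a fixed-size struct that points to a separately allocated array of item references. The struct still begins with `PyObject_VAR_HEAD` and uses `ob_size` for the current length, but the items live outside the struct.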

Conceptually:

```c
typedef struct {
    PyObject_VAR_HEAD
    PyObject **ob_item;
    Py_ssize_t allocated;
} PyListObject;
```

Shape:

```text
PyListObject
    ob_refcnt
    ob_type  ---> list type
    ob_size = current length
    ob_item  ----> separately allocated array
    allocated = current capacity
```

For a list:

```python
xs = [10, 20, 30]
```

the memory shape is approximately:

```text
list object
    ob_size = 3
    allocated >= 3
    ob_item ----+
                |
                v
              [ ptr to 10 ][ ptr to 20 ][ ptr to 30 ][ spare capacity... ]
```

This allows efficient append. The list can grow by reallocating the separate item array without moving the list object itself.

Object identity remains stable:

```python
xs = []
before = id(xs)

xs.append(1)
xs.append(2)
xs.append(3)

after = id(xs)

print(before == after)   # True
```

The list’s internal array may move. The list object itself remains the same object.
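
Spare capacity can be observed indirectly. The sketch below assumes that, for lists, `sys.getsizeof` includes the separately allocated item array, so the reported size changes only when that array is reallocated.

```python
import sys

xs = []
identity = id(xs)
sizes = []

for i in range(64):
    xs.append(i)
    # Assumption: for lists, sys.getsizeof includes the item array.
    sizes.append(sys.getsizeof(xs))

# Fewer distinct sizes than appends means capacity grew in steps.
print(len(set(sizes)), "distinct sizes over", len(sizes), "appends")
print(id(xs) == identity)
```

Most appends leave the reported size unchanged because they fill spare capacity. The exact growth steps are an implementation detail and vary between CPython versions.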

## 8.12 Why Objects Do Not Move

CPython generally does not move live objects.

A `PyObject *` is a direct pointer. Many parts of CPython and many C extensions may hold that pointer.

If CPython moved an object in memory, it would have to find and update every pointer to it. That would be expensive and incompatible with much C extension code.

So CPython uses a non-moving object model.

Consequences:

```text
object identity can be represented by address in CPython
C extensions can hold PyObject * pointers
objects are not compacted by a moving garbage collector
memory fragmentation must be managed differently
```

This is one reason CPython’s allocator design matters.

## 8.13 Casting Between Object Types

Because every object starts with a common header, CPython can cast concrete object pointers to `PyObject *`.

Example:

```c
PyObject *obj = (PyObject *)some_list;
```

But the reverse cast is only safe after type checking.

```c
if (PyList_Check(obj)) {
    PyListObject *list = (PyListObject *)obj;
}
```

Unsafe casting can corrupt memory or crash the interpreter.

Correct extension code follows this pattern:

```c
static PyObject *
get_size(PyObject *self, PyObject *arg)
{
    if (!PyList_Check(arg)) {
        PyErr_SetString(PyExc_TypeError, "expected list");
        return NULL;
    }

    Py_ssize_t n = PyList_GET_SIZE(arg);
    return PyLong_FromSsize_t(n);
}
```

The `PyList_GET_SIZE` macro assumes its argument is a list. The checked API variant is safer when the type is uncertain.

## 8.14 Checked APIs and Fast Macros

CPython exposes both checked functions and fast macros.

Checked form:

```c
Py_ssize_t n = PyList_Size(obj);
```

Fast macro form:

```c
Py_ssize_t n = PyList_GET_SIZE(obj);
```

The checked function validates the object and reports an error if the input is invalid.

The macro assumes the object is already valid and may directly access fields.

Tradeoff:

| Form             | Safety |  Speed | Use case                         |
| ---------------- | -----: | -----: | -------------------------------- |
| Checked function | Higher |  Lower | Public boundary, uncertain input |
| Fast macro       |  Lower | Higher | Internal code after validation   |

This pattern appears throughout the C API.
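
The checked behavior can be exercised from Python through `ctypes.pythonapi`, which calls C API functions and raises any exception they set. This is only a sketch: it assumes a CPython build in which `ctypes.pythonapi` can resolve the `PyList_Size` symbol. The fast macro has no such counterpart, since it performs no check to observe.

```python
import ctypes

# Assumption: a CPython build where ctypes.pythonapi can resolve C API symbols.
PyList_Size = ctypes.pythonapi.PyList_Size
PyList_Size.argtypes = [ctypes.py_object]
PyList_Size.restype = ctypes.c_ssize_t

print(PyList_Size([10, 20, 30]))

try:
    PyList_Size((10, 20, 30))      # a tuple is not a list
except SystemError as exc:
    # The checked function detected the wrong type and set an exception
    # instead of reading list fields from a tuple.
    print("rejected:", type(exc).__name__)
```

Passing a tuple produces an exception rather than a misread of memory. That safety is what the checked form buys at the cost of one type test per call.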

## 8.15 Type Object Size Fields

The type object describes instance size.

Important fields include:

```text
tp_basicsize
tp_itemsize
```

`tp_basicsize` is the fixed part of each instance.

`tp_itemsize` is the size of each variable-size item for variable-size objects.

For a fixed-size type:

```text
tp_basicsize = sizeof(MyObject)
tp_itemsize = 0
```

For a variable-size type:

```text
tp_basicsize = base header and fixed fields
tp_itemsize = size of each trailing item
```

Allocation can then compute:

```text
total size = tp_basicsize + n * tp_itemsize
```

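This is how CPython can allocate one memory block for objects such as tuples. The real allocator also rounds the size up for alignment and, for GC-tracked types, reserves room for the GC header.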
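
Python exposes both fields as `__basicsize__` and `__itemsize__`, so the formula can be compared against what tuples report. The helper `predicted_size` is written for this sketch. The comparison uses `__sizeof__` rather than `sys.getsizeof` on the assumption that `__sizeof__` excludes the GC header.

```python
def predicted_size(obj):
    """tp_basicsize + n * tp_itemsize, using the values Python exposes."""
    tp = type(obj)
    return tp.__basicsize__ + len(obj) * tp.__itemsize__


for t in ((), (1,), (1, 2), (1, 2, 3)):
    # __sizeof__ excludes any GC header, so it lines up with the formula.
    print(len(t), predicted_size(t), t.__sizeof__())
```

Each additional element adds one `tp_itemsize`, which for tuples is the size of a pointer.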

## 8.16 Minimal Fixed-Size Extension Object

A minimal fixed-size object layout:

```c
typedef struct {
    PyObject_HEAD
    long value;
} CounterObject;
```

A minimal type object sketch:

```c
static PyTypeObject CounterType = {
    PyVarObject_HEAD_INIT(NULL, 0)
    .tp_name = "example.Counter",
    .tp_basicsize = sizeof(CounterObject),
    .tp_itemsize = 0,
    .tp_flags = Py_TPFLAGS_DEFAULT,
};
```

The important point is structural.

```text
CounterObject starts with PyObject header.
CounterType says how large CounterObject is.
CPython allocates memory according to CounterType.
CPython treats the result as PyObject * at generic boundaries.
```

## 8.17 Minimal Variable-Size Extension Object

A variable-size object layout might look like:

```c
typedef struct {
    PyObject_VAR_HEAD
    PyObject *items[1];
} FixedArrayObject;
```

The type object would use a nonzero item size:

```c
static PyTypeObject FixedArrayType = {
    PyVarObject_HEAD_INIT(NULL, 0)
    .tp_name = "example.FixedArray",
    .tp_basicsize = offsetof(FixedArrayObject, items),
    .tp_itemsize = sizeof(PyObject *),
    .tp_flags = Py_TPFLAGS_DEFAULT,
};
```

Allocation would request a specific logical length.

Conceptually:

```text
allocate FixedArray with n items
    total bytes = tp_basicsize + n * tp_itemsize
    ob_size = n
```

This layout is useful when the number of contained references is known at creation time and does not change afterward.

## 8.18 Object Header Macros

Common macros include:

```c
Py_REFCNT(obj)
Py_TYPE(obj)
Py_SIZE(obj)
```

Their conceptual meanings:

```text
Py_REFCNT(obj)
    get reference count

Py_TYPE(obj)
    get type pointer

Py_SIZE(obj)
    get variable-size field
```

For example:

```c
PyTypeObject *type = Py_TYPE(obj);
```

and:

```c
Py_ssize_t n = Py_SIZE(tuple_obj);
```

Extension code should prefer official macros and functions over direct field access. This makes code more compatible with CPython changes.

## 8.19 Reference Ownership and Headers

The object header stores the count. It does not explain ownership by itself.

Ownership is a convention enforced by API rules.

A function returning a new reference transfers ownership to the caller:

```c
PyObject *x = PyLong_FromLong(42);
/* caller owns x */
Py_DECREF(x);
```

A function returning a borrowed reference does not transfer ownership:

```c
PyObject *item = PyList_GetItem(list, 0);
/* borrowed reference, do not DECREF unless INCREF first */
```

The same object header is involved in both cases. The difference is the contract of the API call.

This is why CPython extension programming is difficult. The memory layout is simple, but the ownership rules require discipline.

## 8.20 Object Initialization

Allocation and initialization are separate.

Allocation of an instance is usually handled by the type's allocation slot:

```text
tp_alloc
```

Object construction may involve:

```text
tp_new
tp_init
```

At Python level:

```python
obj = MyClass(...)
```

roughly means:

```text
call type object
    call tp_new to allocate or return object
    if the result is an instance of the type, call tp_init to initialize it
    return object
```

For immutable objects, `tp_new` often does most of the work because the value must be established before the object is exposed.

For mutable objects, `tp_init` can fill in state after allocation.
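
At the Python level the two slots correspond to `__new__` and `__init__`. The classes `Point` and `Celsius` below are invented for this sketch: the first records the call order for a mutable object, and the second shows an immutable value being fixed in `__new__`.

```python
calls = []


class Point:
    def __new__(cls, x, y):
        calls.append("__new__")
        return super().__new__(cls)      # allocate the instance

    def __init__(self, x, y):
        calls.append("__init__")
        self.x, self.y = x, y            # fill in mutable state


class Celsius(float):
    # float is immutable, so the value must be fixed in __new__.
    def __new__(cls, degrees):
        return super().__new__(cls, round(degrees, 1))


p = Point(1, 2)
c = Celsius(21.456)
print(calls, (p.x, p.y), c)
```

`__new__` runs before `__init__`. For `Celsius` there is nothing left for `__init__` to do, because the float value cannot be changed once the object exists.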

## 8.21 Deallocation

When an object’s reference count reaches zero, CPython calls the type-specific deallocator.

The type object stores this in:

```text
tp_dealloc
```

A deallocator usually performs these steps:

```text
untrack from cyclic GC if the object is tracked
release references owned by the object
free auxiliary buffers
free object memory
```

For a container, the deallocator must decrement references to contained objects.

Example shape:

```c
static void
Counter_dealloc(CounterObject *self)
{
    /* No owned references to release: just return the memory. */
    Py_TYPE(self)->tp_free((PyObject *)self);
}
```

For a container:

```c
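static void
Array_dealloc(ArrayObject *self)
{
    /* Release the reference held in each slot; slots may be NULL. */
    for (Py_ssize_t i = 0; i < Py_SIZE(self); i++) {
        Py_XDECREF(self->items[i]);
    }

    /* Then return the object's own memory. */
    Py_TYPE(self)->tp_free((PyObject *)self);
}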
```

This is simplified. Real code must handle garbage collector tracking and error-safe invariants.
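
The effect of a container deallocator can be watched from Python. In this sketch the hypothetical class `Item` records its own destruction, and a list holds the only references to two instances. It assumes CPython's immediate, reference-counted destruction.

```python
released = []


class Item:
    def __init__(self, name):
        self.name = name

    def __del__(self):
        # Runs when this object's reference count reaches zero.
        released.append(self.name)


box = [Item("a"), Item("b")]   # the list holds the only references
before = list(released)

del box                        # the list deallocator releases each item

print(before, sorted(released))
```

Nothing is released while the list is alive. Destroying the list releases both items. The output is sorted because the order in which a container releases its items is an implementation detail.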

## 8.22 Garbage Collector Header

Objects that participate in cyclic garbage collection may have an additional GC header before the visible `PyObject` header.

Conceptually:

```text
GC header
PyObject header
type-specific payload
```

The `PyObject *` points to the object header, not to the GC header.

```text
memory block start
    GC metadata
    ob_refcnt       <--- PyObject * points here
    ob_type
    payload
```

The GC header links the object into collector structures.

Usually only container-like objects, which can hold references and therefore participate in cycles, need this tracking.

An extension type that owns references to other Python objects and can participate in cycles must implement the GC protocol correctly.
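
The `gc` module exposes which objects are tracked and can be used to see why tracking is needed. In this sketch `Node` is a made-up class whose instance refers to itself, so its reference count alone can never reach zero.

```python
import gc
import weakref


class Node:
    pass


# Container-like objects are tracked; atomic values are not.
print(gc.is_tracked([]), gc.is_tracked(42), gc.is_tracked("text"))

node = Node()
node.me = node                 # reference cycle: the count cannot reach zero
probe = weakref.ref(node)
del node

gc.collect()                   # the cyclic collector finds and frees the cycle
print(probe() is None)
```

The list is tracked, while the integer and the string are not. The self-referencing `Node` is reclaimed only once the cyclic collector has run.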

## 8.23 Debug Builds

Debug builds may add extra fields or checks around objects.

This can include:

```text
reference count debugging
allocator padding
forbidden bytes around memory blocks
extra assertions
API misuse detection
```

For this reason, extension code should avoid assuming exact raw memory layout beyond documented macros.

Bad style:

```c
obj->ob_refcnt++;
```

Better style:

```c
Py_INCREF(obj);
```

Bad style:

```c
obj->ob_type
```

Better style:

```c
Py_TYPE(obj)
```

The macros are the compatibility layer between extension code and CPython internals.

## 8.24 Immortal Objects

CPython 3.12 and later have the concept of immortal objects (PEP 683) for selected long-lived objects. An immortal object uses a special reference count value and avoids normal reference count destruction.

Typical candidates include fundamental singletons such as `None`, `True`, and `False`, and other runtime-owned objects.

The practical point for internals readers:

```text
not every reference count behaves like an ordinary small integer
not every INCREF or DECREF has the same runtime effect
never write code that depends on exact refcount arithmetic unless working inside CPython internals
```

At the Python level, reference count inspection is already implementation-specific. At the C level, extension authors should use the official reference management APIs.
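
A short experiment makes the first two points concrete. It is a sketch whose result depends on the interpreter: it assumes that on CPython 3.12 and later `None` is immortal, so the reported count stays put, while on older versions the count rises with each new reference.

```python
import sys

before = sys.getrefcount(None)
refs = [None for _ in range(1000)]     # 1000 new references to None
after = sys.getrefcount(None)

# Assumption: on CPython 3.12+ None is immortal and the reported count does
# not move; on older versions it rises by roughly the number of new references.
print(sys.version_info[:2], after - before)
```

Code that asserts on exact reference counts for objects like `None` will therefore behave differently across versions.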

## 8.25 Why `PyObject` Matters

`PyObject` is the common currency of the CPython runtime.

The interpreter loop pushes and pops `PyObject *` values.

Function calls pass `PyObject *` arguments.

Containers store `PyObject *` references.

C extension APIs receive and return `PyObject *`.

Type slots operate on `PyObject *`.

Errors are represented by Python exception objects.

Modules are Python objects.

Classes are Python objects.

Functions are Python objects.

Code objects are Python objects.

This uniformity is what makes Python dynamic. The runtime can manipulate arbitrary values through one representation while dispatching behavior through type objects.

## 8.26 Useful Mental Model

When reading CPython C code, assume this shape first:

```text
PyObject *
    points to object header
        ob_refcnt
        ob_type
    followed by type-specific memory
```

Then ask:

```text
What concrete type is this object expected to be?
Is it fixed-size or variable-size?
Who owns this reference?
Can this object participate in reference cycles?
What type slot handles this operation?
Does this API return a new reference or borrowed reference?
```

These questions prevent most early confusion when reading CPython internals.

## 8.27 Summary

`PyObject` is the base header for CPython objects. It gives each object a reference count and a type pointer.

`PyVarObject` extends that header with a size field used by variable-size objects.

Fixed-size objects use `PyObject_HEAD`.

Variable-size objects use `PyObject_VAR_HEAD`.

The object header makes CPython’s dynamic object system possible. It allows generic runtime code to handle many concrete object layouts through `PyObject *`, while type objects provide the behavior needed for calls, arithmetic, indexing, attributes, deallocation, and protocol dispatch.

