Skip to content

8. PyObject and PyVarObject

The PyObject and PyVarObject C structs: ob_refcnt, ob_type, and the ob_size extension for variable-length objects.

PyObject and PyVarObject are the base layouts behind CPython objects. They are not Python classes. They are C-level struct conventions that allow the runtime to treat many different object implementations through a common pointer type.

At runtime, most object references in CPython are represented as:

PyObject *

This means “pointer to some Python object.” The pointed-to memory may actually be an integer object, list object, dict object, function object, type object, module object, or user-defined instance. The common object header makes this safe.

8.1 The Common Object Header

A simplified PyObject looks like this:

typedef struct {
    Py_ssize_t ob_refcnt;
    PyTypeObject *ob_type;
} PyObject;

The real definition uses macros and build-dependent fields, especially in debug builds, tracing builds, and modern CPython versions. But the essential idea is stable:

PyObject
    reference count
    type pointer

The reference count tracks ownership.

The type pointer tells CPython how the object behaves.

Every normal CPython object begins with this common header. Because of that, the runtime can receive a PyObject * and inspect its type without knowing the full concrete struct at compile time.

8.2 Why Every Object Starts the Same Way

Consider this Python code:

x = 42
y = "hello"
z = [1, 2, 3]

At the C level, these objects have different layouts.

PyLongObject
    object header
    integer digit data

PyUnicodeObject
    object header
    string metadata
    character storage

PyListObject
    object header
    length
    allocated capacity
    pointer to item array

But each one starts with the same header:

+--------------------+
| ob_refcnt          |
+--------------------+
| ob_type            |
+--------------------+
| type-specific data |
+--------------------+

This allows generic runtime code to work with all objects.

For example, Py_INCREF(obj) only needs the reference count field. It does not need to know whether obj is a list, string, dict, or function.

Likewise, Py_TYPE(obj) only needs the type pointer field.

8.3 ob_refcnt

ob_refcnt stores the object’s reference count.

Reference counting is CPython’s primary object lifetime mechanism. When code creates, stores, returns, or releases object references, CPython updates this count.

Simplified:

#define Py_INCREF(op) ((op)->ob_refcnt++)
#define Py_DECREF(op)                         \
    do {                                      \
        if (--(op)->ob_refcnt == 0) {         \
            deallocate_object(op);            \
        }                                     \
    } while (0)

The real implementation is more complex. It handles immortal objects, debug hooks, tracing, free-threaded builds, and deallocation details.

The conceptual rule is:

new strong reference acquired
    increment reference count

strong reference released
    decrement reference count

reference count reaches zero
    destroy object

Example:

x = []
y = x
del x
del y

The list object survives while at least one strong reference remains.

8.4 ob_type

ob_type points to the object’s type object.

For this Python code:

x = []

the list object’s ob_type points to the list type object.

Conceptually:

x  --->  PyListObject
            ob_refcnt
            ob_type  ---->  PyList_Type
            ob_size
            ob_item
            allocated

The type object describes behavior:

object name
object size
base classes
method table
attribute lookup behavior
call behavior
deallocation behavior
number operations
sequence operations
mapping operations

This is how CPython dispatches operations.

When code evaluates:

len(x)

CPython checks the type’s length slot.

When code evaluates:

x[0]

CPython checks sequence or mapping behavior.

When code evaluates:

x + y

CPython checks numeric or sequence concatenation slots depending on the types.

8.5 PyObject_HEAD

Extension types do not usually write the fields manually. They use macros.

A fixed-size extension object often starts like this:

typedef struct {
    PyObject_HEAD
    long value;
} CounterObject;

PyObject_HEAD expands to the fields required for the object header.

Conceptually:

typedef struct {
    Py_ssize_t ob_refcnt;
    PyTypeObject *ob_type;
    long value;
} CounterObject;

The macro exists because CPython may change header details depending on build configuration. Extension code should use the macro rather than assuming the exact fields.

8.6 Fixed-Size Objects

A fixed-size object has the same C struct size for every instance of that type.

Example shape:

typedef struct {
    PyObject_HEAD
    double value;
} FloatLikeObject;

Every instance has room for exactly one double.

Many objects are fixed-size at the object struct level:

float
module
function
method
cell
weakref
many iterator objects
many descriptor objects

The object may still refer to external or separately allocated data. Fixed-size means the object struct itself has a fixed size, not that the complete logical object has no auxiliary storage.

A function object is fixed-size as a struct, but it points to other objects such as its code object, globals dict, defaults tuple, closure tuple, and annotations.

8.7 PyVarObject

Variable-size objects use PyVarObject.

A simplified layout:

typedef struct {
    PyObject ob_base;
    Py_ssize_t ob_size;
} PyVarObject;

It extends PyObject with an additional field:

ob_size

ob_size usually stores the logical size of the object.

Examples:

tuple length
bytes length
integer digit count
some internal variable-sized arrays

The common shape:

PyVarObject
    ob_refcnt
    ob_type
    ob_size
    variable object payload

8.8 PyObject_VAR_HEAD

Variable-size extension types use:

typedef struct {
    PyObject_VAR_HEAD
    PyObject *items[1];
} SmallArrayObject;

Conceptually:

typedef struct {
    Py_ssize_t ob_refcnt;
    PyTypeObject *ob_type;
    Py_ssize_t ob_size;
    PyObject *items[1];
} SmallArrayObject;

The final array is often a flexible or variable-length payload. CPython allocates enough memory for the header plus the desired number of elements.

This pattern is used when the object’s size is fixed after allocation.

Tuples are the canonical example. A tuple’s length does not change after creation, so CPython can allocate one block containing the object header and item references.

8.9 What ob_size Means

ob_size does not mean “number of bytes occupied by this object.”

It means a type-specific size value.

For a tuple, it is the number of elements.

For bytes, it is the number of bytes.

For long integers, it is related to the number of internal base digits and may encode sign.

For a custom type, the meaning depends on the type’s implementation.

This is important. ob_size is interpreted by the object’s type-specific code.

same field
different meaning per type

The type object knows how to interpret its own instances.

8.10 Tuple Layout

A tuple is a good example of a variable-size object.

Conceptually:

typedef struct {
    PyObject_VAR_HEAD
    PyObject *ob_item[1];
} PyTupleObject;

A tuple of length 3 is allocated as one object with space for three item references:

PyTupleObject
    ob_refcnt
    ob_type  ---> tuple type
    ob_size = 3
    ob_item[0] ---> object A
    ob_item[1] ---> object B
    ob_item[2] ---> object C

The tuple owns references to its items. When the tuple is destroyed, it decrements the reference count of each contained object.

The tuple does not own the objects exclusively. It owns references.

a = []
t = (a,)

The tuple stores a reference to the list. The name a also stores a reference to the same list.

8.11 List Layout

A list is also variable-length at the Python level, but its implementation differs from tuple.

A list object has a fixed-size struct that points to a separately allocated array of item references.

Conceptually:

typedef struct {
    PyObject_VAR_HEAD
    PyObject **ob_item;
    Py_ssize_t allocated;
} PyListObject;

Shape:

PyListObject
    ob_refcnt
    ob_type  ---> list type
    ob_size = current length
    ob_item  ----> separately allocated array
    allocated = current capacity

For a list:

xs = [10, 20, 30]

the memory shape is approximately:

list object
    ob_size = 3
    allocated >= 3
    ob_item ----+
                |
                v
              [ ptr to 10 ][ ptr to 20 ][ ptr to 30 ][ spare capacity... ]

This allows efficient append. The list can grow by reallocating the separate item array without moving the list object itself.

Object identity remains stable:

xs = []
before = id(xs)

xs.append(1)
xs.append(2)
xs.append(3)

after = id(xs)

print(before == after)   # True

The list’s internal array may move. The list object itself remains the same object.

8.12 Why Objects Do Not Move

CPython generally does not move live objects.

A PyObject * is a direct pointer. Many parts of CPython and many C extensions may hold that pointer.

If CPython moved an object in memory, it would have to find and update every pointer to it. That would be expensive and incompatible with much C extension code.

So CPython uses a non-moving object model.

Consequences:

object identity can be represented by address in CPython
C extensions can hold PyObject * pointers
objects are not compacted by a moving garbage collector
memory fragmentation must be managed differently

This is one reason CPython’s allocator design matters.

8.13 Casting Between Object Types

Because every object starts with a common header, CPython can cast concrete object pointers to PyObject *.

Example:

PyObject *obj = (PyObject *)some_list;

But the reverse cast is only safe after type checking.

if (PyList_Check(obj)) {
    PyListObject *list = (PyListObject *)obj;
}

Unsafe casting can corrupt memory or crash the interpreter.

Correct extension code follows this pattern:

static PyObject *
get_size(PyObject *self, PyObject *arg)
{
    if (!PyList_Check(arg)) {
        PyErr_SetString(PyExc_TypeError, "expected list");
        return NULL;
    }

    Py_ssize_t n = PyList_GET_SIZE(arg);
    return PyLong_FromSsize_t(n);
}

The PyList_GET_SIZE macro assumes its argument is a list. The checked API variant is safer when the type is uncertain.

8.14 Checked APIs and Fast Macros

CPython exposes both checked functions and fast macros.

Checked form:

Py_ssize_t n = PyList_Size(obj);

Fast macro form:

Py_ssize_t n = PyList_GET_SIZE(obj);

The checked function validates the object and reports an error if the input is invalid.

The macro assumes the object is already valid and may directly access fields.

Tradeoff:

FormSafetySpeedUse case
Checked functionHigherLowerPublic boundary, uncertain input
Fast macroLowerHigherInternal code after validation

This pattern appears throughout the C API.

8.15 Type Object Size Fields

The type object describes instance size.

Important fields include:

tp_basicsize
tp_itemsize

tp_basicsize is the fixed part of each instance.

tp_itemsize is the size of each variable-size item for variable-size objects.

For a fixed-size type:

tp_basicsize = sizeof(MyObject)
tp_itemsize = 0

For a variable-size type:

tp_basicsize = base header and fixed fields
tp_itemsize = size of each trailing item

Allocation can then compute:

total size = tp_basicsize + n * tp_itemsize

This is how CPython can allocate one memory block for objects such as tuples.

8.16 Minimal Fixed-Size Extension Object

A minimal fixed-size object layout:

typedef struct {
    PyObject_HEAD
    long value;
} CounterObject;

A minimal type object sketch:

static PyTypeObject CounterType = {
    PyVarObject_HEAD_INIT(NULL, 0)
    .tp_name = "example.Counter",
    .tp_basicsize = sizeof(CounterObject),
    .tp_itemsize = 0,
    .tp_flags = Py_TPFLAGS_DEFAULT,
};

The important point is structural.

CounterObject starts with PyObject header.
CounterType says how large CounterObject is.
CPython allocates memory according to CounterType.
CPython treats the result as PyObject * at generic boundaries.

8.17 Minimal Variable-Size Extension Object

A variable-size object layout might look like:

typedef struct {
    PyObject_VAR_HEAD
    PyObject *items[1];
} FixedArrayObject;

The type object would use a nonzero item size:

static PyTypeObject FixedArrayType = {
    PyVarObject_HEAD_INIT(NULL, 0)
    .tp_name = "example.FixedArray",
    .tp_basicsize = offsetof(FixedArrayObject, items),
    .tp_itemsize = sizeof(PyObject *),
    .tp_flags = Py_TPFLAGS_DEFAULT,
};

Allocation would request a specific logical length.

Conceptually:

allocate FixedArray with n items
    total bytes = tp_basicsize + n * tp_itemsize
    ob_size = n

This layout is useful when the number of contained references is known at creation time and does not change afterward.

8.18 Object Header Macros

Common macros include:

Py_REFCNT(obj)
Py_TYPE(obj)
Py_SIZE(obj)

Their conceptual meanings:

Py_REFCNT(obj)
    get reference count

Py_TYPE(obj)
    get type pointer

Py_SIZE(obj)
    get variable-size field

For example:

PyTypeObject *type = Py_TYPE(obj);

and:

Py_ssize_t n = Py_SIZE(tuple_obj);

Extension code should prefer official macros and functions over direct field access. This makes code more compatible with CPython changes.

8.19 Reference Ownership and Headers

The object header stores the count. It does not explain ownership by itself.

Ownership is a convention enforced by API rules.

A function returning a new reference transfers ownership to the caller:

PyObject *x = PyLong_FromLong(42);
/* caller owns x */
Py_DECREF(x);

A function returning a borrowed reference does not transfer ownership:

PyObject *item = PyList_GetItem(list, 0);
/* borrowed reference, do not DECREF unless INCREF first */

The same object header is involved in both cases. The difference is the contract of the API call.

This is why CPython extension programming is difficult. The memory layout is simple, but the ownership rules require discipline.

8.20 Object Initialization

Allocation and initialization are separate.

For a type object, allocation is usually handled by:

tp_alloc

Object construction may involve:

tp_new
tp_init

At Python level:

obj = MyClass(...)

roughly means:

call type object
    call tp_new to allocate or return object
    call tp_init to initialize object
    return object

For immutable objects, tp_new often does most of the work because the value must be established before the object is exposed.

For mutable objects, tp_init can fill in state after allocation.

8.21 Deallocation

When an object’s reference count reaches zero, CPython calls the type-specific deallocator.

The type object stores this in:

tp_dealloc

A deallocator usually performs these steps:

release references owned by the object
free auxiliary buffers
untrack from cyclic GC if needed
free object memory

For a container, the deallocator must decrement references to contained objects.

Example shape:

static void
Counter_dealloc(CounterObject *self)
{
    Py_TYPE(self)->tp_free((PyObject *)self);
}

For a container:

static void
Array_dealloc(ArrayObject *self)
{
    for (Py_ssize_t i = 0; i < Py_SIZE(self); i++) {
        Py_XDECREF(self->items[i]);
    }

    Py_TYPE(self)->tp_free((PyObject *)self);
}

This is simplified. Real code must handle garbage collector tracking and error-safe invariants.

8.22 Garbage Collector Header

Objects that participate in cyclic garbage collection may have an additional GC header before the visible PyObject header.

Conceptually:

GC header
PyObject header
type-specific payload

The PyObject * points to the object header, not to the GC header.

memory block start
    GC metadata
    ob_refcnt       <--- PyObject * points here
    ob_type
    payload

The GC header links the object into collector structures.

Only container-like objects that can participate in cycles usually need this tracking.

An extension type that owns references to other Python objects and can participate in cycles must implement the GC protocol correctly.

8.23 Debug Builds

Debug builds may add extra fields or checks around objects.

This can include:

reference count debugging
allocator padding
forbidden bytes around memory blocks
extra assertions
API misuse detection

For this reason, extension code should avoid assuming exact raw memory layout beyond documented macros.

Bad style:

obj->ob_refcnt++;

Better style:

Py_INCREF(obj);

Bad style:

obj->ob_type

Better style:

Py_TYPE(obj)

The macros are the compatibility layer between extension code and CPython internals.

8.24 Immortal Objects

Modern CPython has the concept of immortal objects for selected long-lived objects. An immortal object uses a special reference count value and avoids normal reference count destruction.

Typical candidates include fundamental singleton-like or runtime-owned objects.

The practical point for internals readers:

not every reference count behaves like an ordinary small integer
not every INCREF or DECREF has the same runtime effect
never write code that depends on exact refcount arithmetic unless working inside CPython internals

At the Python level, reference count inspection is already implementation-specific. At the C level, extension authors should use the official reference management APIs.

8.25 Why PyObject Matters

PyObject is the common currency of the CPython runtime.

The interpreter loop pushes and pops PyObject * values.

Function calls pass PyObject * arguments.

Containers store PyObject * references.

C extension APIs receive and return PyObject *.

Type slots operate on PyObject *.

Errors are represented by Python exception objects.

Modules are Python objects.

Classes are Python objects.

Functions are Python objects.

Code objects are Python objects.

This uniformity is what makes Python dynamic. The runtime can manipulate arbitrary values through one representation while dispatching behavior through type objects.

8.26 Useful Mental Model

When reading CPython C code, assume this shape first:

PyObject *
    points to object header
        ob_refcnt
        ob_type
    followed by type-specific memory

Then ask:

What concrete type is this object expected to be?
Is it fixed-size or variable-size?
Who owns this reference?
Can this object participate in reference cycles?
What type slot handles this operation?
Does this API return a new reference or borrowed reference?

These questions prevent most early confusion when reading CPython internals.

8.27 Summary

PyObject is the base header for CPython objects. It gives each object a reference count and a type pointer.

PyVarObject extends that header with a size field used by variable-size objects.

Fixed-size objects use PyObject_HEAD.

Variable-size objects use PyObject_VAR_HEAD.

The object header makes CPython’s dynamic object system possible. It allows generic runtime code to handle many concrete object layouts through PyObject *, while type objects provide the behavior needed for calls, arithmetic, indexing, attributes, deallocation, and protocol dispatch.