# 9. Reference Counting

Reference counting is CPython’s primary memory management mechanism. Every ordinary object carries a count of strong references that currently point to it. When that count falls to zero, CPython can destroy the object immediately.

This design is one of the clearest differences between CPython and many other language runtimes. CPython does have a cyclic garbage collector, but the collector supplements reference counting. Most objects are reclaimed by reference count transitions, not by periodic tracing.

## 9.1 The Core Idea

A Python object remains alive while something owns a reference to it.

Conceptually:

```text
object created
    reference count = 1

another owner stores the object
    reference count += 1

an owner releases the object
    reference count -= 1

reference count reaches 0
    object is deallocated
```

Example:

```python
x = []
y = x

del x
del y
```

The list object is created and bound to `x`. Binding `y = x` creates another reference to the same list. Deleting `x` removes one reference. Deleting `y` removes the remaining reference, so the list can be destroyed.
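The effect of each `del` can be observed from Python. The following is a minimal sketch rather than a definitive test: it swaps the list for an instance of a hypothetical `Node` class, because lists do not support weak references, and it uses `weakref.ref` as a probe that does not keep the object alive. The immediate destruction it shows is CPython behavior.

```python
import weakref


class Node:
    """Hypothetical stand-in for the list; plain instances can be weakly referenced."""


def lifetime_trace():
    x = Node()
    probe = weakref.ref(x)   # weak reference: observes the object without owning it
    y = x                    # second strong reference to the same object

    states = []
    del x
    states.append(probe() is not None)   # still alive: y owns a reference
    del y
    states.append(probe() is not None)   # last strong reference is gone
    return states


print(lifetime_trace())   # on CPython: [True, False]
```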

At the C level, the count is stored in the object header, shown here in simplified form:

```c
typedef struct {
    Py_ssize_t ob_refcnt;
    PyTypeObject *ob_type;
} PyObject;
```

`ob_refcnt` is the reference count field.

## 9.2 Strong References

A strong reference keeps an object alive.

Most normal Python bindings are strong references:

```python
x = object()
items = [x]
d = {"key": x}
```

Here, the object has references from:

```text
local name x
list element items[0]
dictionary value d["key"]
```

As long as at least one strong reference remains, the object cannot be destroyed.

Containers hold strong references to their elements. A list does not store raw values. It stores pointers to Python objects and owns references to them.

```python
a = []
b = [a]
```

The list `b` owns a reference to the list `a`.
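The owners described in this section can be counted, at least approximately, with `sys.getrefcount`. The sketch below compares counts instead of asserting absolute values, since the absolute number includes temporary references and varies between CPython versions. The function name `count_owners` is purely illustrative.

```python
import sys


def count_owners():
    x = object()
    baseline = sys.getrefcount(x)      # absolute value is version-dependent

    items = [x]                        # list element owns a reference
    d = {"key": x}                     # dictionary value owns a reference
    with_containers = sys.getrefcount(x)

    del items, d                       # destroying the containers releases both
    after = sys.getrefcount(x)

    return with_containers - baseline, after - baseline


print(count_owners())   # on CPython: (2, 0)
```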

## 9.3 Borrowed References and Owned References

At the C API level, reference counting is controlled by ownership rules.

There are two common categories:

| Reference kind     | Meaning                                                                |
| ------------------ | ---------------------------------------------------------------------- |
| New reference      | Caller owns the reference and must release it                          |
| Borrowed reference | Caller may use the object temporarily but does not own a new reference |

Example of a new reference:

```c
PyObject *x = PyLong_FromLong(42);

/* x is a new reference */
Py_DECREF(x);
```

`PyLong_FromLong` creates or returns an object and gives the caller ownership of one reference.

Example of a borrowed reference:

```c
PyObject *item = PyList_GetItem(list, 0);

/* item is borrowed */
```

The caller must not call `Py_DECREF(item)` unless it first converts the borrowed reference into an owned reference with `Py_INCREF`.

Correct pattern:

```c
PyObject *item = PyList_GetItem(list, 0);
if (item == NULL) {
    return NULL;
}

Py_INCREF(item);
/* now item is owned */

/* use item */

Py_DECREF(item);
```

Borrowed references are efficient but dangerous. If the container that owns the object is modified or destroyed, the borrowed pointer may become invalid.

## 9.4 `Py_INCREF`

`Py_INCREF` records that one more owner now holds a strong reference.

Conceptually:

```c
Py_INCREF(obj);
```

means:

```text
I am going to keep this object alive.
```

Typical cases:

```text
store object into a container
store object into a struct field
return an existing object as a new reference
keep a borrowed reference beyond its safe lifetime
```

Example:

```c
typedef struct {
    PyObject_HEAD
    PyObject *value;
} BoxObject;
```

If a `BoxObject` stores another Python object, it must increment the reference count:

```c
static int
Box_set_value(BoxObject *self, PyObject *value)
{
    Py_INCREF(value);
    Py_XDECREF(self->value);
    self->value = value;
    return 0;
}
```

The box becomes an owner of `value`.

## 9.5 `Py_DECREF`

`Py_DECREF` releases ownership of a reference.

Conceptually:

```c
Py_DECREF(obj);
```

means:

```text
I no longer need to keep this object alive.
```

If the reference count reaches zero, CPython destroys the object.

Simplified:

```c
#define Py_DECREF(op)                  \
    do {                               \
        if (--(op)->ob_refcnt == 0) {  \
            dealloc(op);               \
        }                              \
    } while (0)
```

The real implementation is more complex, but this captures the rule.

Every new reference must eventually be released.

```c
PyObject *x = PyLong_FromLong(42);
if (x == NULL) {
    return NULL;
}

/* use x */

Py_DECREF(x);
```

Forgetting the `Py_DECREF` leaks the object.

Calling `Py_DECREF` too many times can destroy an object while it is still in use.
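For experimentation only, the exported functions `Py_IncRef` and `Py_DecRef` (the function forms of the two macros) can be called from Python through `ctypes.pythonapi`. The sketch below assumes a CPython interpreter and uses a hypothetical `Token` class with a weak-reference probe to show that a manual increment keeps an object alive until the matching decrement. Production code should never manage counts this way.

```python
import ctypes
import weakref

# Function forms of Py_INCREF / Py_DECREF exported by the CPython runtime.
py_incref = ctypes.pythonapi.Py_IncRef
py_incref.argtypes = [ctypes.py_object]
py_incref.restype = None

py_decref = ctypes.pythonapi.Py_DecRef
py_decref.argtypes = [ctypes.py_object]
py_decref.restype = None


class Token:
    """Hypothetical class used only so the object can be weakly referenced."""


def manual_ownership():
    t = Token()
    probe = weakref.ref(t)

    py_incref(t)                 # a second, manually managed owner
    del t                        # the name's reference is released
    alive_after_del = probe() is not None

    obj = probe()                # temporary strong reference to pass to the call
    py_decref(obj)               # release the manual owner
    del obj                      # release the temporary; the count reaches zero
    destroyed = probe() is None

    return alive_after_del, destroyed


print(manual_ownership())   # on CPython: (True, True)
```

Omitting the `py_decref` call would leak the object, and calling it twice would release a reference that nobody owns.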

## 9.6 `Py_XINCREF` and `Py_XDECREF`

Some pointers may be `NULL`.

`Py_INCREF` and `Py_DECREF` require a valid object pointer. Their `X` variants accept `NULL`.

```c
Py_XINCREF(obj);
Py_XDECREF(obj);
```

They are commonly used for optional fields:

```c
typedef struct {
    PyObject_HEAD
    PyObject *name;   /* may be NULL */
} UserObject;
```

Deallocator:

```c
static void
User_dealloc(UserObject *self)
{
    Py_XDECREF(self->name);
    Py_TYPE(self)->tp_free((PyObject *)self);
}
```

If `self->name` is `NULL`, `Py_XDECREF` does nothing.

## 9.7 Assignment at the Python Level

Python assignment changes bindings.

```python
x = []
x = {}
```

The first assignment binds `x` to a list. The second assignment rebinds `x` to a dict.

At the CPython level, rebinding roughly means:

```text
increment reference to new object
store new pointer in local variable slot
decrement reference to previous object
```

The order matters. CPython must avoid destroying an object too early, especially when assignments involve overlapping references or error paths.

Example:

```python
x = []
y = x
x = None
```

After `x = None`, the list remains alive because `y` still refers to it.
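A short sketch of the same sequence, using a hypothetical `Payload` class in place of the list so that a weak reference can act as a probe. The final result depends on CPython's immediate destruction.

```python
import weakref


class Payload:
    """Hypothetical stand-in for the list; lists cannot be weakly referenced."""


def rebinding_demo():
    x = Payload()
    probe = weakref.ref(x)
    y = x

    x = None                           # rebinding x releases only x's reference
    alive_via_y = probe() is y         # same object, still reachable through y

    y = None                           # releases the last strong reference
    destroyed = probe() is None

    return alive_via_y, destroyed


print(rebinding_demo())   # on CPython: (True, True)
```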

## 9.8 Containers Own References

When an object is inserted into a container, the container owns a reference.

```python
x = object()
items = []
items.append(x)
```

The object is referenced by both `x` and `items[0]`.

At the C level, a container must increment the reference count when it stores an object.

Simplified list append logic:

```text
append item to list
    ensure capacity
    increment item reference count
    store item pointer
    increase list size
```

When the list is destroyed, it releases references to its elements:

```text
destroy list
    for each item:
        decrement item reference count
    free item array
    free list object
```

This ownership model is recursive. Destroying a container may trigger destruction of contained objects if no other references exist.
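The recursive release can be made visible with finalizers. In this sketch the `Tracked` class and the `log` list are illustrative assumptions. The output is sorted because the order in which a container releases its elements is an implementation detail.

```python
class Tracked:
    """Hypothetical object that records its own finalization."""

    def __init__(self, name, log):
        self.name = name
        self.log = log

    def __del__(self):
        self.log.append(self.name)


def destroy_nested():
    log = []
    outer = [Tracked("a", log), [Tracked("b", log)]]   # a list containing a list

    before = list(log)    # nothing is finalized while outer is alive
    del outer             # releases outer, then its elements, recursively
    return before, sorted(log)


print(destroy_nested())   # on CPython: ([], ['a', 'b'])
```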

## 9.9 Replacing References Safely

A common C pattern is replacing one owned field with another.

Incorrect:

```c
Py_DECREF(self->value);
self->value = new_value;
Py_INCREF(new_value);
```

This fails when `new_value` is the same object as `self->value` and the field holds the last reference: the `Py_DECREF` destroys the object before the `Py_INCREF` runs.

Correct:

```c
Py_INCREF(new_value);
Py_DECREF(self->value);
self->value = new_value;
```

For nullable fields:

```c
Py_XINCREF(new_value);
Py_XDECREF(self->value);
self->value = new_value;
```

This pattern is simple but important:

```text
increment new reference first
decrement old reference second
store pointer when ownership is safe
```
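The self-assignment hazard is easier to see in a toy model. The sketch below is not CPython code: `ToyObject`, `ToyBox`, `incref`, and `decref` are hypothetical names that mimic a manual counter in pure Python, with an `alive` flag standing in for deallocation.

```python
class ToyObject:
    """Toy model of a refcounted object; not a real CPython structure."""

    def __init__(self, name):
        self.name = name
        self.refcnt = 1      # the creator owns the first reference
        self.alive = True


def incref(obj):
    if not obj.alive:
        raise RuntimeError(f"use after free: {obj.name}")
    obj.refcnt += 1


def decref(obj):
    if not obj.alive:
        raise RuntimeError(f"use after free: {obj.name}")
    obj.refcnt -= 1
    if obj.refcnt == 0:
        obj.alive = False    # stands in for deallocation


class ToyBox:
    def __init__(self, value):
        incref(value)
        self.value = value


def set_wrong(box, new_value):
    decref(box.value)        # may destroy the object...
    box.value = new_value
    incref(new_value)        # ...that is about to be incremented


def set_right(box, new_value):
    incref(new_value)
    decref(box.value)
    box.value = new_value


def self_assign(setter):
    value = ToyObject("v")
    box = ToyBox(value)
    decref(value)            # creator lets go; the box holds the only reference
    try:
        setter(box, box.value)
    except RuntimeError as exc:
        return str(exc)
    return f"ok, refcnt={box.value.refcnt}"


print(self_assign(set_right))   # ok, refcnt=1
print(self_assign(set_wrong))   # use after free: v
```

CPython also provides the `Py_SETREF` and `Py_XSETREF` macros, which store the new pointer before releasing the old one, so that code triggered by the release never sees a field pointing at a destroyed object. They do not increment the new value.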

## 9.10 Reference Stealing

Some C API functions steal references.

A function that steals a reference takes ownership from the caller. After the call, the caller must not decrement that reference.

Example pattern:

```c
PyObject *value = PyLong_FromLong(42);
if (value == NULL) {
    return NULL;
}

if (PyTuple_SetItem(tuple, 0, value) < 0) {
    /* value has already been released by PyTuple_SetItem */
    return NULL;
}

/* do not Py_DECREF(value) here */
```

`PyTuple_SetItem` steals the reference to `value` whether it succeeds or fails. On failure it releases `value` itself, so an extra `Py_DECREF` in the error branch would be a double release.

This convention exists for efficiency, especially when building containers from newly created objects.

Reference stealing is a frequent source of bugs. The rule must be read from the API documentation or known C API convention. It cannot be inferred from the object pointer type.

## 9.11 Error Paths

Reference counting bugs often occur on error paths.

Example:

```c
PyObject *a = PyLong_FromLong(1);
if (a == NULL) {
    return NULL;
}

PyObject *b = PyLong_FromLong(2);
if (b == NULL) {
    return NULL;   /* leak: a was not released */
}
```

Correct:

```c
PyObject *a = PyLong_FromLong(1);
if (a == NULL) {
    return NULL;
}

PyObject *b = PyLong_FromLong(2);
if (b == NULL) {
    Py_DECREF(a);
    return NULL;
}
```

A common style uses one cleanup block:

```c
PyObject *a = NULL;
PyObject *b = NULL;
PyObject *result = NULL;

a = PyLong_FromLong(1);
if (a == NULL) {
    goto error;
}

b = PyLong_FromLong(2);
if (b == NULL) {
    goto error;
}

result = PyNumber_Add(a, b);
if (result == NULL) {
    goto error;
}

Py_DECREF(a);
Py_DECREF(b);
return result;

error:
Py_XDECREF(a);
Py_XDECREF(b);
return NULL;
```

This pattern scales better as more owned references are introduced.

## 9.12 Returning References

A C function exposed to Python usually returns a new reference.

Example:

```c
static PyObject *
answer(PyObject *self, PyObject *args)
{
    return PyLong_FromLong(42);
}
```

The returned object is a new reference. The interpreter receives it and takes responsibility for it.

If returning an existing object, increment first:

```c
static PyObject *
get_cached_value(MyObject *self, PyObject *Py_UNUSED(ignored))
{
    Py_INCREF(self->cached_value);
    return self->cached_value;
}
```

Returning a stored object without incrementing it first is a serious bug:

```c
static PyObject *
bad_get_cached_value(MyObject *self, PyObject *Py_UNUSED(ignored))
{
    return self->cached_value;   /* wrong: no new reference for the caller */
}
```

The caller expects to own the return value. If the count was not incremented, the caller's eventual `Py_DECREF` releases the reference that belongs to `self`, so the object can be destroyed while `self->cached_value` still points to it. The result is use-after-free behavior.

## 9.13 `None`, `True`, and `False`

Returning `None` is common.

Use:

```c
Py_RETURN_NONE;
```

Conceptually, this increments `None` and returns it.

Equivalent shape:

```c
Py_INCREF(Py_None);
return Py_None;
```

Similarly:

```c
Py_RETURN_TRUE;
Py_RETURN_FALSE;
```

These macros avoid mistakes and make intent clear.

At Python level:

```python
def f():
    return None
```

At C level, the function must still return an owned reference to the `None` object.

## 9.14 Temporary References

Many operations create temporary references.

Example:

```c
PyObject *a = PyLong_FromLong(10);
PyObject *b = PyLong_FromLong(20);
PyObject *sum = PyNumber_Add(a, b);
```

Here, `a`, `b`, and `sum` are all owned references if creation succeeds.

Correct cleanup:

```c
Py_DECREF(a);
Py_DECREF(b);

return sum;
```

Do not decrement `sum` before returning it, because the caller receives ownership.

This division is important:

```text
objects used temporarily inside the function
    DECREF before return

object returned to caller
    return as owned reference
```

## 9.15 Reference Counts and Function Calls

When a Python function is called, arguments are passed as object references.

```python
def f(x):
    return x

obj = []
y = f(obj)
```

The object is not copied. The function receives a reference to the same list.

At the interpreter level, function call machinery manages references for argument objects, local variables, stack values, and return values.

Conceptually:

```text
caller has reference to obj
call passes reference into callee frame
callee local x refers to same object
return value refers to same object
callee frame is cleared
caller receives returned reference
```

The exact mechanics are optimized, but the ownership invariant remains: every live reference must be accounted for.
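That the callee works on the caller's object, not a copy, can be checked directly. In this small sketch, `append_marker` is a hypothetical helper added for illustration.

```python
def f(x):
    return x


def append_marker(seq):
    seq.append("seen by callee")   # mutates the caller's object in place


obj = []
y = f(obj)
append_marker(obj)

print(y is obj)   # True: the return value is the same list, not a copy
print(y)          # ['seen by callee']
```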

## 9.16 Stack References

The bytecode interpreter uses a value stack inside each frame.

Bytecode instructions push and pop object references.

Example:

```python
a + b
```

Conceptually:

```text
LOAD_FAST a
    push reference to a

LOAD_FAST b
    push reference to b

BINARY_OP +
    pop two references
    compute result
    push result reference
```

Stack entries are references. When values are popped and no longer needed, CPython must release the corresponding references.

The evaluation loop is therefore a large reference-management system. It must preserve ownership across normal execution, exceptions, jumps, returns, and frame teardown.
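The actual instruction sequence for an addition can be inspected with the standard `dis` module. Instruction names differ between CPython versions (some versions fuse the two loads into one instruction, for example), so this sketch only prints them and makes no assumption about the exact listing.

```python
import dis


def add(a, b):
    return a + b


# Collect the opcode names the compiler produced for this interpreter version.
opnames = [ins.opname for ins in dis.get_instructions(add)]
for name in opnames:
    print(name)
```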

## 9.17 Reference Counting and Exceptions

Error handling requires careful reference cleanup.

Suppose an operation fails after creating temporary objects:

```c
PyObject *name = PyUnicode_FromString("field");
if (name == NULL) {
    return NULL;
}

PyObject *value = PyObject_GetAttr(obj, name);
Py_DECREF(name);

if (value == NULL) {
    return NULL;
}

return value;
```

If `PyObject_GetAttr` fails, it returns `NULL` and sets an exception. The temporary `name` must be released on that path too, which is why the code calls `Py_DECREF(name)` immediately after the call and before checking `value`.

A C function reports an exception by returning `NULL` while an exception is set.

Reference cleanup and exception propagation are separate responsibilities:

```text
release owned references
preserve or set exception state
return NULL
```

## 9.18 Reference Leaks

A reference leak happens when a strong reference is never released.

Example:

```c
static PyObject *
leaky(PyObject *self, PyObject *args)
{
    PyObject *x = PyLong_FromLong(42);
    if (x == NULL) {
        return NULL;
    }

    Py_RETURN_NONE;   /* leak: x was never DECREFed */
}
```

Correct:

```c
static PyObject *
fixed(PyObject *self, PyObject *args)
{
    PyObject *x = PyLong_FromLong(42);
    if (x == NULL) {
        return NULL;
    }

    Py_DECREF(x);
    Py_RETURN_NONE;
}
```

Reference leaks may be small but serious in long-running processes:

```text
web servers
workers
language servers
notebooks
daemon processes
embedded Python runtimes
```

A leak in a hot path can grow without bound.

## 9.19 Use-After-Free

A use-after-free happens when code uses a pointer after the referenced object has been destroyed.

Example pattern:

```c
PyObject *item = PyList_GetItem(list, 0);  /* borrowed */

Py_DECREF(list);

/* item may now be invalid */
PyObject_Repr(item);
```

If `list` owned the last strong reference to `item`, destroying `list` also destroyed `item`.

Correct:

```c
PyObject *item = PyList_GetItem(list, 0);
if (item == NULL) {
    return NULL;
}

Py_INCREF(item);

Py_DECREF(list);

/* item is still alive */
PyObject *repr = PyObject_Repr(item);

Py_DECREF(item);
return repr;
```

Borrowed references must not outlive their owner unless converted into owned references.

## 9.20 Cycles

Reference counting alone cannot reclaim cycles.

Example:

```python
a = []
b = []

a.append(b)
b.append(a)

del a
del b
```

After deleting `a` and `b`, the two lists still reference each other.

Conceptually:

```text
list A ---> list B
list B ---> list A
```

Their reference counts do not reach zero, even though the program can no longer reach them.

This is why CPython has a cyclic garbage collector. It finds groups of container objects that are unreachable from the rest of the program and reclaims them by breaking the references between them.

Reference counting handles the common case. Cyclic GC handles the cases reference counting cannot see.
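The division of labor can be demonstrated with the standard `gc` module. The sketch below uses a hypothetical `Node` class, since lists cannot be weakly referenced, and temporarily disables automatic collection so that only the explicit `gc.collect()` call can reclaim the cycle.

```python
import gc
import weakref


class Node:
    """Hypothetical container-like object with a single outgoing reference."""

    def __init__(self):
        self.other = None


def cycle_demo():
    was_enabled = gc.isenabled()
    gc.disable()                      # keep automatic collection out of the way
    try:
        a = Node()
        b = Node()
        a.other = b
        b.other = a                   # a <-> b now form a cycle
        probe = weakref.ref(a)

        del a, b                      # counts stay above zero because of the cycle
        alive_before = probe() is not None

        gc.collect()                  # the cyclic collector finds and frees the pair
        alive_after = probe() is not None
    finally:
        if was_enabled:
            gc.enable()
    return alive_before, alive_after


print(cycle_demo())   # on CPython: (True, False)
```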

## 9.21 Finalizers and Destruction Timing

CPython often destroys objects immediately when their last reference disappears.

Example:

```python
class Resource:
    def __del__(self):
        print("destroyed")

r = Resource()
del r
```

In CPython, `__del__` often runs immediately after `del r`, assuming no other references exist.

But portable Python code should avoid relying on exact destruction timing. Other implementations may use different garbage collection strategies.

Use context managers for deterministic resource management:

```python
with open("data.txt") as f:
    data = f.read()
```

This is better than relying on file object destruction:

```python
f = open("data.txt")
data = f.read()
f = None
```

Reference counting gives CPython prompt cleanup, but `with` expresses the lifetime directly.
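The same idea applies to user-defined resources. In this minimal sketch, the `Resource` class and its `log` argument are illustrative assumptions. Cleanup happens in `__exit__`, at a point fixed by the `with` block and not by when the last reference disappears.

```python
class Resource:
    """Hypothetical resource whose lifetime is tied to a with-block."""

    def __init__(self, log):
        self.log = log

    def __enter__(self):
        self.log.append("acquired")
        return self

    def __exit__(self, exc_type, exc, tb):
        self.log.append("released")   # runs when the block exits, on any implementation
        return False                  # do not suppress exceptions


def use_resource():
    log = []
    with Resource(log):
        log.append("working")
    return log


print(use_resource())   # ['acquired', 'working', 'released']
```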

## 9.22 Reference Counts Are Observable but Implementation-Specific

CPython exposes reference counts through `sys.getrefcount`.

```python
import sys

x = []
print(sys.getrefcount(x))
```

The reported number is typically one higher than the number of references visible in the source, because passing `x` into `getrefcount` creates a temporary reference. Exact values can differ between CPython versions.

Example:

```python
import sys

x = []
print(sys.getrefcount(x))   # often 2, not 1
```

One reference is from the local name `x`. Another temporary reference is from the function call.

Use this tool carefully. It is useful for learning and debugging CPython behavior, but it exposes implementation details.

## 9.23 Immortal Objects

Since CPython 3.12 (PEP 683), selected objects are immortal: they are intended to live for the lifetime of the process.

An immortal object does not behave like an ordinary refcounted object. Its reference count field holds a special value, and normal increment and decrement operations do not affect its lifetime.

This optimization reduces overhead for very common objects and simplifies some runtime invariants.

Examples include the singletons `None`, `True`, and `False`, along with other frequently reused internal objects.

The rule for extension authors remains simple:

```text
always use Py_INCREF, Py_DECREF, Py_XINCREF, and Py_XDECREF
do not inspect or modify ob_refcnt directly
do not assume reference counts are ordinary small integers
```

Correct code works with ordinary and immortal objects.
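One way to see the difference, offered as an exploratory sketch and not as a guaranteed result, is to create many references to `None` and compare the reported counts. The outcome depends on the interpreter version, so the code prints the difference without assuming a particular value.

```python
import sys

before = sys.getrefcount(None)
holders = [None] * 1000          # 1000 additional references to None
after = sys.getrefcount(None)

# Expected to be 0 where None is immortal, and close to 1000 on older versions.
delta = after - before
print(delta)
```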

## 9.24 Free-Threaded CPython and Reference Counting

Traditional CPython protects many reference count updates with the Global Interpreter Lock. In free-threaded builds, reference management needs additional machinery because multiple threads may execute Python code at the same time.

This affects internal implementation details, but the C API ownership rules remain the conceptual contract:

```text
own a new reference
    release it exactly once

borrow a reference
    do not release it

keep a borrowed reference longer
    increment it first
```

Internals may use biased reference counting, deferred reference handling, atomic operations, or other techniques depending on build mode and version.

Extension code should avoid relying on raw reference count layout or update mechanics.

## 9.25 Reference Counting Discipline

A practical discipline for C extension code:

| Situation                             | Action                                    |
| ------------------------------------- | ----------------------------------------- |
| Function returns a new reference      | `Py_DECREF` when done                     |
| Function returns a borrowed reference | Do not `Py_DECREF`                        |
| Store object in a field               | `Py_INCREF` before storing                |
| Replace object in a field             | `Py_INCREF` new, then `Py_DECREF` old     |
| Return existing object                | `Py_INCREF` before returning              |
| Error after acquiring references      | Release owned references before returning |
| Pointer may be `NULL`                 | Use `Py_XDECREF`                          |
| API steals reference                  | Do not release after the call             |

Most reference bugs come from violating one of these rules.

## 9.26 Minimal Correct Field Owner

This is a small object that owns one Python object field.

```c
typedef struct {
    PyObject_HEAD
    PyObject *value;
} BoxObject;
```

Initialization:

```c
static int
Box_init(BoxObject *self, PyObject *args, PyObject *kwds)
{
    PyObject *value = NULL;

    if (!PyArg_ParseTuple(args, "O", &value)) {
        return -1;
    }

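    /* __init__ can run more than once, so release any previous value */
    PyObject *old = self->value;
    Py_INCREF(value);
    self->value = value;
    Py_XDECREF(old);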

    return 0;
}
```

Deallocation:

```c
static void
Box_dealloc(BoxObject *self)
{
    Py_XDECREF(self->value);
    Py_TYPE(self)->tp_free((PyObject *)self);
}
```

Getter:

```c
static PyObject *
Box_get_value(BoxObject *self, void *closure)
{
    Py_INCREF(self->value);
    return self->value;
}
```

Setter:

```c
static int
Box_set_value(BoxObject *self, PyObject *value, void *closure)
{
    if (value == NULL) {
        PyErr_SetString(PyExc_TypeError, "cannot delete value");
        return -1;
    }

    Py_INCREF(value);
    Py_XDECREF(self->value);
    self->value = value;

    return 0;
}
```

This shows the basic ownership pattern:

```text
own stored field
release stored field during destruction
return stored field as a new reference
replace field safely
```

## 9.27 Mental Model

When reading CPython C code, ask these questions for every `PyObject *`:

```text
Who owns this reference?
Was this returned as a new reference or borrowed reference?
Can this pointer be NULL?
What happens on the error path?
Does this container steal the reference?
Does this field need to INCREF when assigned?
Does this deallocator DECREF everything it owns?
Can a DECREF run arbitrary Python code through finalizers?
```

The last question is subtle. Decrementing a reference can destroy an object. Destroying an object can call finalizers. Finalizers can run Python code. Python code can mutate global state. Therefore, `Py_DECREF` can have large side effects.
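The same chain of events is visible from Python. In this sketch, the `Noisy` class and its `log` list are hypothetical. Clearing the list releases the last reference, and the finalizer runs in the middle of that call, before the next statement executes.

```python
class Noisy:
    """Hypothetical object whose finalizer mutates state owned by someone else."""

    def __init__(self, log):
        self.log = log

    def __del__(self):
        self.log.append("finalizer ran")


def drop_last_reference():
    log = []
    holder = [Noisy(log)]          # the list owns the only reference

    log.append("before clear")
    holder.clear()                 # releases it; the finalizer runs inside this call
    log.append("after clear")
    return log


print(drop_last_reference())
# on CPython: ['before clear', 'finalizer ran', 'after clear']
```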

## 9.28 Summary

Reference counting is CPython’s main lifetime mechanism. Each ordinary object records how many strong references point to it. `Py_INCREF` acquires ownership. `Py_DECREF` releases ownership. When the count reaches zero, CPython destroys the object through its type-specific deallocator.

The hard part is not the counter itself. The hard part is ownership discipline. C extension code must distinguish new references, borrowed references, stolen references, nullable references, temporary references, stored references, and returned references.

Reference counting explains much of CPython’s behavior: prompt destruction, deterministic cleanup in many cases, extension module rules, container ownership, and the need for a separate cyclic garbage collector.

