# 64. Defining New Types

# 64. Defining New Types

A CPython extension module can expose new Python types implemented in C. These types behave like normal Python classes from user code, but their memory layout, allocation, methods, attribute access, numeric operations, sequence operations, and deallocation logic are controlled by C structures and function pointers.

A built-in type such as `list`, `dict`, `int`, or `str` is implemented this way. Extension modules can use the same mechanism to define native object types.

## 64.1 What a C Extension Type Is

A C extension type is a Python type whose instances are backed by a C structure.

Python sees this:

```python
p = Point(10, 20)
print(p.x)
print(p.y)
print(p.length())
```

CPython sees an object layout like this:

```c
typedef struct {
    PyObject_HEAD
    double x;
    double y;
} PointObject;
```

The object begins with the standard Python object header. After that header, the extension stores its own fields.

Conceptually:

```text
PointObject
    PyObject header
        reference count
        type pointer
    native fields
        x
        y
```

This gives Python-level code a normal object while C-level code gets compact, predictable storage.

## 64.2 `PyObject_HEAD`

Every Python object starts with a CPython object header.

Extension types normally embed it with:

```c
PyObject_HEAD
```

Example:

```c
typedef struct {
    PyObject_HEAD
    long value;
} CounterObject;
```

`PyObject_HEAD` expands to the fields CPython needs for all objects. Extension code should not assume its exact textual expansion. It should treat it as the required object prefix.

The result is that a `CounterObject *` can also be treated as a `PyObject *`:

```c
CounterObject *self;
PyObject *obj = (PyObject *)self;
```

This cast is safe because the object header is at the beginning of the structure.

## 64.3 `PyTypeObject`

A type is described by a `PyTypeObject`.

Minimal shape:

```c
static PyTypeObject PointType = {
    PyVarObject_HEAD_INIT(NULL, 0)
    .tp_name = "geometry.Point",
    .tp_basicsize = sizeof(PointObject),
    .tp_itemsize = 0,
    .tp_flags = Py_TPFLAGS_DEFAULT,
    .tp_new = Point_new,
    .tp_init = (initproc)Point_init,
    .tp_dealloc = (destructor)Point_dealloc,
    .tp_methods = Point_methods,
    .tp_members = Point_members,
};
```

Important fields:

| Field | Purpose |
|---|---|
| `tp_name` | Fully qualified type name |
| `tp_basicsize` | Size of one instance |
| `tp_itemsize` | Extra per-item size for variable-sized objects |
| `tp_flags` | Runtime flags |
| `tp_new` | Allocation and construction entry |
| `tp_init` | Initialization after allocation |
| `tp_dealloc` | Destruction |
| `tp_methods` | Python-visible methods |
| `tp_members` | Simple C struct fields exposed as attributes |
| `tp_getset` | Computed properties |
| `tp_repr` | `repr(obj)` implementation |
| `tp_str` | `str(obj)` implementation |
| `tp_as_number` | Numeric protocol |
| `tp_as_sequence` | Sequence protocol |
| `tp_as_mapping` | Mapping protocol |
| `tp_iter` | Iterator protocol |

A type object is the runtime description of object behavior.

## 64.4 Allocation with `tp_new`

`tp_new` creates a new object. It corresponds roughly to `__new__`.

Example:

```c
static PyObject *
Point_new(PyTypeObject *type, PyObject *args, PyObject *kwds)
{
    PointObject *self;

    self = (PointObject *)type->tp_alloc(type, 0);
    if (self == NULL) {
        return NULL;
    }

    self->x = 0.0;
    self->y = 0.0;

    return (PyObject *)self;
}
```

The important call is:

```c
type->tp_alloc(type, 0)
```

This allocates memory using CPython’s object allocator and initializes the object header.

Do not use `malloc` directly for normal Python objects.

## 64.5 Initialization with `tp_init`

`tp_init` initializes an already allocated object. It corresponds roughly to `__init__`.

Example:

```c
static int
Point_init(PointObject *self, PyObject *args, PyObject *kwds)
{
    static char *kwlist[] = {"x", "y", NULL};

    if (!PyArg_ParseTupleAndKeywords(
            args,
            kwds,
            "dd",
            kwlist,
            &self->x,
            &self->y)) {
        return -1;
    }

    return 0;
}
```

Return values:

| Return | Meaning |
|---|---|
| `0` | Success |
| `-1` | Failure with exception set |

If initialization fails, CPython deallocates the partially constructed object.

## 64.6 Deallocation with `tp_dealloc`

`tp_dealloc` releases resources owned by the object.

For a simple object with only primitive C fields:

```c
static void
Point_dealloc(PointObject *self)
{
    Py_TYPE(self)->tp_free((PyObject *)self);
}
```

For an object that owns Python references:

```c
typedef struct {
    PyObject_HEAD
    PyObject *name;
} UserObject;
```

deallocation must release them:

```c
static void
User_dealloc(UserObject *self)
{
    Py_XDECREF(self->name);
    Py_TYPE(self)->tp_free((PyObject *)self);
}
```

The object’s own memory must be freed after owned references are released.

## 64.7 Exposing C Fields with `PyMemberDef`

Simple fields can be exposed through member definitions.

```c
static PyMemberDef Point_members[] = {
    {"x", T_DOUBLE, offsetof(PointObject, x), 0, "x coordinate"},
    {"y", T_DOUBLE, offsetof(PointObject, y), 0, "y coordinate"},
    {NULL}
};
```

This makes attributes available:

```python
p = Point(1.0, 2.0)
print(p.x)
p.y = 5.0
```

Common member types:

| Type | C field |
|---|---|
| `T_INT` | `int` |
| `T_LONG` | `long` |
| `T_DOUBLE` | `double` |
| `T_OBJECT` | `PyObject *` |
| `T_OBJECT_EX` | `PyObject *` with better missing-attribute behavior |

For Python object fields, direct exposure must be used carefully because reference ownership and assignment semantics matter.

## 64.8 Methods with `PyMethodDef`

Methods are exposed with `PyMethodDef`.

```c
static PyObject *
Point_length(PointObject *self, PyObject *Py_UNUSED(ignored))
{
    double len = sqrt(self->x * self->x + self->y * self->y);
    return PyFloat_FromDouble(len);
}

static PyMethodDef Point_methods[] = {
    {"length", (PyCFunction)Point_length, METH_NOARGS, "Return vector length"},
    {NULL}
};
```

Python usage:

```python
p = Point(3, 4)
print(p.length())
```

The method receives `self` as a pointer to the native object.

## 64.9 Computed Attributes with `PyGetSetDef`

For attributes that need logic, use getters and setters.

```c
static PyObject *
Point_get_radius(PointObject *self, void *closure)
{
    double r = sqrt(self->x * self->x + self->y * self->y);
    return PyFloat_FromDouble(r);
}

static PyGetSetDef Point_getset[] = {
    {"radius", (getter)Point_get_radius, NULL, "distance from origin", NULL},
    {NULL}
};
```

This exposes:

```python
p = Point(3, 4)
print(p.radius)
```

Getters return a new reference.

Setters return:

| Return | Meaning |
|---|---|
| `0` | Success |
| `-1` | Failure |

A getter-only entry behaves like a read-only property.

## 64.10 String Representation

`repr(obj)` is implemented with `tp_repr`.

```c
static PyObject *
Point_repr(PointObject *self)
{
    return PyUnicode_FromFormat(
        "Point(%R, %R)",
        PyFloat_FromDouble(self->x),
        PyFloat_FromDouble(self->y)
    );
}
```

A more careful implementation avoids leaking temporary float objects:

```c
static PyObject *
Point_repr(PointObject *self)
{
    return PyUnicode_FromFormat(
        "Point(%f, %f)",
        self->x,
        self->y
    );
}
```

Then assign:

```c
.tp_repr = (reprfunc)Point_repr,
```

A good `repr` should be precise, stable, and useful for debugging.

## 64.11 Numeric Protocol

A type can support numeric operations using `PyNumberMethods`.

Example:

```c
static PyObject *
Point_add(PyObject *a, PyObject *b)
{
    PointObject *pa;
    PointObject *pb;

    if (!PyObject_TypeCheck(a, &PointType) ||
        !PyObject_TypeCheck(b, &PointType)) {
        Py_RETURN_NOTIMPLEMENTED;
    }

    pa = (PointObject *)a;
    pb = (PointObject *)b;

    return PyObject_CallFunction(
        (PyObject *)&PointType,
        "dd",
        pa->x + pb->x,
        pa->y + pb->y
    );
}

static PyNumberMethods Point_as_number = {
    .nb_add = Point_add,
};
```

Attach it:

```c
.tp_as_number = &Point_as_number,
```

Python usage:

```python
p = Point(1, 2)
q = Point(3, 4)
print(p + q)
```

Returning `Py_RETURN_NOTIMPLEMENTED` allows Python to try reflected operations or raise a proper `TypeError`.

## 64.12 Sequence Protocol

A type can behave like a sequence.

```c
static Py_ssize_t
Point_len(PointObject *self)
{
    return 2;
}

static PyObject *
Point_item(PointObject *self, Py_ssize_t i)
{
    if (i == 0) {
        return PyFloat_FromDouble(self->x);
    }
    if (i == 1) {
        return PyFloat_FromDouble(self->y);
    }

    PyErr_SetString(PyExc_IndexError, "index out of range");
    return NULL;
}

static PySequenceMethods Point_as_sequence = {
    .sq_length = (lenfunc)Point_len,
    .sq_item = (ssizeargfunc)Point_item,
};
```

Python usage:

```python
p = Point(10, 20)
len(p)
p[0]
p[1]
```

Attach:

```c
.tp_as_sequence = &Point_as_sequence,
```

## 64.13 Mapping Protocol

Mapping behavior is controlled through `PyMappingMethods`.

Useful when the object behaves like:

```python
obj[key]
obj[key] = value
len(obj)
```

Fields include:

| Slot | Meaning |
|---|---|
| `mp_length` | `len(obj)` |
| `mp_subscript` | `obj[key]` |
| `mp_ass_subscript` | `obj[key] = value` and `del obj[key]` |

Sequence and mapping protocols overlap in some cases. CPython chooses behavior according to slot availability and operation semantics.

## 64.14 Iterator Protocol

An object is iterable if its type defines `tp_iter`.

An iterator also defines `tp_iternext`.

For a container:

```c
.tp_iter = Point_iter
```

For an iterator object:

```c
.tp_iternext = PointIter_next
```

A simple iterator must return a new reference for each yielded item. When exhausted, it returns `NULL` without setting an exception, or with `StopIteration` depending on the helper path used.

Python-level behavior:

```python
for item in obj:
    ...
```

maps to:

```text
iter(obj)
next(iterator)
next(iterator)
...
StopIteration
```

## 64.15 Attribute Lookup

By default, extension types use generic attribute access.

```c
.tp_getattro = PyObject_GenericGetAttr,
.tp_setattro = PyObject_GenericSetAttr,
```

Generic attribute lookup supports:

```text
members
getset descriptors
methods
class attributes
descriptors
instance dictionary if enabled
```

If a type needs special lookup behavior, it can provide custom `tp_getattro` and `tp_setattro`.

Most extension types should use generic lookup unless they need unusual semantics.

## 64.16 Instance Dictionaries

A native type does not automatically have a per-instance `__dict__`.

To support dynamic attributes, the object struct needs space for a dict pointer:

```c
typedef struct {
    PyObject_HEAD
    PyObject *dict;
    double x;
    double y;
} PointObject;
```

Then set the type offset field:

```c
.tp_dictoffset = offsetof(PointObject, dict),
```

Deallocation must release it:

```c
Py_XDECREF(self->dict);
```

Without this, only declared members, getsets, methods, and class attributes are available.

## 64.17 Weak References

A native type does not automatically support weak references.

To enable weakrefs, add a weakref list field:

```c
typedef struct {
    PyObject_HEAD
    PyObject *weakreflist;
    double x;
    double y;
} PointObject;
```

Set:

```c
.tp_weaklistoffset = offsetof(PointObject, weakreflist),
```

In deallocation, clear weakrefs before freeing:

```c
if (self->weakreflist != NULL) {
    PyObject_ClearWeakRefs((PyObject *)self);
}
```

This enables:

```python
import weakref
r = weakref.ref(p)
```

## 64.18 Garbage Collector Support

If instances can hold references to other Python objects, they may participate in reference cycles.

Example:

```c
typedef struct {
    PyObject_HEAD
    PyObject *callback;
} WatcherObject;
```

Such a type should support cyclic GC.

Required pieces:

```c
static int
Watcher_traverse(WatcherObject *self, visitproc visit, void *arg)
{
    Py_VISIT(self->callback);
    return 0;
}

static int
Watcher_clear(WatcherObject *self)
{
    Py_CLEAR(self->callback);
    return 0;
}
```

Type flags and slots:

```c
.tp_flags = Py_TPFLAGS_DEFAULT | Py_TPFLAGS_HAVE_GC,
.tp_traverse = (traverseproc)Watcher_traverse,
.tp_clear = (inquiry)Watcher_clear,
```

Allocation and free must use GC-aware functions:

```c
.tp_alloc = PyType_GenericAlloc,
.tp_free = PyObject_GC_Del,
```

During deallocation:

```c
PyObject_GC_UnTrack(self);
Watcher_clear(self);
Py_TYPE(self)->tp_free((PyObject *)self);
```

## 64.19 Type Readiness

Before adding a type to a module, call:

```c
if (PyType_Ready(&PointType) < 0) {
    return NULL;
}
```

This finalizes inherited slots, method descriptors, flags, base classes, and internal runtime metadata.

Then add it to the module:

```c
Py_INCREF(&PointType);
if (PyModule_AddObject(module, "Point", (PyObject *)&PointType) < 0) {
    Py_DECREF(&PointType);
    Py_DECREF(module);
    return NULL;
}
```

After this, Python can import and instantiate it:

```python
from geometry import Point

p = Point(1, 2)
```

## 64.20 Static Types vs Heap Types

Older extensions often define static types:

```c
static PyTypeObject PointType = { ... };
```

Static types live for the lifetime of the process.

Modern extension design often prefers heap types created from `PyType_Spec`.

Heap types work better with:

```text
subinterpreters
module state
multi-phase initialization
runtime isolation
cleaner finalization
```

Static types are simpler for learning and small modules. Heap types are usually better for robust modern extensions.

## 64.21 `PyType_Spec` and Heap Types

A heap type can be defined using slots.

```c
static PyType_Slot Point_slots[] = {
    {Py_tp_new, Point_new},
    {Py_tp_init, Point_init},
    {Py_tp_dealloc, Point_dealloc},
    {Py_tp_methods, Point_methods},
    {Py_tp_members, Point_members},
    {0, NULL}
};

static PyType_Spec Point_spec = {
    .name = "geometry.Point",
    .basicsize = sizeof(PointObject),
    .itemsize = 0,
    .flags = Py_TPFLAGS_DEFAULT,
    .slots = Point_slots,
};
```

Create the type:

```c
PyObject *PointType =
    PyType_FromSpec(&Point_spec);
```

Heap types are normal Python objects managed by the runtime. They can carry state more cleanly and fit better with multi-phase module initialization.

## 64.22 Inheritance

Extension types can support subclassing.

Set:

```c
.tp_flags = Py_TPFLAGS_DEFAULT | Py_TPFLAGS_BASETYPE,
```

Then Python code can do:

```python
class ColoredPoint(Point):
    pass
```

If subclassing is allowed, extension code must be careful about allocation, initialization, deallocation, and assumptions about exact type.

Use exact checks only when needed:

```c
Py_IS_TYPE(obj, &PointType)
```

Use subtype-aware checks when subclasses are valid:

```c
PyObject_TypeCheck(obj, &PointType)
```

## 64.23 Full Minimal Example

```c
#define PY_SSIZE_T_CLEAN
#include <Python.h>
#include <structmember.h>
#include <math.h>

typedef struct {
    PyObject_HEAD
    double x;
    double y;
} PointObject;

static PyObject *
Point_new(PyTypeObject *type, PyObject *args, PyObject *kwds)
{
    PointObject *self;

    self = (PointObject *)type->tp_alloc(type, 0);
    if (self == NULL) {
        return NULL;
    }

    self->x = 0.0;
    self->y = 0.0;

    return (PyObject *)self;
}

static int
Point_init(PointObject *self, PyObject *args, PyObject *kwds)
{
    static char *kwlist[] = {"x", "y", NULL};

    if (!PyArg_ParseTupleAndKeywords(
            args,
            kwds,
            "dd",
            kwlist,
            &self->x,
            &self->y)) {
        return -1;
    }

    return 0;
}

static void
Point_dealloc(PointObject *self)
{
    Py_TYPE(self)->tp_free((PyObject *)self);
}

static PyObject *
Point_length(PointObject *self, PyObject *Py_UNUSED(ignored))
{
    double len = sqrt(self->x * self->x + self->y * self->y);
    return PyFloat_FromDouble(len);
}

static PyObject *
Point_repr(PointObject *self)
{
    return PyUnicode_FromFormat(
        "Point(%f, %f)",
        self->x,
        self->y
    );
}

static PyMemberDef Point_members[] = {
    {"x", T_DOUBLE, offsetof(PointObject, x), 0, "x coordinate"},
    {"y", T_DOUBLE, offsetof(PointObject, y), 0, "y coordinate"},
    {NULL}
};

static PyMethodDef Point_methods[] = {
    {"length", (PyCFunction)Point_length, METH_NOARGS, "Return vector length"},
    {NULL}
};

static PyTypeObject PointType = {
    PyVarObject_HEAD_INIT(NULL, 0)
    .tp_name = "geometry.Point",
    .tp_basicsize = sizeof(PointObject),
    .tp_itemsize = 0,
    .tp_dealloc = (destructor)Point_dealloc,
    .tp_repr = (reprfunc)Point_repr,
    .tp_flags = Py_TPFLAGS_DEFAULT,
    .tp_doc = "2D point",
    .tp_methods = Point_methods,
    .tp_members = Point_members,
    .tp_init = (initproc)Point_init,
    .tp_new = Point_new,
};

static struct PyModuleDef module = {
    PyModuleDef_HEAD_INIT,
    "geometry",
    "Geometry extension module",
    -1,
    NULL
};

PyMODINIT_FUNC
PyInit_geometry(void)
{
    PyObject *m;

    if (PyType_Ready(&PointType) < 0) {
        return NULL;
    }

    m = PyModule_Create(&module);
    if (m == NULL) {
        return NULL;
    }

    Py_INCREF(&PointType);
    if (PyModule_AddObject(m, "Point", (PyObject *)&PointType) < 0) {
        Py_DECREF(&PointType);
        Py_DECREF(m);
        return NULL;
    }

    return m;
}
```

Python usage:

```python
from geometry import Point

p = Point(3, 4)

print(p)
print(p.x)
print(p.y)
print(p.length())
```

## 64.24 Common Mistakes

| Mistake | Consequence |
|---|---|
| Forgetting `PyObject_HEAD` | Invalid object layout |
| Using `malloc` instead of `tp_alloc` | Broken runtime integration |
| Forgetting `PyType_Ready` | Type not initialized |
| Returning borrowed references from getters | Use-after-free |
| Missing `Py_DECREF` in dealloc | Leaks |
| Missing GC support for object references | Uncollectable cycles |
| Allowing subclassing accidentally | Unsafe layout assumptions |
| Calling Python APIs after object teardown starts | Shutdown crashes |

## 64.25 Chapter Summary

Defining a new CPython type means defining a C structure for instance layout and a `PyTypeObject` or `PyType_Spec` for runtime behavior.

The object structure stores data. The type object stores behavior. Slots connect Python operations to C functions.

A well-formed extension type handles allocation, initialization, deallocation, methods, attributes, representation, protocols, reference ownership, garbage collection, weak references, subclassing, and module registration.

This mechanism is the foundation of CPython’s built-in types and native extension ecosystem.
