# 61. The Python C API

The Python C API is the native programming interface exposed by CPython. It allows C programs to create Python objects, execute Python code, define extension modules, implement new object types, call Python functions, manipulate interpreter state, and embed the Python runtime inside larger applications.

The C API is one of the central architectural features of CPython. It defines the boundary between the interpreter runtime and native machine code.

Most high-performance Python libraries depend on it directly or indirectly:

| Project | Use of C API |
|---|---|
| NumPy | Array objects and vectorized kernels |
| pandas | DataFrame internals and fast parsing |
| lxml | XML parser bindings |
| Pillow | Image codecs and pixel operations |
| psycopg | Database driver integration |
| PyTorch | Tensor runtime and Python bindings |

Without the C API, CPython would mainly be an interpreter. With the C API, CPython becomes a systems integration platform.

## 61.1 What the C API Provides

The API exposes operations for:

```text
creating Python objects
accessing object attributes
calling Python functions
implementing new types
raising exceptions
managing memory
interacting with threads
executing Python code
importing modules
embedding interpreters
extending the runtime
```

The API is primarily declared in header files under:

```text
Include/
```

Core headers include:

| Header | Purpose |
|---|---|
| `Python.h` | Main public entry point |
| `object.h` | Core object structures |
| `unicodeobject.h` | Unicode APIs |
| `listobject.h` | List APIs |
| `dictobject.h` | Dictionary APIs |
| `tupleobject.h` | Tuple APIs |
| `moduleobject.h` | Module APIs |
| `cpython/` | CPython-specific declarations that are public but outside the limited API |

Almost every extension starts with:

```c
#include <Python.h>
```

This header pulls in the public API surface and platform abstractions.

## 61.2 CPython’s Runtime Model at the C Level

At the C level, every Python value is represented by a pointer to `PyObject`.

Conceptually:

```c
typedef struct _object {
    Py_ssize_t ob_refcnt;
    PyTypeObject *ob_type;
} PyObject;
```

All objects begin with this header.

This means the C API operates almost entirely on object pointers:

```c
PyObject *
```

A Python integer:

```python
x = 42
```

becomes something like:

```text
PyLongObject *
```

A Python string becomes:

```text
PyUnicodeObject *
```

A Python list becomes:

```text
PyListObject *
```

But all can be treated generically as:

```text
PyObject *
```

This is the foundation of CPython polymorphism.

## 61.3 The Central Role of `PyObject`

The API is object-oriented in C.

Nearly every operation accepts or returns `PyObject *`.

Examples:

```c
PyObject *PyLong_FromLong(long v);
PyObject *PyUnicode_FromString(const char *s);
PyObject *PyObject_CallObject(PyObject *callable, PyObject *args);
```

This style gives the API several properties:

| Property | Meaning |
|---|---|
| Dynamic typing | Object type known at runtime |
| Uniform interface | Same pointer abstraction everywhere |
| Extensibility | New types integrate naturally |
| Runtime dispatch | Behavior controlled by type object |

The type object determines behavior:

```text
addition
comparison
attribute lookup
iteration
calling
hashing
buffer support
memory layout
```

Internally this resembles a manually constructed object system implemented in C structures and function pointers.

## 61.4 The Include Hierarchy

The public API is divided into layers.

### Stable public API

Headers safe for extension authors:

```text
Include/
```

### CPython-specific API

CPython-specific public declarations, excluded from the limited API:

```text
Include/cpython/
```

### Internal runtime API

Private interpreter internals:

```text
Include/internal/
```

The distinction matters because many fields and functions are not ABI stable.

Example:

```c
PyObject *
```

is public.

But direct manipulation of interpreter internals may require:

```c
#include "internal/pycore_runtime.h"
```

which requires the `Py_BUILD_CORE` macro and is unsupported outside CPython itself.

## 61.5 Building an Extension Module

A native extension is usually a shared library loaded by CPython at runtime.

Typical filenames:

| Platform | Extension |
|---|---|
| Linux | `.so` |
| macOS | `.so` |
| Windows | `.pyd` |

Minimal module:

```c
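#include <Python.h>

/* demo.hello(): takes no arguments and returns None. */
static PyObject *
hello(PyObject *self, PyObject *Py_UNUSED(ignored))
{
    printf("hello from C\n");
    Py_RETURN_NONE;
}

/* Method table: one entry per exported function, ended by a sentinel. */
static PyMethodDef Methods[] = {
    {"hello", hello, METH_NOARGS, "Print hello"},
    {NULL, NULL, 0, NULL}
};

static struct PyModuleDef module = {
    PyModuleDef_HEAD_INIT,
    "demo",     /* m_name */
    NULL,       /* m_doc */
    -1,         /* m_size: -1 means the module keeps its state in globals */
    Methods     /* m_methods */
};

/* Entry point the import system looks up by name: PyInit_<module name>. */
PyMODINIT_FUNC
PyInit_demo(void)
{
    return PyModule_Create(&module);
}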
```

After compilation:

```python
import demo
demo.hello()
```

calls directly into native machine code.

The interpreter loads the shared library, looks up the `PyInit_demo` symbol, calls it, and stores the returned module object in `sys.modules`.

## 61.6 `PyMethodDef`

Functions exported to Python are declared using `PyMethodDef`.

Structure:

```c
typedef struct PyMethodDef {
    const char  *ml_name;
    PyCFunction  ml_meth;
    int          ml_flags;
    const char  *ml_doc;
} PyMethodDef;
```

Fields:

| Field | Meaning |
|---|---|
| `ml_name` | Python-visible name |
| `ml_meth` | C function pointer |
| `ml_flags` | Calling convention |
| `ml_doc` | Docstring |

Example:

```c
{"add", add, METH_VARARGS, "Add two numbers"}
```

This binds Python-level names to native implementations.

## 61.7 Calling Conventions

CPython supports multiple calling conventions.

### `METH_NOARGS`

No Python arguments:

```c
static PyObject *
f(PyObject *self, PyObject *unused)
```

### `METH_VARARGS`

Tuple-based arguments:

```c
static PyObject *
f(PyObject *self, PyObject *args)
```

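### `METH_VARARGS | METH_KEYWORDS`

Positional and keyword arguments. `METH_KEYWORDS` cannot be used alone; it is combined with `METH_VARARGS` or `METH_FASTCALL`: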

```c
static PyObject *
f(PyObject *self,
  PyObject *args,
  PyObject *kwargs)
```

### `METH_FASTCALL`

Array-based calling convention: positional arguments arrive as a C array plus a count (`PyObject *const *args, Py_ssize_t nargs`).

Avoids temporary tuple creation.

The interpreter heavily optimizes this path in modern CPython.

## 61.8 Argument Parsing

CPython converts Python arguments into C values using parsing helpers.

Example:

```c
static PyObject *
add(PyObject *self, PyObject *args)
{
    int a;
    int b;

    if (!PyArg_ParseTuple(args, "ii", &a, &b)) {
        return NULL;
    }

    return PyLong_FromLong(a + b);
}
```

Format string:

```text
"ii"
```

means:

```text
parse two integers
```

Common format codes:

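| Code | Python type | C type |
|---|---|---|
| `i` | `int` | `int` |
| `l` | `int` | `long` |
| `d` | `float` | `double` |
| `s` | `str` | `const char *` (UTF-8, NUL-terminated) |
| `O` | any object | `PyObject *` (borrowed reference) |
| `p` | any object (truth-tested) | `int` (0 or 1) |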

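If parsing fails, `PyArg_ParseTuple` has already set an exception (usually `TypeError`), so the function only needs to:

```text
return NULL
```

and the interpreter propagates the exception.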

## 61.9 Exceptions in the C API

C has no exception mechanism, so the API represents a raised exception as per-thread state rather than as control flow.

A C function signals failure by:

```text
returning NULL
```

and setting interpreter exception state.

Example:

```c
PyErr_SetString(PyExc_ValueError,
                "invalid value");

return NULL;
```

The current thread state stores the active exception.

Conceptually:

```text
thread state
    current exception type
    current exception value
    traceback
```

This differs from C++ exceptions.

The API uses explicit return-value checking.

Typical pattern:

```c
PyObject *obj = some_call();

if (obj == NULL) {
    return NULL;
}
```

Failure propagation is manual.

## 61.10 Reference Counting in the C API

Reference ownership is the most important rule in the C API.

Each object stores a reference count in its header.

Operations either:

```text
create references
borrow references
steal references
destroy references
```

Core macros:

```c
Py_INCREF(obj);
Py_DECREF(obj);
Py_XDECREF(obj);
```

Reference-counting mistakes are among the most common causes of crashes and leaks in extension modules.

### Example

```c
PyObject *x = PyLong_FromLong(42);
```

returns a new reference.

You own it.

Eventually:

```c
Py_DECREF(x);
```

must happen.

Otherwise the object leaks.

## 61.11 Borrowed vs New References

The API distinguishes ownership explicitly.

| Type | Meaning |
|---|---|
| New reference | Caller owns reference |
| Borrowed reference | Caller does not own reference |
| Stolen reference | Ownership transferred |

Example:

```c
PyObject *item = PyList_GetItem(list, 0);
```

returns a borrowed reference.

Do not decref it unless you first incref it.

Example:

```c
PyList_SetItem(list, i, obj);
```

steals a reference.

The list now owns the object reference.

Misunderstanding ownership rules causes:

```text
memory leaks
double frees
use-after-free
dangling pointers
interpreter corruption
```

## 61.12 Type Checking

CPython exposes runtime type checks.

Examples:

```c
PyLong_Check(obj)
PyUnicode_Check(obj)
PyList_Check(obj)
PyDict_Check(obj)
```

These validate runtime object types before operations.

Example:

```c
if (!PyLong_Check(obj)) {
    PyErr_SetString(PyExc_TypeError,
                    "expected int");
    return NULL;
}
```

Most APIs assume correct object types.

Incorrect assumptions may crash the interpreter.

## 61.13 Creating Python Objects

The API provides constructors for built-in types.

### Integers

```c
PyObject *x = PyLong_FromLong(123);
```

### Floats

```c
PyObject *x = PyFloat_FromDouble(3.14);
```

### Unicode

```c
PyObject *x = PyUnicode_FromString("hello");
```

### Lists

```c
PyObject *list = PyList_New(0);
```

### Dicts

```c
PyObject *dict = PyDict_New();
```

All of these return a new reference to a Python object, or `NULL` with an exception set if creation fails.

Reference ownership rules apply immediately.

## 61.14 Calling Python from C

The API allows native code to invoke Python callables.

Example:

```c
PyObject *result =
    PyObject_CallObject(func, args);
```

Variants:

| Function | Purpose |
|---|---|
| `PyObject_CallObject` | Generic call |
| `PyObject_CallFunction` | Format-string call |
| `PyObject_Vectorcall` | Fast modern call |
| `PyObject_CallMethod` | Call named method |

This enables hybrid execution:

```text
Python code
    ↓
C extension
    ↓
Python callback
    ↓
more native code
```

Numerical libraries rely on this pattern when a native routine such as a solver or integrator calls back into a user-supplied Python function.

## 61.15 Attribute Access

Attributes can be manipulated directly.

### Get attribute

```c
PyObject *x =
    PyObject_GetAttrString(obj, "name");
```

### Set attribute

```c
PyObject_SetAttrString(obj,
                       "name",
                       value);
```

### Check attribute

```c
PyObject_HasAttrString(obj, "name");
```

This uses the normal Python attribute lookup system:

```text
data descriptors on the type (found via MRO traversal)
instance dict
non-data descriptors and plain class attributes
```

The C API interacts with the same semantics as Python code.

## 61.16 Importing Modules

Modules can be imported from C.

Example:

```c
PyObject *math =
    PyImport_ImportModule("math");
```

Access function:

```c
PyObject *sqrt =
    PyObject_GetAttrString(math, "sqrt");
```

Call function:

```c
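/* Py_BuildValue creates the 1-tuple (9.0,) directly. PyTuple_Pack does
   not steal references, so packing a temporary float would leak it. */
PyObject *args = Py_BuildValue("(d)", 9.0);
if (args == NULL) {
    return NULL;
}

PyObject *result =
    PyObject_CallObject(sqrt, args);
Py_DECREF(args);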
```

This allows embedded runtimes to drive Python dynamically.

## 61.17 Embedding Python

The API supports embedding CPython inside native programs.

Initialization:

```c
Py_Initialize();
```

Execute code:

```c
PyRun_SimpleString(
    "print('hello from embedded python')"
);
```

Shutdown:

```c
Py_Finalize();
```

Applications using embedding include:

| Category | Example |
|---|---|
| Game engines | Scripting systems |
| Scientific software | User automation |
| Databases | Stored procedures |
| Editors | Plugin systems |
| Network appliances | Embedded configuration |

Embedding reverses the normal relationship:

```text
normal:
    Python → C extension

embedding:
    C application → embedded Python
```

## 61.18 The Global Interpreter Lock in the API

Thread interaction requires GIL management.

Many APIs require the calling thread to hold the GIL.

Acquire:

```c
PyGILState_STATE gstate = PyGILState_Ensure();
```

Release:

```c
PyGILState_Release(gstate);
```

Long-running native code can release the GIL:

```c
Py_BEGIN_ALLOW_THREADS

long_native_operation();

Py_END_ALLOW_THREADS
```

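This lets other threads run Python bytecode while the current thread executes native code. Code between the two macros must not touch Python objects or call most C API functions, because the thread does not hold the GIL there.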

## 61.19 Stable ABI and Limited API

CPython exposes two compatibility layers.

### Full C API

Includes CPython-specific declarations such as struct layouts, macros, and inline functions.

Highest performance.

Least stable ABI.

### Limited API

Restricted API subset.

Stable across Python versions.

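Enabled by defining `Py_LIMITED_API` before including `Python.h`, usually set to the hex version of the oldest Python release to support:

```c
#define Py_LIMITED_API 0x030A0000  /* limited API as of Python 3.10 */
#include <Python.h>
```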

Extensions targeting the stable ABI can ship one `abi3` wheel that works across multiple Python 3 versions.

Tradeoff:

| Full API | Limited API |
|---|---|
| Maximum speed | Greater compatibility |
| Access to internals | Restricted features |
| Tighter coupling | ABI stability |

## 61.20 Internal vs Public APIs

Not all APIs are public.

CPython internally uses:

```text
_PyRuntime
_PyInterpreterState_GET
_PyEval_EvalFrameDefault
_PyThreadState_GET
```

Many internal functions begin with:

```text
_Py
```

These are implementation details.

They may change between releases without compatibility guarantees.

Extension authors should avoid depending on internal APIs unless absolutely necessary.

## 61.21 Memory Allocators

The API exposes custom memory allocation layers.

General allocator:

```c
PyMem_Malloc
PyMem_Realloc
PyMem_Free
```

Object allocator:

```c
PyObject_Malloc
PyObject_Free
```

CPython internally uses specialized allocators such as:

```text
pymalloc
arena allocators
free lists
small-object pools
```

Memory allocation strategy strongly affects interpreter performance.

## 61.22 Error Handling Philosophy

The C API uses explicit error handling everywhere.

Most APIs follow this pattern:

| Return value | Meaning |
|---|---|
| non-NULL | success |
| NULL | exception occurred |

or:

| Return value | Meaning |
|---|---|
| `0` | success |
| `-1` | failure |

Nothing unwinds automatically: the caller must check every return value.

This makes CPython code verbose but predictable.

Typical pattern:

```c
obj = PyObject_Call(...);

if (obj == NULL) {
    return NULL;
}
```

## 61.23 CPython’s API Design Style

The API reflects CPython’s history.

Characteristics:

| Property | Description |
|---|---|
| Manual memory management | Explicit ownership |
| C89 compatibility origins | Historical portability |
| Macro-heavy design | Performance-oriented |
| Runtime polymorphism | Type-object dispatch |
| Explicit error propagation | No hidden exceptions |
| Refcount semantics | Central ownership model |

The API prioritizes:

```text
performance
interpreter integration
portability
backward compatibility
incremental evolution
```

rather than modern language ergonomics.

## 61.24 Relationship Between Python Semantics and the C API

The API mirrors Python semantics closely.

Python operation:

```python
x + y
```

maps internally to:

```text
PyNumber_Add
```

Attribute access:

```python
obj.name
```

maps to:

```text
PyObject_GetAttr
```

Function calls:

```python
f(a, b)
```

map to:

```text
PyObject_Call
```

The C API is effectively a low-level interface to the interpreter’s object protocol system.

## 61.25 Why the C API Matters

The C API defines much of CPython’s ecosystem architecture.

It enables:

```text
scientific computing
GPU runtimes
database bindings
cryptography
operating system integration
network libraries
language bindings
high-performance parsers
embedded scripting
```

It also constrains CPython evolution.

Because many extensions depend on:

```text
reference counts
object layout
GIL behavior
type object structure
calling conventions
```

major runtime changes require compatibility strategies.

This tension shapes many modern CPython engineering decisions.

## 61.26 Chapter Summary

The Python C API is the native interface to the CPython runtime. It exposes object manipulation, type systems, memory management, function calls, module creation, interpreter control, and embedding capabilities through a large C-level interface centered on `PyObject *`.

The API operates through explicit ownership rules, runtime polymorphism, manual reference counting, and explicit error propagation. Extension modules use it to integrate native machine code with Python semantics, while embedded applications use it to host the interpreter inside larger systems.

Understanding the C API is essential for studying extension modules, runtime internals, object implementation, memory management, interpreter execution, and CPython ecosystem architecture.
