65. Buffer Protocol

The buffer protocol is CPython’s low-level interface for sharing raw memory between Python objects without copying. It allows one object to expose a contiguous or strided block of memory, and another object to read or write that memory through a common C structure.

The protocol is used by objects such as:

Object	Buffer use
`bytes`	Read-only contiguous byte storage
`bytearray`	Mutable contiguous byte storage
`memoryview`	General Python-level buffer view
`array.array`	Typed contiguous storage
`mmap.mmap`	Memory-mapped file storage
`numpy.ndarray`	Typed, shaped, strided memory
extension objects	Custom binary storage

The buffer protocol is one of the main reasons Python can interoperate efficiently with binary data, numerical arrays, images, files, sockets, compression libraries, codecs, and native extensions.

65.1 Why the Buffer Protocol Exists

Python objects usually hide their internal representation. A bytes object, a bytearray, an image buffer, and a NumPy array all have different implementation details.

But native code often needs direct access to memory:

hash this byte range
compress this block
decode this image
write this array to a file
pass this tensor to native code
parse this packet without copying

Without a common protocol, each library would need custom APIs for each object type.

The buffer protocol provides one uniform view:

Python object
    exposes memory
        through Py_buffer
            consumed by C code

This lets native code operate on many object types through the same interface.

65.2 Exporters and Consumers

The protocol has two sides.

Role	Meaning
Exporter	Object that exposes memory
Consumer	Code that requests and uses memory

Examples:

Exporter	Consumer
`bytes`	hashing function
`bytearray`	compression library
`array.array`	binary writer
`mmap.mmap`	parser
NumPy array	native numerical kernel
custom extension type	Python `memoryview`

A consumer asks an exporter for a view. The exporter fills a Py_buffer structure. The consumer uses it. When finished, the consumer releases it.

consumer
    PyObject_GetBuffer(obj, &view, flags)
        exporter fills Py_buffer
    use view.buf, view.len, shape, strides
    PyBuffer_Release(&view)

65.3 `Py_buffer`

The central structure is Py_buffer.

Conceptually:

typedef struct {
    void *buf;
    PyObject *obj;
    Py_ssize_t len;
    Py_ssize_t itemsize;
    int readonly;
    int ndim;
    char *format;
    Py_ssize_t *shape;
    Py_ssize_t *strides;
    Py_ssize_t *suboffsets;
    void *internal;
} Py_buffer;

Important fields:

Field	Meaning
`buf`	Pointer to the start of the logical structure (the element at index 0 in every dimension)
`obj`	Exporting object
`len`	Total logical byte length
`itemsize`	Size of one element
`readonly`	Whether writes are forbidden
`ndim`	Number of dimensions
`format`	Element format string
`shape`	Length of each dimension
`strides`	Byte step per dimension
`suboffsets`	Indirect buffer support
`internal`	Exporter-private data

A simple byte buffer may use only buf, len, readonly, and obj.

A multidimensional array needs ndim, itemsize, format, shape, and strides.
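From Python, most of these fields can be inspected as attributes of a memoryview. The following is a small sketch using array.array as the exporter; the variable names are illustrative only.

```python
import array

# array.array("d") exports a typed, one-dimensional, writable buffer.
values = array.array("d", [1.0, 2.0, 3.0])
view = memoryview(values)

print(view.obj is values)   # the exporter kept alive by the view (Py_buffer.obj)
print(view.nbytes)          # total byte length (Py_buffer.len)
print(view.itemsize)        # size of one element in bytes
print(view.format)          # element format string
print(view.ndim, view.shape, view.strides)
print(view.readonly)
```

The attribute names differ slightly from the C field names (nbytes instead of len), but they describe the same view.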

65.4 Simple Contiguous Buffers

A bytes object exposes a read-only contiguous buffer.

data = b"hello"
view = memoryview(data)
print(view.readonly)  # True: bytes is immutable
print(view.nbytes)    # 5

At the C level:

Py_buffer view;

if (PyObject_GetBuffer(obj, &view, PyBUF_SIMPLE) < 0) {
    return NULL;
}

/* view.buf points to bytes */
/* view.len is byte length */

PyBuffer_Release(&view);

PyBUF_SIMPLE requests a simple byte-oriented buffer.

The consumer should treat the memory as a flat array of bytes.

65.5 Writable Buffers

Some objects expose mutable memory.

Example:

data = bytearray(b"hello")
view = memoryview(data)
view[0] = ord("H")  # writes through to the bytearray
print(data)         # bytearray(b'Hello')

Native code can request a writable buffer:

Py_buffer view;

if (PyObject_GetBuffer(obj, &view, PyBUF_WRITABLE) < 0) {
    return NULL;
}

char *p = (char *)view.buf;
p[0] = 'H';

PyBuffer_Release(&view);

If the exporter is read-only, the request fails and sets an exception.

This prevents code from mutating immutable objects such as bytes.
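The same rule is visible from Python. In this sketch, a write through a view of bytes is rejected, while the identical write through a view of a bytearray succeeds. The flag variable write_failed exists only for illustration.

```python
# bytes exports a read-only buffer, so writes through the view are rejected.
ro_view = memoryview(b"hello")
try:
    ro_view[0] = ord("H")
    write_failed = False
except TypeError:
    write_failed = True

# bytearray exports a writable buffer, so the same write succeeds.
data = bytearray(b"hello")
rw_view = memoryview(data)
rw_view[0] = ord("H")

print(ro_view.readonly, write_failed, data)
```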

65.6 Buffer Flags

Consumers specify what kind of view they need.

Common flags:

Flag	Meaning
`PyBUF_SIMPLE`	Flat byte buffer
`PyBUF_WRITABLE`	Writable buffer required
`PyBUF_FORMAT`	Request element format string
`PyBUF_ND`	Request dimensionality and shape
`PyBUF_STRIDES`	Request strides (implies `PyBUF_ND`)
`PyBUF_C_CONTIGUOUS`	Require C-contiguous layout
`PyBUF_F_CONTIGUOUS`	Require Fortran-contiguous layout
`PyBUF_ANY_CONTIGUOUS`	Require any contiguous layout
`PyBUF_FULL`	Request full buffer information, including suboffsets; implies `PyBUF_WRITABLE` (use `PyBUF_FULL_RO` for read-only access)

A consumer should request the weakest view it needs.

For example, a hashing function only needs bytes:

PyBUF_SIMPLE

A numerical kernel may require:

PyBUF_FORMAT | PyBUF_ND | PyBUF_STRIDES

A C library requiring flat contiguous memory should ask for contiguity explicitly.

65.7 Contiguous vs Strided Memory

Not all buffers are contiguous.

A one-dimensional contiguous buffer:

[ a b c d e f ]

has one linear memory range.

A strided view may skip bytes:

[ a _ b _ c _ d _ ]

A two-dimensional array can have row strides:

row 0: a b c
row 1: d e f
row 2: g h i

C-contiguous layout stores rows next to each other:

a b c d e f g h i

Fortran-contiguous layout stores columns next to each other:

a d g b e h c f i

The buffer protocol represents this using:

shape
strides
itemsize
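A strided view is easy to produce from Python by slicing a memoryview with a step. The sketch below reuses the `a _ b _ c _ d _` picture from this section; the byte string is made up for the example.

```python
raw = b"a_b_c_d_"
flat = memoryview(raw)     # contiguous: the stride equals the item size
every_other = flat[::2]    # strided: steps over one byte per element

print(flat.strides, flat.contiguous)
print(every_other.shape, every_other.strides, every_other.contiguous)
print(every_other.tobytes())   # gathers the selected bytes into a new object
```

No bytes are moved when the strided view is created. Only the shape and strides change.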

65.8 Shape and Strides

For a two-dimensional array:

shape = [3, 4]
itemsize = 8

means:

3 rows
4 columns
8 bytes per element

Strides describe how many bytes to move to advance along each dimension.

C-contiguous double array:

shape   = [3, 4]
strides = [32, 8]

because:

next row    = 4 elements * 8 bytes = 32 bytes
next column = 1 element * 8 bytes = 8 bytes

Element address:

address(i, j) = buf + i * strides[0] + j * strides[1]

This lets one protocol describe compact arrays, slices, transposes, channels, images, and tensor-like data.
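The address formula can be checked from Python. This sketch builds a 3 by 4 matrix of doubles on top of a bytearray and reads one element by computing its byte offset from the strides. The helper element_at is hypothetical and exists only for this illustration.

```python
import struct

rows, cols = 3, 4
itemsize = struct.calcsize("d")
buf = bytearray(rows * cols * itemsize)

# Fill the storage with 0.0, 1.0, ... in C (row-major) order.
for k in range(rows * cols):
    struct.pack_into("d", buf, k * itemsize, float(k))

# Reinterpret the flat bytes as a 2D array of doubles without copying.
matrix = memoryview(buf).cast("d", shape=[rows, cols])

def element_at(i, j):
    """Read element (i, j) by computing its byte offset from the strides."""
    offset = i * matrix.strides[0] + j * matrix.strides[1]
    return struct.unpack_from("d", buf, offset)[0]

print(matrix.strides)
print(element_at(1, 2), matrix[1, 2])
```

Both reads return the same value because memoryview indexing applies the same offset arithmetic internally.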

65.9 Format Strings

The format field describes the type of each element, using the syntax of the struct module. A NULL format means unsigned bytes (`B`).

Examples:

Format	Meaning
`B`	unsigned byte
`b`	signed byte
`h`	short
`i`	int
`l`	long
`f`	float
`d`	double

A consumer that cares about element type should request PyBUF_FORMAT and validate it.

Example:

if (view.itemsize != sizeof(double) ||
    view.format == NULL ||
    strcmp(view.format, "d") != 0) {
    PyBuffer_Release(&view);
    PyErr_SetString(PyExc_TypeError, "expected double buffer");
    return NULL;
}

Do not assume a buffer contains a particular type unless the protocol data confirms it.
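The same validation can be written at the Python level. The function as_double_view below is a hypothetical helper, not part of any library. It mirrors the C check above by accepting only buffers whose format is exactly "d".

```python
import array

def as_double_view(obj):
    """Return a memoryview of obj, or raise TypeError unless it holds C doubles."""
    view = memoryview(obj)
    if view.format != "d" or view.itemsize != 8:
        view.release()
        raise TypeError("expected double buffer")
    return view

doubles = as_double_view(array.array("d", [1.5, 2.5]))
print(doubles.tolist())

try:
    as_double_view(array.array("i", [1, 2]))
    rejected = False
except TypeError:
    rejected = True
print(rejected)
```

This strict comparison also rejects exporters that report a byte-order prefix such as "<d". Whether to accept those is a design decision for the consumer.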

65.10 `memoryview`

memoryview is the Python-level object for inspecting and slicing buffers.

data = bytearray(b"abcdef")
v = memoryview(data)

print(v[0])    # 97, the integer value of b"a"
print(v[1:4])  # a new memoryview object over bytes 1..3, not a copy

memoryview does not copy the underlying memory. It references the exporter.

This matters:

data = bytearray(b"abc")
v = memoryview(data)

v[0] = ord("A")
print(data)

The output:

bytearray(b'Abc')

The view modifies the original object.
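Slices behave the same way. In this sketch, a slice of a view is itself a view over the same storage, so a write through the slice is visible in the original bytearray.

```python
data = bytearray(b"abcdef")
view = memoryview(data)

middle = view[1:4]          # a new view of bytes 1..3, no copy
middle[0] = ord("X")        # writes through to the bytearray

print(middle.obj is data)   # the slice still references the original exporter
print(data)
```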

65.11 Lifetime Rules

A buffer view must keep the exporter alive.

In Py_buffer, the obj field stores a reference to the exporting object. The consumer must release the buffer:

PyBuffer_Release(&view);

This call releases exporter-owned state and decrements the reference held by the view.

Common bug:

Py_buffer view;

if (PyObject_GetBuffer(obj, &view, PyBUF_SIMPLE) < 0) {
    return NULL;
}

/* use view */

return PyLong_FromSsize_t(view.len);  /* missing PyBuffer_Release */

Correct:

Py_buffer view;

if (PyObject_GetBuffer(obj, &view, PyBUF_SIMPLE) < 0) {
    return NULL;
}

PyObject *result = PyLong_FromSsize_t(view.len);

PyBuffer_Release(&view);

return result;

Every successful PyObject_GetBuffer must have a matching PyBuffer_Release.
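At the Python level, memoryview follows the same acquire and release discipline. A minimal sketch: the with statement releases the view on exit, and any later access to the released view fails.

```python
data = bytearray(b"abc")

# The with-block releases the view on exit, even if the body raises.
with memoryview(data) as view:
    total = sum(view)

# After release, any access through the view is an error.
try:
    view[0]
    released = False
except ValueError:
    released = True

print(total, released)
```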

65.12 Exporter Restrictions During Active Views

An exporter must not invalidate memory while consumers hold active views.

For example, a bytearray cannot be resized while exported buffers exist:

data = bytearray(b"abc")
v = memoryview(data)

data.append(100)

This raises BufferError because resizing might move memory and invalidate the view.

Custom exporters must obey the same principle. Once they export a buffer, they must keep the memory valid until the consumer releases it.
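The restriction ends as soon as the last view is released. The following sketch shows both halves of that behaviour with a bytearray; the flag variable blocked is for illustration only.

```python
data = bytearray(b"abc")
view = memoryview(data)

try:
    data.append(100)        # would need to resize while a view is active
    blocked = False
except BufferError:
    blocked = True

view.release()              # no active exports remain
data.append(100)            # resizing is allowed again

print(blocked, data)
```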

65.13 Writing a Buffer Consumer

A simple consumer that sums bytes:

static PyObject *
sum_bytes(PyObject *self, PyObject *args)
{
    PyObject *obj;
    Py_buffer view;

    if (!PyArg_ParseTuple(args, "O", &obj)) {
        return NULL;
    }

    if (PyObject_GetBuffer(obj, &view, PyBUF_SIMPLE) < 0) {
        return NULL;
    }

    unsigned char *p = (unsigned char *)view.buf;
    Py_ssize_t total = 0;

    for (Py_ssize_t i = 0; i < view.len; i++) {
        total += p[i];
    }

    PyBuffer_Release(&view);

    return PyLong_FromSsize_t(total);
}

Python usage:

sum_bytes(b"abc")
sum_bytes(bytearray(b"abc"))
sum_bytes(memoryview(b"abc"))

The same C function works with many exporters.
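For comparison, a rough pure-Python counterpart can be written with memoryview. This is a sketch of the same idea, not a translation of the C function. It assumes the input is C-contiguous, because cast requires that.

```python
import array

def sum_bytes(obj):
    """Sum the raw bytes of any object that exports a C-contiguous buffer."""
    # Both views are released when the with-block exits.
    with memoryview(obj) as view, view.cast("B") as flat:
        return sum(flat)

print(sum_bytes(b"abc"))
print(sum_bytes(bytearray(b"abc")))
print(sum_bytes(memoryview(b"abc")))
print(sum_bytes(array.array("B", [1, 2, 3])))
```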

65.14 Handling Errors in Buffer Consumers

Always release the buffer on every path after acquisition.

static PyObject *
first_byte(PyObject *self, PyObject *args)
{
    PyObject *obj;
    Py_buffer view;

    if (!PyArg_ParseTuple(args, "O", &obj)) {
        return NULL;
    }

    if (PyObject_GetBuffer(obj, &view, PyBUF_SIMPLE) < 0) {
        return NULL;
    }

    if (view.len == 0) {
        PyBuffer_Release(&view);
        PyErr_SetString(PyExc_ValueError, "empty buffer");
        return NULL;
    }

    unsigned char value = ((unsigned char *)view.buf)[0];

    PyBuffer_Release(&view);

    return PyLong_FromUnsignedLong(value);
}

The pattern mirrors reference-count cleanup: every exit path after a successful acquisition must release what it acquired.

65.15 Requiring Contiguous Memory

Some C libraries require a single contiguous memory block.

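Ask explicitly with a contiguity flag:

if (PyObject_GetBuffer(obj, &view, PyBUF_C_CONTIGUOUS) < 0) {
    /* the exporter cannot provide a C-contiguous view; an exception is set */
    return NULL;
}

A consumer that acquired the view with a weaker request, such as PyBUF_STRIDES, can test the layout afterwards:

if (!PyBuffer_IsContiguous(&view, 'C')) {
    PyBuffer_Release(&view);
    PyErr_SetString(PyExc_BufferError, "expected C-contiguous buffer");
    return NULL;
}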

For non-contiguous input, consumers can either reject it or copy it into a contiguous buffer.

Rejecting is simpler. Copying is more flexible.

65.16 Copying from Non-Contiguous Buffers

CPython provides helpers for copying buffer data into contiguous storage.

Conceptual pattern:

Py_buffer view;

if (PyObject_GetBuffer(obj, &view, PyBUF_FULL_RO) < 0) {
    return NULL;
}

char *copy = PyMem_Malloc(view.len);
if (copy == NULL) {
    PyBuffer_Release(&view);
    return PyErr_NoMemory();
}

if (PyBuffer_ToContiguous(copy, &view, view.len, 'C') < 0) {
    PyMem_Free(copy);
    PyBuffer_Release(&view);
    return NULL;
}

/* use copy */

PyMem_Free(copy);
PyBuffer_Release(&view);

This keeps the C library interface simple while accepting strided inputs.
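The reject-or-copy decision can also be sketched in Python. The helper contiguous_bytes is hypothetical. It returns the original view when the layout is already contiguous and falls back to a packed copy otherwise.

```python
def contiguous_bytes(obj):
    """Return (data, copied): a zero-copy view if contiguous, else a packed copy."""
    view = memoryview(obj)
    if view.contiguous:
        return view, False
    # tobytes() walks the strides and packs the elements into new storage.
    packed = view.tobytes()
    view.release()
    return packed, True

strided = memoryview(b"a_b_c_d_")[::2]
data, copied = contiguous_bytes(strided)
print(data, copied)

same, copied_again = contiguous_bytes(b"abcd")
print(bytes(same), copied_again)
```

The copy is paid only for inputs that need it, so contiguous inputs stay on the zero-copy path.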

65.17 Writing a Buffer Exporter

A custom type exports a buffer by implementing bf_getbuffer and bf_releasebuffer.

Example object:

typedef struct {
    PyObject_HEAD
    char *data;
    Py_ssize_t len;
    int exports;
} BlobObject;

Buffer methods:

static int
Blob_getbuffer(BlobObject *self, Py_buffer *view, int flags)
{
    if (view == NULL) {
        PyErr_SetString(PyExc_BufferError, "NULL view");
        return -1;
    }

    return PyBuffer_FillInfo(
        view,
        (PyObject *)self,
        self->data,
        self->len,
        0,
        flags
    );
}

static void
Blob_releasebuffer(BlobObject *self, Py_buffer *view)
{
    /* optional exporter cleanup */
}

Attach through PyBufferProcs:

static PyBufferProcs Blob_bufferprocs = {
    .bf_getbuffer = (getbufferproc)Blob_getbuffer,
    .bf_releasebuffer = (releasebufferproc)Blob_releasebuffer,
};

Then in the type:

.tp_as_buffer = &Blob_bufferprocs,

Now Python can do:

b = Blob(...)
v = memoryview(b)

65.18 Tracking Active Exports

If an exporter owns resizable memory, it should track active exports.

static int
Blob_getbuffer(BlobObject *self, Py_buffer *view, int flags)
{
    int ret = PyBuffer_FillInfo(
        view,
        (PyObject *)self,
        self->data,
        self->len,
        0,
        flags
    );

    if (ret == 0) {
        self->exports++;
    }

    return ret;
}

static void
Blob_releasebuffer(BlobObject *self, Py_buffer *view)
{
    self->exports--;
}

Before resizing:

if (self->exports > 0) {
    PyErr_SetString(
        PyExc_BufferError,
        "cannot resize while buffers are exported"
    );
    return NULL;
}

This prevents dangling pointers.

65.19 Read-Only Exporters

The readonly argument to PyBuffer_FillInfo controls writability.

Read-only:

PyBuffer_FillInfo(
    view,
    (PyObject *)self,
    self->data,
    self->len,
    1,
    flags
);

Writable:

PyBuffer_FillInfo(
    view,
    (PyObject *)self,
    self->data,
    self->len,
    0,
    flags
);

If a consumer requests PyBUF_WRITABLE from a read-only exporter, the request fails.

This is how immutable binary objects protect their storage.

65.20 Multidimensional Exporters

For arrays, exporters must provide shape, strides, item size, and format.

Example for a 2D double matrix:

rows = 3
cols = 4
itemsize = 8
shape = [3, 4]
strides = [32, 8]
format = "d"

The exporter must ensure that the shape and stride arrays remain valid while the view exists. They are often stored in the object itself or in exporter-private memory.

A full exporter must handle flag requests correctly. If the consumer asks for shape or format and the exporter cannot provide them, it should fail with BufferError.
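An existing multidimensional exporter can be inspected from Python without writing C. The sketch below uses a nested ctypes array as a stand-in for a custom matrix type; the name Matrix3x4 is illustrative. The printed format carries a byte-order prefix that depends on the platform.

```python
import ctypes

# A 3x4 matrix of C doubles. ctypes arrays export their shape
# through the buffer protocol.
Matrix3x4 = (ctypes.c_double * 4) * 3
matrix = Matrix3x4()
matrix[1][2] = 6.0

view = memoryview(matrix)
print(view.ndim, view.shape, view.strides, view.itemsize)
print(view.format)   # byte-order prefix followed by "d"
```

The shape and strides match the 3 by 4 layout described above, with 32 bytes per row and 8 bytes per element.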

65.21 Buffer Protocol and Zero Copy

The buffer protocol enables zero-copy paths.

Example:

socket reads bytes
    ↓
bytearray stores mutable memory
    ↓
memoryview slices without copy
    ↓
parser reads view
    ↓
native extension decodes fields

For large data, the cost of copying can dominate run time, so zero-copy paths matter.

Copying a 1 GB buffer costs both memory bandwidth and allocation overhead. Passing a view costs pointer setup and lifetime tracking.

This is why the protocol matters for:

images
audio
video
tensors
network packets
database pages
compressed blocks
memory-mapped files
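As an illustration of the zero-copy path for network packets, the sketch below parses a header and returns the payload as a view into the original buffer. The packet layout (2-byte type, 2-byte payload length, then payload) and the function parse_packet are assumptions made up for this example.

```python
import struct

# Hypothetical packet layout: 2-byte type, 2-byte payload length, then payload.
HEADER = struct.Struct("!HH")

def parse_packet(buf):
    """Return (msg_type, payload_view) without copying the payload."""
    view = memoryview(buf)
    msg_type, length = HEADER.unpack_from(view, 0)
    payload = view[HEADER.size:HEADER.size + length]
    return msg_type, payload

packet = bytearray(HEADER.pack(7, 4) + b"data" + b"trailing")
msg_type, payload = parse_packet(packet)

print(msg_type, payload.nbytes, payload.obj is packet)
print(bytes(payload))   # copying happens only here, on request
```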

65.22 Buffer Protocol and the GIL

Acquiring and releasing buffers touches Python objects, so it normally requires the GIL.

But after acquiring a stable buffer, native code may release the GIL while processing raw memory, if it does not call Python APIs and the exporter guarantees valid storage.

Pattern:

if (PyObject_GetBuffer(obj, &view, PyBUF_SIMPLE) < 0) {
    return NULL;
}

Py_BEGIN_ALLOW_THREADS

process_raw_memory(view.buf, view.len);

Py_END_ALLOW_THREADS

PyBuffer_Release(&view);

This lets other Python threads run while the native code processes the buffer. The view keeps the memory valid, but other threads can still modify the contents of a writable buffer in the meantime.

65.23 Buffer Protocol vs Sequence Protocol

The sequence protocol exposes elements as Python objects.

x = obj[i]

The buffer protocol exposes raw memory.

buf + offset

Comparison:

Feature	Sequence protocol	Buffer protocol
Access level	Python objects	Raw memory
Copy-free binary access	No	Yes
Type metadata	Python-level	Format string
Multidimensional layout	Indirect	Shape and strides
Use case	General containers	Binary and numerical data

A list of integers does not normally expose a buffer because its memory contains pointers to Python objects, not raw integer values.

An array.array("i") can expose a buffer because its memory stores raw C integers.
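The difference shows up directly when constructing a memoryview. A short sketch; the flag variable list_exports is for illustration only.

```python
import array

try:
    memoryview([1, 2, 3])       # a list stores pointers to int objects
    list_exports = True
except TypeError:
    list_exports = False

ints = array.array("i", [1, 2, 3])
view = memoryview(ints)         # an array stores raw C ints

print(list_exports)
print(view.format, view.itemsize, view.tolist())
```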

65.24 Buffer Protocol vs C API Type-Specific Access

Some APIs expose direct access to specific object types.

Example:

char *p = PyBytes_AS_STRING(obj);
Py_ssize_t n = PyBytes_GET_SIZE(obj);

This works only for bytes.

The buffer protocol works across many exporters.

Approach	Scope
`PyBytes_AS_STRING`	`bytes` only
`PyByteArray_AS_STRING`	`bytearray` only
`PyObject_GetBuffer`	Any buffer exporter

Prefer the buffer protocol when accepting generic binary data.

65.25 Common Buffer Bugs

Bug	Cause
Missing `PyBuffer_Release`	Exporter stays pinned
Writing to read-only memory	Missing writable check
Assuming contiguity	Ignoring strides
Assuming element type	Ignoring `format`
Resizing during export	Invalidates active views
Releasing GIL too early	Python API calls without GIL
Returning pointer after release	Dangling pointer
Storing `view.buf` long-term	Buffer lifetime violation

The safest rule: treat view.buf as valid only between successful PyObject_GetBuffer and matching PyBuffer_Release.

65.26 Practical Design Guidelines

For consumers:

Need	Request
Raw bytes only	`PyBUF_SIMPLE`
Must write	`PyBUF_WRITABLE`
Must know type	`PyBUF_FORMAT`
Must support arrays	`PyBUF_ND`
Must call C library	Require contiguity or copy

For exporters:

Requirement	Rule
Memory can resize	Track active exports
Object stores references	Add GC support separately
Memory read-only	Mark buffer read-only
Multidimensional data	Provide stable shape and strides
Custom allocation	Keep storage valid until release

65.27 Chapter Summary

The buffer protocol is CPython’s common interface for exposing raw memory. It lets objects such as bytes, bytearray, memoryview, array.array, mmap, numerical arrays, and custom extension types share memory with native code without copying.

Consumers acquire a Py_buffer, use the memory and metadata, then release it. Exporters fill the view and keep memory valid while it is exported.

The protocol supports read-only memory, writable memory, contiguous buffers, multidimensional arrays, typed elements, strides, and zero-copy slicing. It is central to CPython’s performance story for binary data and native interoperability.
