65. Buffer Protocol

Py_buffer, PyBUF_* flags, PyObject_GetBuffer, and implementing the buffer protocol in a custom type.

The buffer protocol is CPython’s low-level interface for sharing raw memory between Python objects without copying. It allows one object to expose a contiguous or strided block of memory, and another object to read or write that memory through a common C structure.

The protocol is used by objects such as:

| Object | Buffer use |
| --- | --- |
| bytes | Read-only contiguous byte storage |
| bytearray | Mutable contiguous byte storage |
| memoryview | General Python-level buffer view |
| array.array | Typed contiguous storage |
| mmap.mmap | Memory-mapped file storage |
| numpy.ndarray | Typed, shaped, strided memory |
| extension objects | Custom binary storage |

The buffer protocol is one of the main reasons Python can interoperate efficiently with binary data, numerical arrays, images, files, sockets, compression libraries, codecs, and native extensions.

65.1 Why the Buffer Protocol Exists

Python objects usually hide their internal representation. A bytes object, a bytearray, an image buffer, and a NumPy array all have different implementation details.

But native code often needs direct access to memory:

hash this byte range
compress this block
decode this image
write this array to a file
pass this tensor to native code
parse this packet without copying

Without a common protocol, each library would need custom APIs for each object type.

The buffer protocol provides one uniform view:

Python object
    exposes memory
        through Py_buffer
            consumed by C code

This lets native code operate on many object types through the same interface.

65.2 Exporters and Consumers

The protocol has two sides.

| Role | Meaning |
| --- | --- |
| Exporter | Object that exposes memory |
| Consumer | Code that requests and uses memory |

Examples:

| Exporter | Consumer |
| --- | --- |
| bytes | hashing function |
| bytearray | compression library |
| array.array | binary writer |
| mmap.mmap | parser |
| NumPy array | native numerical kernel |
| custom extension type | Python memoryview |

A consumer asks an exporter for a view. The exporter fills a Py_buffer structure. The consumer uses it. When finished, the consumer releases it.

consumer
    PyObject_GetBuffer(obj, &view, flags)
        exporter fills Py_buffer
    use view.buf, view.len, shape, strides
    PyBuffer_Release(&view)
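The same acquire, use, release cycle can be observed from Python, where memoryview plays the consumer. The following is a minimal sketch; the variable names are illustrative only.

```python
data = bytearray(b"packet")

# Acquire: constructing a memoryview requests a buffer from the exporter.
with memoryview(data) as view:
    first = view[0]        # use the memory while the view is live
    size = view.nbytes

# Leaving the with block released the buffer; further access is an error.
try:
    view[0]
    released = False
except ValueError:
    released = True
```

The with statement calls view.release() on exit, which is the Python-level counterpart of PyBuffer_Release.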

65.3 Py_buffer

The central structure is Py_buffer.

Conceptually:

typedef struct {
    void *buf;
    PyObject *obj;
    Py_ssize_t len;
    Py_ssize_t itemsize;
    int readonly;
    int ndim;
    char *format;
    Py_ssize_t *shape;
    Py_ssize_t *strides;
    Py_ssize_t *suboffsets;
    void *internal;
} Py_buffer;

Important fields:

| Field | Meaning |
| --- | --- |
| buf | Pointer to first accessible byte |
| obj | Exporting object |
| len | Total logical byte length |
| itemsize | Size of one element |
| readonly | Whether writes are forbidden |
| ndim | Number of dimensions |
| format | Element format string |
| shape | Length of each dimension |
| strides | Byte step per dimension |
| suboffsets | Indirect buffer support |
| internal | Exporter-private data |

A simple byte buffer may use only buf, len, readonly, and obj.

A multidimensional array needs ndim, itemsize, format, shape, and strides.
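Most of these fields are visible from Python as memoryview attributes, which makes them easy to inspect. A small sketch using array.array as the exporter (the variable names are illustrative):

```python
from array import array

values = array("d", [1.0, 2.0, 3.0])
view = memoryview(values)

# Each attribute mirrors a Py_buffer field of the same or similar name.
fields = {
    "obj is exporter": view.obj is values,
    "len (nbytes)": view.nbytes,
    "itemsize": view.itemsize,
    "readonly": view.readonly,
    "ndim": view.ndim,
    "format": view.format,
    "shape": view.shape,
    "strides": view.strides,
    "suboffsets": view.suboffsets,
}

for name, value in fields.items():
    print(f"{name:16} {value!r}")
```

The buf pointer and the internal field have no direct Python-level equivalent.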

65.4 Simple Contiguous Buffers

A bytes object exposes a read-only contiguous buffer.

data = b"hello"
view = memoryview(data)
print(view.readonly)
print(view.nbytes)

At the C level:

Py_buffer view;

if (PyObject_GetBuffer(obj, &view, PyBUF_SIMPLE) < 0) {
    return NULL;
}

/* view.buf points to bytes */
/* view.len is byte length */

PyBuffer_Release(&view);

PyBUF_SIMPLE requests a simple byte-oriented buffer.

The consumer should treat the memory as a flat array of view.len bytes. With this request the exporter leaves format, shape, and strides set to NULL.

65.5 Writable Buffers

Some objects expose mutable memory.

Example:

data = bytearray(b"hello")
view = memoryview(data)
view[0] = ord("H")
print(data)

Native code can request a writable buffer:

Py_buffer view;

if (PyObject_GetBuffer(obj, &view, PyBUF_WRITABLE) < 0) {
    return NULL;
}

char *p = (char *)view.buf;
p[0] = 'H';

PyBuffer_Release(&view);

If the exporter is read-only, the request fails with BufferError.

This prevents code from mutating immutable objects such as bytes.
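The same protection is visible through memoryview, which reports a failed write as TypeError rather than BufferError. A minimal sketch; try_write is a hypothetical helper, not part of any library:

```python
def try_write(obj):
    """Try to overwrite the first byte of obj's buffer; report success."""
    view = memoryview(obj)
    try:
        view[0] = ord("H")
        return True
    except TypeError:
        # Raised when the underlying buffer is read-only.
        return False
    finally:
        view.release()

immutable = b"hello"
mutable = bytearray(b"hello")

print(try_write(immutable), immutable)
print(try_write(mutable), mutable)
```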

65.6 Buffer Flags

Consumers specify what kind of view they need.

Common flags:

| Flag | Meaning |
| --- | --- |
| PyBUF_SIMPLE | Flat byte buffer |
| PyBUF_WRITABLE | Writable buffer required |
| PyBUF_FORMAT | Request element format string |
| PyBUF_ND | Request dimensionality and shape |
| PyBUF_STRIDES | Request strides |
| PyBUF_C_CONTIGUOUS | Require C-contiguous layout |
| PyBUF_F_CONTIGUOUS | Require Fortran-contiguous layout |
| PyBUF_ANY_CONTIGUOUS | Require any contiguous layout |
| PyBUF_FULL | Request full buffer information |

A consumer should request the weakest view it needs.

For example, a hashing function only needs bytes:

PyBUF_SIMPLE

A numerical kernel may require:

PyBUF_FORMAT | PyBUF_ND | PyBUF_STRIDES

A C library requiring flat contiguous memory should ask for contiguity explicitly.
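The effect of individual flags can be probed from Python by calling PyObject_GetBuffer through ctypes. This is only an exploratory sketch under several assumptions: it runs on CPython, the PyBuffer structure below is a hand-written mirror of Py_buffer, the flag constants are copied from CPython's headers, and probe is a hypothetical helper.

```python
import ctypes

# Flag values copied from CPython's public headers.
PyBUF_SIMPLE = 0
PyBUF_WRITABLE = 0x0001
PyBUF_FORMAT = 0x0004
PyBUF_ND = 0x0008

class PyBuffer(ctypes.Structure):
    """Hand-written mirror of the C Py_buffer struct (an assumption)."""
    _fields_ = [
        ("buf", ctypes.c_void_p),
        ("obj", ctypes.c_void_p),
        ("len", ctypes.c_ssize_t),
        ("itemsize", ctypes.c_ssize_t),
        ("readonly", ctypes.c_int),
        ("ndim", ctypes.c_int),
        ("format", ctypes.c_char_p),
        ("shape", ctypes.POINTER(ctypes.c_ssize_t)),
        ("strides", ctypes.POINTER(ctypes.c_ssize_t)),
        ("suboffsets", ctypes.POINTER(ctypes.c_ssize_t)),
        ("internal", ctypes.c_void_p),
    ]

_get = ctypes.pythonapi.PyObject_GetBuffer
_get.argtypes = [ctypes.py_object, ctypes.POINTER(PyBuffer), ctypes.c_int]
_get.restype = ctypes.c_int

_release = ctypes.pythonapi.PyBuffer_Release
_release.argtypes = [ctypes.POINTER(PyBuffer)]
_release.restype = None

def probe(obj, flags):
    """Return (len, format, has_shape) for a buffer requested with flags."""
    view = PyBuffer()
    # ctypes.pythonapi re-raises any exception set by the exporter.
    _get(obj, ctypes.byref(view), flags)
    try:
        return view.len, view.format, bool(view.shape)
    finally:
        _release(ctypes.byref(view))

print(probe(bytearray(b"abc"), PyBUF_SIMPLE))
print(probe(bytearray(b"abc"), PyBUF_FORMAT | PyBUF_ND))
```

A request that the exporter cannot satisfy, such as probe(b"abc", PyBUF_WRITABLE), raises BufferError instead of returning.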

65.7 Contiguous vs Strided Memory

Not all buffers are contiguous.

A one-dimensional contiguous buffer:

[ a b c d e f ]

has one linear memory range.

A strided view may skip bytes:

[ a _ b _ c _ d _ ]

A two-dimensional array can have row strides:

row 0: a b c
row 1: d e f
row 2: g h i

C-contiguous layout stores rows next to each other:

a b c d e f g h i

Fortran-contiguous layout stores columns next to each other:

a d g b e h c f i

The buffer protocol represents this using:

shape
strides
itemsize
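Slicing a memoryview with a step is a quick way to see a strided, non-contiguous view of otherwise contiguous storage. A minimal sketch with illustrative names:

```python
data = b"a_b_c_d_"

flat = memoryview(data)      # one linear range, stride of 1 byte
every_other = flat[::2]      # same memory, stride of 2 bytes

print(flat.strides, flat.contiguous)
print(every_other.shape, every_other.strides, every_other.contiguous)

# tobytes() follows the strides and gathers the selected elements.
print(every_other.tobytes())
```

No bytes are copied until tobytes() is called; the sliced view only changes shape and strides.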

65.8 Shape and Strides

For a two-dimensional array:

shape = [3, 4]
itemsize = 8

means:

3 rows
4 columns
8 bytes per element

Strides describe how many bytes to move to advance along each dimension.

C-contiguous double array:

shape   = [3, 4]
strides = [32, 8]

because:

next row    = 4 elements * 8 bytes = 32 bytes
next column = 1 element * 8 bytes = 8 bytes

Element address:

address(i, j) = buf + i * strides[0] + j * strides[1]

This lets one protocol describe compact arrays, slices, transposes, channels, images, and tensor-like data.
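The address formula can be checked by hand against memoryview's own indexing. The sketch below assumes the 3 by 4 matrix of doubles from the text, stored row by row; element is a hypothetical helper.

```python
import struct

rows, cols, itemsize = 3, 4, 8
strides = (cols * itemsize, itemsize)      # (32, 8)

# Twelve doubles laid out row by row: 0.0, 1.0, ..., 11.0
buf = struct.pack("12d", *[float(n) for n in range(rows * cols)])

def element(i, j):
    """Read element (i, j) by applying the stride formula directly."""
    offset = i * strides[0] + j * strides[1]
    return struct.unpack_from("d", buf, offset)[0]

# memoryview applies the same arithmetic when given a shape.
matrix = memoryview(buf).cast("d", shape=[rows, cols])

print(element(1, 2), matrix[1, 2])
print(matrix.strides)
```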

65.9 Format Strings

The format field describes the type of each element.

Examples:

| Format | Meaning |
| --- | --- |
| B | unsigned byte |
| b | signed byte |
| h | short |
| i | int |
| l | long |
| f | float |
| d | double |

A consumer that cares about element type should request PyBUF_FORMAT and validate it.

Example:

if (view.itemsize != sizeof(double) ||
    view.format == NULL ||
    strcmp(view.format, "d") != 0) {
    PyBuffer_Release(&view);
    PyErr_SetString(PyExc_TypeError, "expected double buffer");
    return NULL;
}

Do not assume a buffer contains a particular type unless the protocol data confirms it.
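The same validation can be sketched at the Python level, where the format codes match those of the struct module. require_double is a hypothetical helper written for illustration:

```python
import struct
from array import array

def require_double(obj):
    """Return a memoryview of obj, or raise if elements are not C doubles."""
    view = memoryview(obj)
    if view.format != "d" or view.itemsize != struct.calcsize("d"):
        view.release()
        raise TypeError("expected double buffer")
    return view

print(require_double(array("d", [1.5, 2.5])).tolist())

try:
    require_double(array("f", [1.5]))
except TypeError as exc:
    print("rejected:", exc)
```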

65.10 memoryview

memoryview is the Python-level object for inspecting and slicing buffers.

data = bytearray(b"abcdef")
v = memoryview(data)

print(v[0])
print(v[1:4])

memoryview does not copy the underlying memory. It references the exporter.

This matters:

data = bytearray(b"abc")
v = memoryview(data)

v[0] = ord("A")
print(data)

The output:

bytearray(b'Abc')

The view modifies the original object.

65.11 Lifetime Rules

A buffer view must keep the exporter alive.

In Py_buffer, the obj field stores a reference to the exporting object. The consumer must release the buffer:

PyBuffer_Release(&view);

This call releases exporter-owned state and decrements the reference held by the view.

Common bug:

Py_buffer view;

if (PyObject_GetBuffer(obj, &view, PyBUF_SIMPLE) < 0) {
    return NULL;
}

/* use view */

return PyLong_FromLong(view.len);  /* missing PyBuffer_Release */

Correct:

Py_buffer view;

if (PyObject_GetBuffer(obj, &view, PyBUF_SIMPLE) < 0) {
    return NULL;
}

PyObject *result = PyLong_FromSsize_t(view.len);

PyBuffer_Release(&view);

return result;

Every successful PyObject_GetBuffer must have a matching PyBuffer_Release.

65.12 Exporter Restrictions During Active Views

An exporter must not invalidate memory while consumers hold active views.

For example, a bytearray cannot be resized while exported buffers exist:

data = bytearray(b"abc")
v = memoryview(data)

data.append(100)

This raises BufferError because resizing might move memory and invalidate the view.

Custom exporters must obey the same principle. Once they export a buffer, they must keep the memory valid until the consumer releases it.
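The restriction lasts only as long as the view does. A minimal sketch showing that the resize succeeds once the view is released:

```python
data = bytearray(b"abc")
view = memoryview(data)

try:
    data.append(100)          # resize attempt while a view is active
    blocked = False
except BufferError:
    blocked = True

view.release()                # the export ends here
data.append(100)              # now the resize is allowed

print(blocked, data)
```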

65.13 Writing a Buffer Consumer

A simple consumer that sums bytes:

static PyObject *
sum_bytes(PyObject *self, PyObject *args)
{
    PyObject *obj;
    Py_buffer view;

    if (!PyArg_ParseTuple(args, "O", &obj)) {
        return NULL;
    }

    if (PyObject_GetBuffer(obj, &view, PyBUF_SIMPLE) < 0) {
        return NULL;
    }

    unsigned char *p = (unsigned char *)view.buf;
    Py_ssize_t total = 0;

    for (Py_ssize_t i = 0; i < view.len; i++) {
        total += p[i];
    }

    PyBuffer_Release(&view);

    return PyLong_FromSsize_t(total);
}

Python usage:

sum_bytes(b"abc")
sum_bytes(bytearray(b"abc"))
sum_bytes(memoryview(b"abc"))

The same C function works with many exporters.
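For comparison, here is a Python-level sketch of the same consumer. It is not the C function above, only an analogue that relies on memoryview to accept any exporter; it assumes the exporter's memory is C-contiguous, which cast requires.

```python
from array import array

def sum_bytes(obj):
    """Sum the raw bytes of any C-contiguous buffer exporter."""
    with memoryview(obj) as view:
        # cast("B") reinterprets the memory as unsigned bytes, no copy made.
        with view.cast("B") as raw:
            return sum(raw)

print(sum_bytes(b"abc"))
print(sum_bytes(bytearray(b"abc")))
print(sum_bytes(array("B", [1, 2, 3])))
```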

65.14 Handling Errors in Buffer Consumers

Always release the buffer on every path after acquisition.

static PyObject *
first_byte(PyObject *self, PyObject *args)
{
    PyObject *obj;
    Py_buffer view;

    if (!PyArg_ParseTuple(args, "O", &obj)) {
        return NULL;
    }

    if (PyObject_GetBuffer(obj, &view, PyBUF_SIMPLE) < 0) {
        return NULL;
    }

    if (view.len == 0) {
        PyBuffer_Release(&view);
        PyErr_SetString(PyExc_ValueError, "empty buffer");
        return NULL;
    }

    unsigned char value = ((unsigned char *)view.buf)[0];

    PyBuffer_Release(&view);

    return PyLong_FromUnsignedLong(value);
}

The pattern mirrors reference-count cleanup: every exit path after a successful acquisition must release what was acquired.

65.15 Requiring Contiguous Memory

Some C libraries require a single contiguous memory block.

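Request stride information, then check the layout explicitly. A PyBUF_SIMPLE request is not enough for this check: it carries no strides, so the exporter must either hand back contiguous memory or fail the request itself. Alternatively, passing PyBUF_C_CONTIGUOUS makes the exporter reject non-contiguous memory.

if (PyObject_GetBuffer(obj, &view, PyBUF_STRIDES) < 0) {
    return NULL;
}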

if (!PyBuffer_IsContiguous(&view, 'C')) {
    PyBuffer_Release(&view);
    PyErr_SetString(PyExc_BufferError, "expected C-contiguous buffer");
    return NULL;
}

For non-contiguous input, consumers can either reject it or copy it into a contiguous buffer.

Rejecting is simpler. Copying is more flexible.

65.16 Copying from Non-Contiguous Buffers

CPython provides helpers for copying buffer data into contiguous storage.

Conceptual pattern:

Py_buffer view;

if (PyObject_GetBuffer(obj, &view, PyBUF_FULL_RO) < 0) {
    return NULL;
}

char *copy = PyMem_Malloc(view.len);
if (copy == NULL) {
    PyBuffer_Release(&view);
    return PyErr_NoMemory();
}

if (PyBuffer_ToContiguous(copy, &view, view.len, 'C') < 0) {
    PyMem_Free(copy);
    PyBuffer_Release(&view);
    return NULL;
}

/* use copy */

PyMem_Free(copy);
PyBuffer_Release(&view);

This keeps the C library interface simple while accepting strided inputs.
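The reject-or-copy decision has a compact Python analogue, sketched below. contiguous_view is a hypothetical helper; it copies only when the input is not already C-contiguous.

```python
def contiguous_view(obj):
    """Return a C-contiguous memoryview of obj, copying only if needed."""
    view = memoryview(obj)
    if view.c_contiguous:
        return view                      # zero-copy path
    # tobytes() walks the strides and packs the elements in C order.
    copy = view.tobytes()
    view.release()
    return memoryview(copy)              # read-only view of the copy

strided = memoryview(b"a_b_c_d_")[::2]
packed = contiguous_view(strided)

print(strided.c_contiguous, packed.c_contiguous, bytes(packed))
```

Note that the copy is a bytes object, so the returned view is read-only on that path.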

65.17 Writing a Buffer Exporter

A custom type exports a buffer by implementing bf_getbuffer and, where per-export cleanup is needed, bf_releasebuffer.

Example object:

typedef struct {
    PyObject_HEAD
    char *data;
    Py_ssize_t len;
    int exports;
} BlobObject;

Buffer methods:

static int
Blob_getbuffer(BlobObject *self, Py_buffer *view, int flags)
{
    if (view == NULL) {
        PyErr_SetString(PyExc_BufferError, "NULL view");
        return -1;
    }

    return PyBuffer_FillInfo(
        view,
        (PyObject *)self,
        self->data,
        self->len,
        0,
        flags
    );
}

static void
Blob_releasebuffer(BlobObject *self, Py_buffer *view)
{
    /* optional exporter cleanup */
}

Attach through PyBufferProcs:

static PyBufferProcs Blob_bufferprocs = {
    .bf_getbuffer = (getbufferproc)Blob_getbuffer,
    .bf_releasebuffer = (releasebufferproc)Blob_releasebuffer,
};

Then in the type:

.tp_as_buffer = &Blob_bufferprocs,

Now Python can do:

b = Blob(...)
v = memoryview(b)

65.18 Tracking Active Exports

If an exporter owns resizable memory, it should track active exports.

static int
Blob_getbuffer(BlobObject *self, Py_buffer *view, int flags)
{
    int ret = PyBuffer_FillInfo(
        view,
        (PyObject *)self,
        self->data,
        self->len,
        0,
        flags
    );

    if (ret == 0) {
        self->exports++;
    }

    return ret;
}

static void
Blob_releasebuffer(BlobObject *self, Py_buffer *view)
{
    self->exports--;
}

Before resizing:

if (self->exports > 0) {
    PyErr_SetString(
        PyExc_BufferError,
        "cannot resize while buffers are exported"
    );
    return NULL;
}

This prevents dangling pointers.
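The counter logic can be modelled in pure Python to make the invariant easier to see. BlobModel is hypothetical and is not a real buffer exporter; it only imitates the bookkeeping done by Blob_getbuffer and Blob_releasebuffer.

```python
from contextlib import contextmanager

class BlobModel:
    """Pure-Python model of the export counter; not a real exporter."""

    def __init__(self, payload):
        self._data = bytearray(payload)
        self.exports = 0

    @contextmanager
    def export(self):
        # Mirrors Blob_getbuffer: one more active export.
        self.exports += 1
        try:
            yield self._data
        finally:
            # Mirrors Blob_releasebuffer: the export has ended.
            self.exports -= 1

    def resize(self, size):
        if self.exports > 0:
            raise BufferError("cannot resize while buffers are exported")
        # Truncate or zero-pad the storage to the requested size.
        self._data = self._data[:size].ljust(size, b"\0")

blob = BlobModel(b"abc")
with blob.export():
    try:
        blob.resize(8)
    except BufferError as exc:
        print("blocked:", exc)
blob.resize(8)
```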

65.19 Read-Only Exporters

The readonly argument to PyBuffer_FillInfo controls writability.

Read-only:

PyBuffer_FillInfo(
    view,
    (PyObject *)self,
    self->data,
    self->len,
    1,
    flags
);

Writable:

PyBuffer_FillInfo(
    view,
    (PyObject *)self,
    self->data,
    self->len,
    0,
    flags
);

If a consumer requests PyBUF_WRITABLE from a read-only exporter, the request fails.

This is how immutable binary objects protect their storage.

65.20 Multidimensional Exporters

For arrays, exporters must provide shape, strides, item size, and format.

Example for a 2D double matrix:

rows = 3
cols = 4
itemsize = 8
shape = [3, 4]
strides = [32, 8]
format = "d"

The exporter must ensure that the shape and stride arrays remain valid while the view exists. They are often stored in the object itself or in exporter-private memory.

A full exporter must handle flag requests correctly. If the consumer asks for shape or format and the exporter cannot provide them, it should fail with BufferError.
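The stride values an exporter reports follow mechanically from shape and item size. The helpers below are an illustrative sketch (c_strides and f_strides are hypothetical names) for the C-contiguous and Fortran-contiguous cases.

```python
def c_strides(shape, itemsize):
    """Byte strides for a C-contiguous (row-major) layout."""
    strides = []
    step = itemsize
    for extent in reversed(shape):   # last dimension varies fastest
        strides.append(step)
        step *= extent
    return tuple(reversed(strides))

def f_strides(shape, itemsize):
    """Byte strides for a Fortran-contiguous (column-major) layout."""
    strides = []
    step = itemsize
    for extent in shape:             # first dimension varies fastest
        strides.append(step)
        step *= extent
    return tuple(strides)

print(c_strides((3, 4), 8))
print(f_strides((3, 4), 8))
```

The C-order result for the 3 by 4 double matrix matches the strides = [32, 8] given above.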

65.21 Buffer Protocol and Zero Copy

The buffer protocol enables zero-copy paths.

Example:

socket reads bytes
bytearray stores mutable memory
memoryview slices without copy
parser reads view
native extension decodes fields

For large data, the cost of copying can dominate runtime, so avoiding copies matters.

Copying a 1 GB buffer costs both memory bandwidth and allocation overhead. Passing a view costs pointer setup and lifetime tracking.

This is why the protocol matters for:

images
audio
video
tensors
network packets
database pages
compressed blocks
memory-mapped files
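A sketch of such a zero-copy path follows. It uses io.BytesIO as a stand-in for the socket, and the four-byte header layout (two little-endian 16-bit fields) is invented purely for illustration.

```python
import io
import struct

# Stand-in for a socket: header (type=7, length=4), payload, extra bytes.
stream = io.BytesIO(struct.pack("<HH", 7, 4) + b"DATA" + b"trailing")

buf = bytearray(64)                  # reusable receive buffer
window = memoryview(buf)

n = stream.readinto(window[:8])      # fill the buffer in place
msg_type, length = struct.unpack_from("<HH", window, 0)
payload = window[4:4 + length]       # slice is a view, not a copy

print(n, msg_type, length, bytes(payload))

# The payload view still refers to buf, so changes show through it.
buf[4] = ord("d")
print(bytes(payload))
```

With a real socket, recv_into would take the place of readinto.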

65.22 Buffer Protocol and the GIL

Acquiring and releasing buffers touches Python objects, so it normally requires the GIL.

But after acquiring a stable buffer, native code may release the GIL while processing raw memory, if it does not call Python APIs and the exporter guarantees valid storage.

Pattern:

if (PyObject_GetBuffer(obj, &view, PyBUF_SIMPLE) < 0) {
    return NULL;
}

Py_BEGIN_ALLOW_THREADS

process_raw_memory(view.buf, view.len);

Py_END_ALLOW_THREADS

PyBuffer_Release(&view);

This allows CPU-bound native processing to run without blocking other Python threads from acquiring the GIL.

65.23 Buffer Protocol vs Sequence Protocol

The sequence protocol exposes elements as Python objects.

x = obj[i]

The buffer protocol exposes raw memory.

buf + offset

Comparison:

| Feature | Sequence protocol | Buffer protocol |
| --- | --- | --- |
| Access level | Python objects | Raw memory |
| Copy-free binary access | No | Yes |
| Type metadata | Python-level | Format string |
| Multidimensional layout | Indirect | Shape and strides |
| Use case | General containers | Binary and numerical data |

A list of integers does not normally expose a buffer because its memory contains pointers to Python objects, not raw integer values.

An array.array("i") can expose a buffer because its memory stores raw C integers.
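The difference is easy to confirm from Python, since memoryview refuses objects that do not export a buffer. exports_buffer is a hypothetical helper used only for this check:

```python
from array import array

def exports_buffer(obj):
    """Report whether obj supports the buffer protocol."""
    try:
        memoryview(obj).release()
        return True
    except TypeError:
        return False

numbers = array("i", [1, 2, 3])

print(exports_buffer([1, 2, 3]))     # list holds object pointers
print(exports_buffer(numbers))       # array holds raw C ints
print(memoryview(numbers).tolist())
```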

65.24 Buffer Protocol vs C API Type-Specific Access

Some APIs expose direct access to specific object types.

Example:

char *p = PyBytes_AS_STRING(obj);
Py_ssize_t n = PyBytes_GET_SIZE(obj);

This works only for bytes.

The buffer protocol works across many exporters.

| Approach | Scope |
| --- | --- |
| PyBytes_AS_STRING | bytes only |
| PyByteArray_AS_STRING | bytearray only |
| PyObject_GetBuffer | Any buffer exporter |

Prefer the buffer protocol when accepting generic binary data.

65.25 Common Buffer Bugs

| Bug | Cause |
| --- | --- |
| Missing PyBuffer_Release | Exporter stays pinned |
| Writing to read-only memory | Missing writable check |
| Assuming contiguity | Ignoring strides |
| Assuming element type | Ignoring format |
| Resizing during export | Invalidates active views |
| Releasing GIL too early | Python API calls without GIL |
| Returning pointer after release | Dangling pointer |
| Storing view.buf long-term | Buffer lifetime violation |

The safest rule: treat view.buf as valid only between successful PyObject_GetBuffer and matching PyBuffer_Release.

65.26 Practical Design Guidelines

For consumers:

| Need | Request |
| --- | --- |
| Raw bytes only | PyBUF_SIMPLE |
| Must write | PyBUF_WRITABLE |
| Must know type | PyBUF_FORMAT |
| Must support arrays | PyBUF_ND |
| Must call C library | Require contiguity or copy |

For exporters:

| Requirement | Rule |
| --- | --- |
| Memory can resize | Track active exports |
| Object stores references | Add GC support separately |
| Memory read-only | Mark buffer read-only |
| Multidimensional data | Provide stable shape and strides |
| Custom allocation | Keep storage valid until release |

65.27 Chapter Summary

The buffer protocol is CPython’s common interface for exposing raw memory. It lets objects such as bytes, bytearray, memoryview, array.array, mmap, numerical arrays, and custom extension types share memory with native code without copying.

Consumers acquire a Py_buffer, use the memory and metadata, then release it. Exporters fill the view and keep memory valid while it is exported.

The protocol supports read-only memory, writable memory, contiguous buffers, multidimensional arrays, typed elements, strides, and zero-copy slicing. It is central to CPython’s performance story for binary data and native interoperability.