Py_buffer, PyBUF_* flags, PyObject_GetBuffer, and implementing the buffer protocol in a custom type.
The buffer protocol is CPython’s low-level interface for sharing raw memory between Python objects without copying. It allows one object to expose a contiguous or strided block of memory, and another object to read or write that memory through a common C structure.
The protocol is used by objects such as:
| Object | Buffer use |
|---|---|
| bytes | Read-only contiguous byte storage |
| bytearray | Mutable contiguous byte storage |
| memoryview | General Python-level buffer view |
| array.array | Typed contiguous storage |
| mmap.mmap | Memory-mapped file storage |
| numpy.ndarray | Typed, shaped, strided memory |
| extension objects | Custom binary storage |
The buffer protocol is one of the main reasons Python can interoperate efficiently with binary data, numerical arrays, images, files, sockets, compression libraries, codecs, and native extensions.
65.1 Why the Buffer Protocol Exists
Python objects usually hide their internal representation. A bytes object, a bytearray, an image buffer, and a NumPy array all have different implementation details.
But native code often needs direct access to memory:
hash this byte range
compress this block
decode this image
write this array to a file
pass this tensor to native code
parse this packet without copying
Without a common protocol, each library would need custom APIs for each object type.
The buffer protocol provides one uniform view:
Python object
exposes memory
through Py_buffer
consumed by C code
This lets native code operate on many object types through the same interface.
65.2 Exporters and Consumers
The protocol has two sides.
| Role | Meaning |
|---|---|
| Exporter | Object that exposes memory |
| Consumer | Code that requests and uses memory |
Examples:
| Exporter | Consumer |
|---|---|
| bytes | hashing function |
| bytearray | compression library |
| array.array | binary writer |
| mmap.mmap | parser |
| NumPy array | native numerical kernel |
| custom extension type | Python memoryview |
A consumer asks an exporter for a view. The exporter fills a Py_buffer structure. The consumer uses it. When finished, the consumer releases it.
consumer
PyObject_GetBuffer(obj, &view, flags)
exporter fills Py_buffer
use view.buf, view.len, shape, strides
PyBuffer_Release(&view)
65.3 Py_buffer
The central structure is Py_buffer.
Conceptually:
typedef struct {
    void *buf;
    PyObject *obj;
    Py_ssize_t len;
    Py_ssize_t itemsize;
    int readonly;
    int ndim;
    char *format;
    Py_ssize_t *shape;
    Py_ssize_t *strides;
    Py_ssize_t *suboffsets;
    void *internal;
} Py_buffer;
Important fields:
| Field | Meaning |
|---|---|
| buf | Pointer to first accessible byte |
| obj | Exporting object |
| len | Total logical byte length |
| itemsize | Size of one element |
| readonly | Whether writes are forbidden |
| ndim | Number of dimensions |
| format | Element format string |
| shape | Length of each dimension |
| strides | Byte step per dimension |
| suboffsets | Indirect buffer support |
| internal | Exporter-private data |
A simple byte buffer may use only buf, len, readonly, and obj.
A multidimensional array needs ndim, itemsize, format, shape, and strides.
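Most of these fields can be inspected from Python, because memoryview exposes them as attributes. The following is a minimal sketch, assuming array.array as the exporter; the `fields` dictionary is only an illustrative name, and `nbytes` is the Python-level counterpart of `len`.

```python
from array import array

# An exporter with typed, contiguous storage: three C doubles.
data = array("d", [1.0, 2.0, 3.0])
view = memoryview(data)

# memoryview attributes mirror the Py_buffer fields described above.
fields = {
    "len (nbytes)": view.nbytes,
    "itemsize": view.itemsize,
    "readonly": view.readonly,
    "ndim": view.ndim,
    "format": view.format,
    "shape": view.shape,
    "strides": view.strides,
}
for name, value in fields.items():
    print(name, "=", value)

view.release()
```

The `buf`, `obj`, and `internal` pointers have no direct attribute of this kind, although `view.obj` returns the exporting object.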
65.4 Simple Contiguous Buffers
A bytes object exposes a read-only contiguous buffer.
data = b"hello"
view = memoryview(data)
print(view.readonly)
print(view.nbytes)
At the C level:
Py_buffer view;
if (PyObject_GetBuffer(obj, &view, PyBUF_SIMPLE) < 0) {
return NULL;
}
/* view.buf points to bytes */
/* view.len is byte length */
PyBuffer_Release(&view);
PyBUF_SIMPLE requests a simple byte-oriented buffer.
The consumer should treat the memory as a flat array of bytes.
65.5 Writable Buffers
Some objects expose mutable memory.
Example:
data = bytearray(b"hello")
view = memoryview(data)
view[0] = ord("H")
print(data)
Native code can request a writable buffer:
Py_buffer view;
if (PyObject_GetBuffer(obj, &view, PyBUF_WRITABLE) < 0) {
return NULL;
}
char *p = (char *)view.buf;
p[0] = 'H';
PyBuffer_Release(&view);
If the exporter is read-only, the request fails and sets an exception.
This prevents code from mutating immutable objects such as bytes.
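The same protection is visible from Python. A small sketch, using only built-in types (the variable names are illustrative): writing through a view of bytes is refused, while a view of bytearray accepts the write.

```python
# A view over immutable bytes is read-only.
ro = memoryview(b"hello")
try:
    ro[0] = ord("H")
    write_failed = False
except TypeError:
    # memoryview refuses to modify read-only memory.
    write_failed = True

# A view over a bytearray is writable and mutates the exporter in place.
rw = memoryview(bytearray(b"hello"))
rw[0] = ord("H")
result = bytes(rw)

print(ro.readonly, write_failed, result)
```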
65.6 Buffer Flags
Consumers specify what kind of view they need.
Common flags:
| Flag | Meaning |
|---|---|
| PyBUF_SIMPLE | Flat byte buffer |
| PyBUF_WRITABLE | Writable buffer required |
| PyBUF_FORMAT | Request element format string |
| PyBUF_ND | Request dimensionality and shape |
| PyBUF_STRIDES | Request strides |
| PyBUF_C_CONTIGUOUS | Require C-contiguous layout |
| PyBUF_F_CONTIGUOUS | Require Fortran-contiguous layout |
| PyBUF_ANY_CONTIGUOUS | Require any contiguous layout |
| PyBUF_FULL | Request full buffer information |
A consumer should request the weakest view it needs.
For example, a hashing function only needs bytes:
PyBUF_SIMPLE
A numerical kernel may require:
PyBUF_FORMAT | PyBUF_ND | PyBUF_STRIDES
A C library requiring flat contiguous memory should ask for contiguity explicitly, for example with PyBUF_C_CONTIGUOUS.
65.7 Contiguous vs Strided Memory
Not all buffers are contiguous.
A one-dimensional contiguous buffer:
[ a b c d e f ]
has one linear memory range.
A strided view may skip bytes:
[ a _ b _ c _ d _ ]
A two-dimensional array can have row strides:
row 0: a b c
row 1: d e f
row 2: g h i
C-contiguous layout stores rows next to each other:
a b c d e f g h i
Fortran-contiguous layout stores columns next to each other:
a d g b e h c f i
The buffer protocol represents this using:
shape
strides
itemsize
65.8 Shape and Strides
For a two-dimensional array:
shape = [3, 4]
itemsize = 8
means:
3 rows
4 columns
8 bytes per element
Strides describe how many bytes to move to advance along each dimension.
C-contiguous double array:
shape = [3, 4]
strides = [32, 8]
because:
next row = 4 elements * 8 bytes = 32 bytes
next column = 1 element * 8 bytes = 8 bytes
Element address:
address(i, j) = buf + i * strides[0] + j * strides[1]
This lets one protocol describe compact arrays, slices, transposes, channels, images, and tensor-like data.
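The address formula can be checked from Python. This is a sketch under stated assumptions: the 3 by 4 matrix of doubles is packed by hand with struct, `element` is a hypothetical helper that applies the formula, and memoryview.cast is used only to confirm that CPython derives the same strides.

```python
import struct

rows, cols = 3, 4
itemsize = struct.calcsize("d")            # 8 bytes for a C double
raw = struct.pack(f"{rows * cols}d", *range(rows * cols))

# C-contiguous strides: one full row, then one element.
strides = (cols * itemsize, itemsize)

def element(buf, i, j):
    """Read element (i, j) using offset = i * strides[0] + j * strides[1]."""
    offset = i * strides[0] + j * strides[1]
    return struct.unpack_from("d", buf, offset)[0]

# The same memory viewed as a 2D buffer; no bytes are copied.
view = memoryview(raw).cast("d", shape=[rows, cols])

print(strides, view.strides)
print(element(raw, 1, 2), view[1, 2])
```

Both reads land on the same eight bytes, 48 bytes past the start of the buffer.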
65.9 Format Strings
The format field describes the type of each element.
Examples:
| Format | Meaning |
|---|---|
| B | unsigned byte |
| b | signed byte |
| h | short |
| i | int |
| l | long |
| f | float |
| d | double |
A consumer that cares about element type should request PyBUF_FORMAT and validate it.
Example:
if (view.itemsize != sizeof(double) ||
    view.format == NULL ||
    strcmp(view.format, "d") != 0) {
    PyBuffer_Release(&view);
    PyErr_SetString(PyExc_TypeError, "expected double buffer");
    return NULL;
}
Do not assume a buffer contains a particular type unless the protocol data confirms it.
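The same validation can be expressed at the Python level. A minimal sketch, where `require_doubles` is a hypothetical helper and not part of any library; it mirrors the C check by comparing both the format string and the item size.

```python
import struct
from array import array

def require_doubles(obj):
    """Return a memoryview of obj, or raise TypeError if it does not hold C doubles."""
    view = memoryview(obj)
    if view.format != "d" or view.itemsize != struct.calcsize("d"):
        view.release()                      # release on the error path as well
        raise TypeError("expected double buffer")
    return view

view = require_doubles(array("d", [1.0, 2.0]))
print(view.format, view.tolist())
```

Passing array("i", ...) or a bytes object to this helper raises TypeError, because their format strings are "i" and "B".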
65.10 memoryview
memoryview is the Python-level object for inspecting and slicing buffers.
data = bytearray(b"abcdef")
v = memoryview(data)
print(v[0])
print(v[1:4])
memoryview does not copy the underlying memory. It references the exporter.
This matters:
data = bytearray(b"abc")
v = memoryview(data)
v[0] = ord("A")
print(data)
The output:
bytearray(b'Abc')
The view modifies the original object.
65.11 Lifetime Rules
A buffer view must keep the exporter alive.
In Py_buffer, the obj field stores a reference to the exporting object. The consumer must release the buffer:
PyBuffer_Release(&view);
This call releases exporter-owned state and decrements the reference held by the view.
Common bug:
Py_buffer view;
if (PyObject_GetBuffer(obj, &view, PyBUF_SIMPLE) < 0) {
return NULL;
}
/* use view */
return PyLong_FromSsize_t(view.len); /* missing PyBuffer_Release */
Correct:
Py_buffer view;
if (PyObject_GetBuffer(obj, &view, PyBUF_SIMPLE) < 0) {
return NULL;
}
PyObject *result = PyLong_FromSsize_t(view.len);
PyBuffer_Release(&view);
return result;
Every successful PyObject_GetBuffer must have a matching PyBuffer_Release.
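At the Python level, memoryview offers the same acquire and release pairing through release() and the context manager protocol. A short sketch with illustrative variable names: once the view is released, further access is refused.

```python
data = bytearray(b"abc")

# Leaving the with block releases the underlying buffer.
with memoryview(data) as view:
    length = view.nbytes

# Using a released view raises ValueError instead of touching stale memory.
try:
    view[0]
    still_usable = True
except ValueError:
    still_usable = False

print(length, still_usable)
```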
65.12 Exporter Restrictions During Active Views
An exporter must not invalidate memory while consumers hold active views.
For example, a bytearray cannot be resized while exported buffers exist:
data = bytearray(b"abc")
v = memoryview(data)
data.append(100)
This raises BufferError because resizing might move memory and invalidate the view.
Custom exporters must obey the same principle. Once they export a buffer, they must keep the memory valid until the consumer releases it.
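The restriction lasts only as long as the export. A minimal sketch with bytearray as the exporter: the resize is refused while the view is active and succeeds after the view is released.

```python
data = bytearray(b"abc")
view = memoryview(data)

# While the view is active, the exporter refuses to resize.
try:
    data.append(100)
    blocked = False
except BufferError:
    blocked = True

# After release there are no active exports, so resizing is allowed again.
view.release()
data.append(100)

print(blocked, data)
```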
65.13 Writing a Buffer Consumer
A simple consumer that sums bytes:
static PyObject *
sum_bytes(PyObject *self, PyObject *args)
{
PyObject *obj;
Py_buffer view;
if (!PyArg_ParseTuple(args, "O", &obj)) {
return NULL;
}
if (PyObject_GetBuffer(obj, &view, PyBUF_SIMPLE) < 0) {
return NULL;
}
unsigned char *p = (unsigned char *)view.buf;
Py_ssize_t total = 0;
for (Py_ssize_t i = 0; i < view.len; i++) {
total += p[i];
}
PyBuffer_Release(&view);
return PyLong_FromSsize_t(total);
}
Python usage:
sum_bytes(b"abc")
sum_bytes(bytearray(b"abc"))
sum_bytes(memoryview(b"abc"))
The same C function works with many exporters.
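For comparison, a pure-Python consumer can follow the same acquire, use, release shape through memoryview. This is a sketch of the idea, not the C implementation above; it assumes the input is C-contiguous, which cast("B") requires.

```python
from array import array

def sum_bytes(obj):
    """Sum the raw bytes of any object that exports a C-contiguous buffer."""
    with memoryview(obj) as view:           # acquire, released on exit
        return sum(view.cast("B"))          # reinterpret as unsigned bytes

print(sum_bytes(b"abc"))
print(sum_bytes(bytearray(b"abc")))
print(sum_bytes(memoryview(b"abc")))
print(sum_bytes(array("B", [1, 2, 3])))
```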
65.14 Handling Errors in Buffer Consumers
Always release the buffer on every path after acquisition.
static PyObject *
first_byte(PyObject *self, PyObject *args)
{
PyObject *obj;
Py_buffer view;
if (!PyArg_ParseTuple(args, "O", &obj)) {
return NULL;
}
if (PyObject_GetBuffer(obj, &view, PyBUF_SIMPLE) < 0) {
return NULL;
}
if (view.len == 0) {
PyBuffer_Release(&view);
PyErr_SetString(PyExc_ValueError, "empty buffer");
return NULL;
}
unsigned char value = ((unsigned char *)view.buf)[0];
PyBuffer_Release(&view);
return PyLong_FromUnsignedLong(value);
}
The pattern mirrors reference cleanup.
65.15 Requiring Contiguous Memory
Some C libraries require a single contiguous memory block.
Ask explicitly:
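if (PyObject_GetBuffer(obj, &view, PyBUF_STRIDES) < 0) {
    return NULL;
}
/* PyBUF_STRIDES accepts strided exporters, so the layout is checked here. */
if (!PyBuffer_IsContiguous(&view, 'C')) {
    PyBuffer_Release(&view);
    PyErr_SetString(PyExc_BufferError, "expected C-contiguous buffer");
    return NULL;
}
Requesting PyBUF_C_CONTIGUOUS instead makes the exporter itself reject layouts that are not C-contiguous. For non-contiguous input, consumers can either reject it or copy it into a contiguous buffer.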
Rejecting is simpler. Copying is more flexible.
65.16 Copying from Non-Contiguous Buffers
CPython provides helpers for copying buffer data into contiguous storage.
Conceptual pattern:
Py_buffer view;
if (PyObject_GetBuffer(obj, &view, PyBUF_FULL_RO) < 0) {
return NULL;
}
char *copy = PyMem_Malloc(view.len);
if (copy == NULL) {
PyBuffer_Release(&view);
return PyErr_NoMemory();
}
if (PyBuffer_ToContiguous(copy, &view, view.len, 'C') < 0) {
PyMem_Free(copy);
PyBuffer_Release(&view);
return NULL;
}
/* use copy */
PyMem_Free(copy);
PyBuffer_Release(&view);
This keeps the C library interface simple while accepting strided inputs.
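The same reject-or-copy choice can be observed from Python. A small sketch with illustrative names: slicing with a step produces a strided view that is not contiguous, and tobytes() copies it into flat storage, much as PyBuffer_ToContiguous does in C.

```python
data = bytearray(b"a_b_c_d_")

# Every second byte: a strided, non-contiguous view of the same memory.
strided = memoryview(data)[::2]

# A consumer can reject it based on the contiguity flag ...
is_contiguous = strided.contiguous

# ... or copy it into a new contiguous bytes object.
flat = strided.tobytes()

print(strided.strides, is_contiguous, flat)
```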
65.17 Writing a Buffer Exporter
A custom type exports a buffer by implementing bf_getbuffer and bf_releasebuffer.
Example object:
typedef struct {
PyObject_HEAD
char *data;
Py_ssize_t len;
int exports;
} BlobObject;
Buffer methods:
static int
Blob_getbuffer(BlobObject *self, Py_buffer *view, int flags)
{
if (view == NULL) {
PyErr_SetString(PyExc_BufferError, "NULL view");
return -1;
}
return PyBuffer_FillInfo(
view,
(PyObject *)self,
self->data,
self->len,
0,
flags
);
}
static void
Blob_releasebuffer(BlobObject *self, Py_buffer *view)
{
/* optional exporter cleanup */
}
Attach through PyBufferProcs:
static PyBufferProcs Blob_bufferprocs = {
.bf_getbuffer = (getbufferproc)Blob_getbuffer,
.bf_releasebuffer = (releasebufferproc)Blob_releasebuffer,
};
Then in the type:
.tp_as_buffer = &Blob_bufferprocs,
Now Python can do:
b = Blob(...)
v = memoryview(b)
65.18 Tracking Active Exports
If an exporter owns resizable memory, it should track active exports.
static int
Blob_getbuffer(BlobObject *self, Py_buffer *view, int flags)
{
int ret = PyBuffer_FillInfo(
view,
(PyObject *)self,
self->data,
self->len,
0,
flags
);
if (ret == 0) {
self->exports++;
}
return ret;
}
static void
Blob_releasebuffer(BlobObject *self, Py_buffer *view)
{
self->exports--;
}
Before resizing:
if (self->exports > 0) {
PyErr_SetString(
PyExc_BufferError,
"cannot resize while buffers are exported"
);
return NULL;
}
This prevents dangling pointers.
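The export-counting idea can also be prototyped in pure Python. The sketch below assumes Python 3.12 or later, where PEP 688 lets a class define `__buffer__` and `__release_buffer__`; the `Blob` class, its `resize` method, and `demo` are hypothetical names chosen to mirror the C example, not an existing API.

```python
import sys

class Blob:
    """Hypothetical pure-Python exporter; needs PEP 688 (Python 3.12+)."""

    def __init__(self, data):
        self._data = bytearray(data)
        self.exports = 0                    # number of active exported views

    def __buffer__(self, flags):
        self.exports += 1
        return memoryview(self._data)

    def __release_buffer__(self, view):
        self.exports -= 1
        view.release()

    def resize(self, size):
        if self.exports > 0:
            raise BufferError("cannot resize while buffers are exported")
        # Truncate or zero-pad to the requested size.
        self._data = self._data[:size].ljust(size, b"\0")

def demo():
    blob = Blob(b"abc")
    view = memoryview(blob)                 # calls Blob.__buffer__
    try:
        blob.resize(10)
        blocked = False
    except BufferError:
        blocked = True
    view.release()                          # calls Blob.__release_buffer__
    blob.resize(10)
    return blocked, blob.exports, len(blob._data)

# On older interpreters memoryview(blob) is not supported, so skip the demo.
if sys.version_info >= (3, 12):
    print(demo())
```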
65.19 Read-Only Exporters
The readonly argument to PyBuffer_FillInfo controls writability.
Read-only:
PyBuffer_FillInfo(
view,
(PyObject *)self,
self->data,
self->len,
1,
flags
);
Writable:
PyBuffer_FillInfo(
view,
(PyObject *)self,
self->data,
self->len,
0,
flags
);
If a consumer requests PyBUF_WRITABLE from a read-only exporter, the request fails.
This is how immutable binary objects protect their storage.
65.20 Multidimensional Exporters
For arrays, exporters must provide shape, strides, item size, and format.
Example for a 2D double matrix:
rows = 3
cols = 4
itemsize = 8
shape = [3, 4]
strides = [32, 8]
format = "d"
The exporter must ensure that the shape and stride arrays remain valid while the view exists. They are often stored in the object itself or in exporter-private memory.
A full exporter must handle flag requests correctly. If the consumer asks for shape or format and the exporter cannot provide them, it should fail with BufferError.
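An existing multidimensional exporter shows what a consumer receives. The sketch below uses a nested ctypes array as a ready-made 3 by 4 double matrix; this is an illustrative choice, not the exporter described above. Note that ctypes reports an explicit byte-order prefix in its format string, so only the layout fields are inspected here.

```python
import ctypes

# 3 rows of 4 C doubles, stored in one contiguous block.
Matrix = (ctypes.c_double * 4) * 3
matrix = Matrix()
matrix[1][2] = 6.0

view = memoryview(matrix)
layout = (view.ndim, view.shape, view.strides, view.itemsize)

print(layout, view.nbytes, view.format)
```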
65.21 Buffer Protocol and Zero Copy
The buffer protocol enables zero-copy paths.
Example:
socket reads bytes
↓
bytearray stores mutable memory
↓
memoryview slices without copy
↓
parser reads view
↓
native extension decodes fields
For large data, zero-copy can dominate performance.
Copying a 1 GB buffer costs both memory bandwidth and allocation overhead. Passing a view costs pointer setup and lifetime tracking.
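The pipeline can be sketched in Python. The packet layout below, a 2-byte big-endian length followed by that many payload bytes, is an assumption made up for illustration, and `parse_fields` is a hypothetical helper. Each field is a memoryview slice, so no payload bytes are copied.

```python
import struct

# Two length-prefixed fields: "hello" and "abc".
packet = bytearray(b"\x00\x05hello\x00\x03abc")
view = memoryview(packet)

def parse_fields(buf):
    """Split buf into payload slices without copying the payload bytes."""
    fields, offset = [], 0
    while offset < len(buf):
        (length,) = struct.unpack_from(">H", buf, offset)
        offset += 2
        fields.append(buf[offset:offset + length])   # slice of a view: no copy
        offset += length
    return fields

fields = parse_fields(view)

# The slices alias the original storage, so an in-place write shows through.
packet[2] = ord("H")
print([bytes(f) for f in fields])
```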
This is why the protocol matters for:
images
audio
video
tensors
network packets
database pages
compressed blocks
memory-mapped files
65.22 Buffer Protocol and the GIL
Acquiring and releasing buffers touches Python objects, so it normally requires the GIL.
But after acquiring a stable buffer, native code may release the GIL while processing raw memory, if it does not call Python APIs and the exporter guarantees valid storage.
Pattern:
if (PyObject_GetBuffer(obj, &view, PyBUF_SIMPLE) < 0) {
return NULL;
}
Py_BEGIN_ALLOW_THREADS
process_raw_memory(view.buf, view.len);
Py_END_ALLOW_THREADS
PyBuffer_Release(&view);
This allows CPU-bound native processing to run without blocking other Python threads from acquiring the GIL.
65.23 Buffer Protocol vs Sequence Protocol
The sequence protocol exposes elements as Python objects.
x = obj[i]
The buffer protocol exposes raw memory.
buf + offset
Comparison:
| Feature | Sequence protocol | Buffer protocol |
|---|---|---|
| Access level | Python objects | Raw memory |
| Copy-free binary access | No | Yes |
| Type metadata | Python-level | Format string |
| Multidimensional layout | Indirect | Shape and strides |
| Use case | General containers | Binary and numerical data |
A list of integers does not normally expose a buffer because its memory contains pointers to Python objects, not raw integer values.
An array.array("i") can expose a buffer because its memory stores raw C integers.
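The difference is easy to confirm from Python. A short sketch with illustrative names: memoryview rejects a list because it is not a buffer exporter, while array.array exposes its raw C integers.

```python
from array import array

# A list stores pointers to Python objects, so it exports no buffer.
try:
    memoryview([1, 2, 3])
    list_exports = True
except TypeError:
    list_exports = False

# array.array stores raw C ints and exports them directly.
ints = array("i", [1, 2, 3])
view = memoryview(ints)

print(list_exports, view.format, view.nbytes, view.tolist())
```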
65.24 Buffer Protocol vs C API Type-Specific Access
Some APIs expose direct access to specific object types.
Example:
char *p = PyBytes_AS_STRING(obj);
Py_ssize_t n = PyBytes_GET_SIZE(obj);
This works only for bytes.
The buffer protocol works across many exporters.
| Approach | Scope |
|---|---|
| PyBytes_AS_STRING | bytes only |
| PyByteArray_AS_STRING | bytearray only |
| PyObject_GetBuffer | Any buffer exporter |
Prefer the buffer protocol when accepting generic binary data.
65.25 Common Buffer Bugs
| Bug | Cause |
|---|---|
| Missing PyBuffer_Release | Exporter stays pinned |
| Writing to read-only memory | Missing writable check |
| Assuming contiguity | Ignoring strides |
| Assuming element type | Ignoring format |
| Resizing during export | Invalidates active views |
| Releasing GIL too early | Python API calls without GIL |
| Returning pointer after release | Dangling pointer |
| Storing view.buf long-term | Buffer lifetime violation |
The safest rule: treat view.buf as valid only between successful PyObject_GetBuffer and matching PyBuffer_Release.
65.26 Practical Design Guidelines
For consumers:
| Need | Request |
|---|---|
| Raw bytes only | PyBUF_SIMPLE |
| Must write | PyBUF_WRITABLE |
| Must know type | PyBUF_FORMAT |
| Must support arrays | PyBUF_ND, or PyBUF_STRIDES when strides are needed |
| Must call C library | Require contiguity or copy |
For exporters:
| Requirement | Rule |
|---|---|
| Memory can resize | Track active exports |
| Object stores references | Add GC support separately |
| Memory read-only | Mark buffer read-only |
| Multidimensional data | Provide stable shape and strides |
| Custom allocation | Keep storage valid until release |
65.27 Chapter Summary
The buffer protocol is CPython’s common interface for exposing raw memory. It lets objects such as bytes, bytearray, memoryview, array.array, mmap, numerical arrays, and custom extension types share memory with native code without copying.
Consumers acquire a Py_buffer, use the memory and metadata, then release it. Exporters fill the view and keep memory valid while it is exported.
The protocol supports read-only memory, writable memory, contiguous buffers, multidimensional arrays, typed elements, strides, and zero-copy slicing. It is central to CPython’s performance story for binary data and native interoperability.