PEP 684 per-interpreter GIL: isolated interpreter state, shared-nothing model, and module isolation requirements.
The per-interpreter GIL is a CPython runtime design where each subinterpreter owns its own Global Interpreter Lock instead of all interpreters sharing one process-wide lock.
The traditional model used one GIL for the whole process:
process
  runtime
    global GIL
    interpreter A
    interpreter B
    interpreter C

The per-interpreter model moves the lock down into each interpreter:
process
  runtime
    interpreter A
      GIL A
    interpreter B
      GIL B
    interpreter C
      GIL C

This allows separate interpreters to execute Python bytecode in parallel, as long as they do not share unsafe runtime state.
The per-interpreter GIL is different from removing the GIL completely. It preserves the GIL inside each interpreter, but gives each interpreter its own independent lock.
94.1 Why Per-Interpreter GIL Exists
The original GIL solved many correctness problems:
reference counting
object mutation
allocator state
import state
runtime caches
C extension assumptions

But one process-wide GIL also meant that all Python threads in the process competed for one global execution lock.
Subinterpreters already existed in CPython. They allowed multiple interpreter states inside one process, but historically they all shared the single process-wide GIL and a large amount of global runtime state, so they could not execute Python bytecode in parallel.
The per-interpreter GIL attempts a middle path:
keep the GIL model inside one interpreter
allow multiple interpreters to run independently
reduce global runtime sharing
avoid requiring all code to become fully free-threaded

It is less radical than free-threaded CPython, but still requires deep runtime changes.
94.2 Interpreter State
A CPython process contains runtime state and one or more interpreter states.
Conceptually:
typedef struct _is PyInterpreterState;
typedef struct _ts PyThreadState;

A PyInterpreterState owns interpreter-level data.
Examples:
module dictionary
builtins
import machinery
codec state
warnings state
GC state
thread states
interpreter configuration
runtime caches

A PyThreadState represents one thread executing inside an interpreter.
Conceptually:
PyInterpreterState
  PyThreadState
  PyThreadState
  PyThreadState

The per-interpreter GIL makes the interpreter state the unit of bytecode execution locking.
94.3 Traditional Process-Wide GIL
In the older model (before PEP 684), the process had a single GIL shared by every interpreter.
Even if a process contained multiple interpreters:
interpreter A
interpreter B

only one thread could execute Python bytecode at a time across both.
Conceptually:
Thread 1 in interpreter A acquires global GIL
Thread 2 in interpreter B waits
Thread 1 releases global GIL
Thread 2 acquires global GIL

This limited the scalability of subinterpreters. They provided isolation of some state, but not parallel Python execution.
94.4 Per-Interpreter GIL Model
With a per-interpreter GIL, each interpreter has its own lock.
Thread 1 in interpreter A acquires GIL A
Thread 2 in interpreter B acquires GIL B
both execute Python bytecode concurrently

This changes the concurrency model.
Parallelism becomes possible when execution is split across interpreters rather than merely across threads in one interpreter.
A process can then use multiple CPU cores without removing the GIL inside each interpreter.
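The difference between the two locking designs can be illustrated with a toy model. The sketch below is not how CPython implements the GIL: `ToyInterpreter` and `lock_is_free_while_held` are hypothetical names, and a plain `threading.Lock` stands in for each interpreter's GIL. It only shows that holding one interpreter's lock leaves another interpreter's lock available.

```python
import threading


class ToyInterpreter:
    """Hypothetical stand-in for an interpreter that owns its own lock."""

    def __init__(self, name):
        self.name = name
        self.gil = threading.Lock()  # plays the role of this interpreter's GIL


def lock_is_free_while_held(held, other):
    """Hold `held`'s lock and report whether `other`'s lock can still be taken."""
    with held.gil:
        # Non-blocking attempt, so the demo can never deadlock.
        acquired = other.gil.acquire(blocking=False)
        if acquired:
            other.gil.release()
        return acquired


a = ToyInterpreter("A")
b = ToyInterpreter("B")

different_interpreters = lock_is_free_while_held(a, b)  # two separate locks
same_interpreter = lock_is_free_while_held(a, a)        # one lock, already held

print(different_interpreters)  # True
print(same_interpreter)        # False
```

In the toy model, work in interpreter B is not blocked by work in interpreter A, while a second attempt inside interpreter A has to wait, which mirrors the table of behaviours described in this chapter.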
94.5 Difference From Free-Threaded CPython
Per-interpreter GIL and free-threaded CPython solve related problems differently.
| Model | Locking design | Parallel bytecode execution |
|---|---|---|
| Traditional GIL | One GIL per process | No, not across Python threads |
| Per-interpreter GIL | One GIL per interpreter | Yes, across interpreters |
| Free-threaded CPython | No traditional GIL | Yes, inside one interpreter |
The per-interpreter GIL keeps many old assumptions valid within each interpreter:
only one thread executes bytecode in this interpreter
reference counting remains simpler locally
container mutation remains serialized locally
many C extension assumptions remain closer to traditional CPython

Free-threaded CPython removes that protection and replaces it with fine-grained synchronization.
94.6 Why Subinterpreter Isolation Matters
Per-interpreter GIL only works if interpreters do not share unsafe mutable state.
If two interpreters share the same mutable object:
interpreter A mutates object
interpreter B mutates same object

then separate GILs do not protect the object.
The old process-wide GIL implicitly serialized access to shared runtime state. Once each interpreter has its own GIL, that state is no longer protected by any single lock.
Therefore CPython must move data from process-global state into interpreter-local state.
Examples:
module state
import state
GC state
exception state
runtime caches
interned objects
type metadata where possible

The more state becomes interpreter-local, the safer parallel subinterpreters become.
94.7 Runtime Global State
CPython historically used many process-global variables.
Examples:
static runtime caches
global singletons
global freelists
global type state
global import machinery data
global extension module state

These globals were convenient because the process-wide GIL serialized access.
With a per-interpreter GIL, such globals become concurrency hazards.
A process-global mutable value must either be:
made immutable
protected by its own lock
moved into PyInterpreterState
made thread-local
eliminated

This creates a large refactoring burden.
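Two of these options can be sketched in Python as a rough analogy. All names here (`bump_global`, `ToyInterpreterState`, `cached_square`) are hypothetical, and the real refactoring happens in C inside the runtime. The sketch only contrasts a global guarded by its own lock with state owned by an interpreter-like object.

```python
import threading

# Option 1: keep the value process-global, but protect it with its own lock.
_global_counter = 0
_global_counter_lock = threading.Lock()


def bump_global():
    """Increment the shared counter under its dedicated lock."""
    global _global_counter
    with _global_counter_lock:
        _global_counter += 1
        return _global_counter


# Option 2: move the value into per-interpreter state.
class ToyInterpreterState:
    """Hypothetical stand-in for PyInterpreterState."""

    def __init__(self):
        self.cache = {}  # formerly a process-global cache


def cached_square(state, n):
    """Look up n*n in the cache owned by the given interpreter state."""
    if n not in state.cache:
        state.cache[n] = n * n
    return state.cache[n]


state_a = ToyInterpreterState()
state_b = ToyInterpreterState()
cached_square(state_a, 4)  # fills only state_a's cache

first = bump_global()
second = bump_global()

print(first, second)                 # 1 2
print(state_a.cache, state_b.cache)  # {4: 16} {}
```

The second option needs no lock at all as long as each state object is only touched from its own interpreter, which is why moving state into the interpreter is usually preferred where it is possible.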
94.8 Extension Module State
Extension modules are a major challenge.
Old extension modules often used process-global C variables:
static PyObject *cache;
static int initialized;
static PyTypeObject MyType;

This pattern assumes one global interpreter context.
In a subinterpreter world, it is problematic.
If two interpreters import the same extension module, global state may be shared accidentally:
interpreter A imports module
interpreter B imports module
both use same static C globals

That can break isolation.
Modern extension design prefers per-module state:
typedef struct {
    PyObject *cache;
    PyObject *error_type;
} ModuleState;

Each interpreter gets its own module instance and its own module state.
94.9 Multi-Phase Module Initialization
Multi-phase initialization helps extension modules work with subinterpreters.
Instead of one global initialization function building one process-wide module object, an extension can define module creation and execution phases.
Conceptually:
create module object
allocate per-module state
execute module initialization
store state in module instance

This allows each interpreter to get a separate module instance.
The extension can retrieve its state from the module object rather than from static globals.
Simplified pattern:
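#include <Python.h>

/* Per-module state: one copy per module instance, so one per interpreter. */
typedef struct {
    PyObject *CacheType;
} mod_state;

/* Exec-phase function, run once for each new module instance. */
static int
module_exec(PyObject *m)
{
    /* Fetch this instance's state instead of reading a static global. */
    mod_state *st = PyModule_GetState(m);
    if (st == NULL) {
        return -1;
    }
    /* create_cache_type() is a placeholder for the module's own type factory. */
    st->CacheType = create_cache_type();
    if (st->CacheType == NULL) {
        return -1;
    }
    /* PyModule_AddObjectRef (Python 3.10+) does not steal the reference. */
    return PyModule_AddObjectRef(m, "Cache", st->CacheType);
}

This is more compatible with multiple interpreters.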
94.10 Objects Cannot Be Freely Shared
Ordinary Python objects generally cannot be passed directly between interpreters.
A list created in interpreter A belongs to interpreter A:
xs = [1, 2, 3]

The list references type objects, allocator state, GC metadata, and other interpreter-specific structures.
Passing that list directly into interpreter B would create ownership and synchronization problems.
Instead, inter-interpreter communication should use safe channels:
serialization
copying
immutable shareable objects
explicit cross-interpreter data APIs
message passing

This keeps interpreter heaps separate.
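Copying through serialization is the simplest of these channels to demonstrate. The sketch below uses `pickle` inside a single interpreter purely to show the effect of copying: the receiving side gets an independent object, so a mutation on one side is invisible to the other. It does not use any cross-interpreter API.

```python
import pickle

xs = [1, 2, 3]

# "Send": turn the list into bytes, which carry no references into the sender's heap.
payload = pickle.dumps(xs)

# "Receive": rebuild a brand-new list from the bytes.
received = pickle.loads(payload)
received.append(4)

print(xs)              # [1, 2, 3]
print(received)        # [1, 2, 3, 4]
print(received is xs)  # False
```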
94.11 Shareable Objects
Some objects are safer to share than others.
Good candidates:
None
booleans
small immutable values
bytes
strings
immutable memory views
simple serialized data

Bad candidates:
list
dict
set
user-defined mutable objects
open files
generators
frames
coroutines
locks

The safest model treats interpreters as isolated runtimes that exchange messages rather than share object graphs.
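As a rough illustration of this split, a hypothetical helper might classify values conservatively by their exact type. `looks_shareable` and `_SAFE_TYPES` are assumptions made for this sketch, not the rule CPython applies; the actual set of shareable types is decided by the runtime.

```python
# Simple immutable types treated as safe to hand across in this sketch.
_SAFE_TYPES = (type(None), bool, int, float, str, bytes)


def looks_shareable(value):
    """Return True only for values whose exact type is a simple immutable one.

    The exact-type check deliberately rejects subclasses, since a subclass
    instance may carry extra mutable attributes.
    """
    return type(value) in _SAFE_TYPES


samples = [None, True, 42, "text", b"raw", [1, 2], {"k": 1}, {1, 2}]
report = [(type(s).__name__, looks_shareable(s)) for s in samples]

for type_name, ok in report:
    print(type_name, ok)
```

Anything the helper rejects would have to be copied or serialized before it crosses the interpreter boundary.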
94.12 Message Passing Model
A practical subinterpreter design resembles actor-style concurrency.
Each interpreter owns its state:
interpreter A owns heap A
interpreter B owns heap B

They communicate through explicit channels:
interpreter A sends message
runtime copies or transfers safe data
interpreter B receives message

This avoids shared mutable state.
Conceptually:
worker interpreter
  receive task
  run Python code
  send result

A thread pool based on subinterpreters can then run CPU-bound Python code in parallel, while preserving a simpler per-interpreter GIL model.
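The receive, run, send loop can be sketched with an ordinary thread and two queues. This is a stand-in, not a subinterpreter: the worker below runs in the same interpreter as the sender, and `pickle` is used only to mimic the copy that would happen at the interpreter boundary. The message layout (`id`, `numbers`, `result`) is invented for the example.

```python
import pickle
import queue
import threading


def worker(inbox, outbox):
    """Receive pickled tasks, run them, and send back pickled results."""
    while True:
        raw = inbox.get()
        if raw is None:  # sentinel: no more tasks
            break
        task = pickle.loads(raw)        # receive: rebuild a private copy
        total = sum(task["numbers"])    # run: the actual work
        reply = {"id": task["id"], "result": total}
        outbox.put(pickle.dumps(reply)) # send: only bytes leave the worker


inbox = queue.Queue()
outbox = queue.Queue()
thread = threading.Thread(target=worker, args=(inbox, outbox))
thread.start()

inbox.put(pickle.dumps({"id": 1, "numbers": [1, 2, 3]}))
inbox.put(pickle.dumps({"id": 2, "numbers": [10, 20]}))
inbox.put(None)
thread.join()

results = {}
while not outbox.empty():
    message = pickle.loads(outbox.get())
    results[message["id"]] = message["result"]

print(results)  # {1: 6, 2: 30}
```

Because only bytes travel through the queues, the sender and the worker never hold a reference to the same mutable object, which is the property the real design relies on.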
94.13 Reference Counting With Per-Interpreter GIL
Inside one interpreter, reference counting remains protected by that interpreter’s GIL.
Thread A in interpreter X holds GIL X
updates refcounts for objects in interpreter X

This avoids full atomic reference counting for ordinary interpreter-local objects.
However, globally shared immortal objects and runtime-level objects still require special handling.
The rule becomes:
interpreter-local objects use interpreter-local protection
process-global objects need global safety

This boundary is central to the design.
94.14 Garbage Collection Per Interpreter
The cyclic garbage collector is naturally interpreter-scoped.
Each interpreter has its own object graph:
interpreter A heap
interpreter B heap

Each graph can be collected independently.
This has useful properties:
GC pauses can be interpreter-local
cycles do not cross interpreter heaps
object ownership is clearer
finalizers run in the owning interpreter

But it requires that object graphs do not contain unsafe cross-interpreter references.
94.15 Import System Isolation
The import system is another major area.
Each interpreter should have its own module table:
import sys
sys.modules

If interpreter A imports mymodule, and interpreter B imports mymodule, they should generally get separate module objects.
This preserves module globals isolation.
Example:
# in interpreter A
import config
config.value = 10
# in interpreter B
import config
print(config.value)

Interpreter B should not accidentally observe interpreter A’s module global mutation unless communication is explicit.
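The expected behaviour can be modelled with two separate module tables. In this sketch, `toy_import`, `modules_a`, `modules_b`, and `CONFIG_SOURCE` are hypothetical stand-ins for each interpreter's import machinery, its `sys.modules`, and the source of `config`. Real subinterpreters get this separation from the runtime rather than from user code.

```python
import types

CONFIG_SOURCE = "value = 0\n"  # assumed contents of config.py for the demo


def toy_import(modules, name, source):
    """Import `name` into one toy module table, executing the source once."""
    if name not in modules:
        module = types.ModuleType(name)
        exec(source, module.__dict__)  # run the module body in its own namespace
        modules[name] = module
    return modules[name]


modules_a = {}  # toy sys.modules for interpreter A
modules_b = {}  # toy sys.modules for interpreter B

config_a = toy_import(modules_a, "config", CONFIG_SOURCE)
config_a.value = 10

config_b = toy_import(modules_b, "config", CONFIG_SOURCE)

print(config_a.value)        # 10
print(config_b.value)        # 0
print(config_a is config_b)  # False
```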
94.16 Builtins and Runtime Constants
Builtins are heavily shared in traditional CPython.
Examples:
None
True
False
Ellipsis
NotImplemented
int
str
list
dict
object
type

Some of these can be safely immortal and shared. Others may require interpreter-local state or careful synchronization.
The runtime must classify objects by lifetime and ownership:
| Object kind | Typical handling |
|---|---|
| Immutable singleton | Immortal and shareable |
| Builtin type | Often shared or specially managed |
| Module object | Interpreter-local |
| User object | Interpreter-local |
| Frame | Thread/interpreter-local |
| Mutable cache | Interpreter-local or locked |
94.17 Type Objects and Interpreter Isolation
Type objects are complicated.
A type object may hold:
method table
slot functions
base classes
MRO
subclasses
dict
cache data
module state references

Static built-in types can often be shared because they are effectively permanent and carefully managed.
Heap types created by Python code are interpreter-local.
Example:
class User:
    pass

The resulting User type belongs to the interpreter that created it.
Sharing it directly with another interpreter would expose mutable type dictionaries, subclass lists, descriptors, and cached lookup state.
94.18 Thread State and GIL Ownership
Each OS thread executing Python code has a PyThreadState.
In the per-interpreter model, the thread state belongs to one interpreter at a time:
PyThreadState
  interpreter pointer
  current frame
  exception state
  recursion state

To execute bytecode, the thread must acquire that interpreter’s GIL.
Conceptually:
attach thread state to interpreter
acquire interpreter GIL
execute Python code
release interpreter GIL
detach or switch

Switching between interpreters is possible, but must be explicit and carefully managed.
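The attach, acquire, execute, release, detach sequence can be mimicked with a context manager. This is a toy model under stated assumptions: `ToyInterpreter`, `ToyThreadState`, and `running_in` are hypothetical names, and a `threading.Lock` stands in for the interpreter's GIL. The real steps are performed in C by the runtime.

```python
import threading
from contextlib import contextmanager


class ToyInterpreter:
    """Hypothetical interpreter that owns one lock."""

    def __init__(self, name):
        self.name = name
        self.gil = threading.Lock()


class ToyThreadState:
    """Hypothetical thread state: attached to at most one interpreter."""

    def __init__(self):
        self.interp = None


@contextmanager
def running_in(tstate, interp):
    """Attach the thread state, hold the interpreter's lock, then undo both."""
    tstate.interp = interp   # attach thread state to interpreter
    interp.gil.acquire()     # acquire that interpreter's lock
    try:
        yield                # "execute Python code"
    finally:
        interp.gil.release() # release the lock
        tstate.interp = None # detach


log = []
ts = ToyThreadState()
a = ToyInterpreter("A")
b = ToyInterpreter("B")

with running_in(ts, a):
    log.append((ts.interp.name, a.gil.locked(), b.gil.locked()))

# Switching is explicit: leave A completely before entering B.
with running_in(ts, b):
    log.append((ts.interp.name, a.gil.locked(), b.gil.locked()))

print(log)  # [('A', True, False), ('B', False, True)]
```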
94.19 Scheduling Model
The per-interpreter GIL does not by itself create a scheduler.
It provides a locking model.
Scheduling still depends on:
OS threads
application thread pools
embedding host
subinterpreter API
task dispatch system

A runtime can create several interpreters and assign one worker thread to each.
Conceptually:
main interpreter
  dispatch task 1 to interpreter A
  dispatch task 2 to interpreter B
  dispatch task 3 to interpreter C

Each worker interpreter can execute Python code independently.
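A dispatcher of this shape can be sketched with one thread and one inbox per worker. The sketch is an analogy only: the workers are ordinary threads in a single interpreter, so it shows the dispatch pattern, not real parallel bytecode execution. The names `run_worker`, `worker_names`, and the round-robin policy are choices made for the example.

```python
import queue
import threading


def run_worker(name, inbox, outbox):
    """Process (task_id, number) items until a None sentinel arrives."""
    while True:
        item = inbox.get()
        if item is None:
            break
        task_id, number = item
        outbox.put((task_id, name, number * number))


worker_names = ["A", "B", "C"]  # one worker per toy interpreter
inboxes = {name: queue.Queue() for name in worker_names}
outbox = queue.Queue()
threads = [
    threading.Thread(target=run_worker, args=(name, inboxes[name], outbox))
    for name in worker_names
]
for thread in threads:
    thread.start()

# Round-robin dispatch: task i goes to worker i modulo the worker count.
numbers = [2, 3, 4, 5, 6, 7]
for task_id, number in enumerate(numbers):
    target = worker_names[task_id % len(worker_names)]
    inboxes[target].put((task_id, number))

for name in worker_names:
    inboxes[name].put(None)  # tell each worker to stop
for thread in threads:
    thread.join()

# Results arrive in no fixed order, so sort them by task id.
done = sorted(outbox.get() for _ in numbers)
for entry in done:
    print(entry)
```

The scheduling policy lives entirely in the dispatcher, which matches the point above: the locking model says which work may run in parallel, not how work is assigned.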
94.20 Comparison With Multiprocessing
Subinterpreters with per-interpreter GIL overlap with multiprocessing, but they have different tradeoffs.
| Feature | Multiprocessing | Subinterpreters |
|---|---|---|
| Isolation | OS process boundary | Interpreter boundary |
| Parallelism | Yes | Yes, with per-interpreter GIL |
| Memory sharing | Separate address spaces | Same process address space |
| Startup cost | Higher | Lower |
| Crash isolation | Stronger | Weaker |
| Object sharing | Serialization needed | Usually message passing or restricted sharing |
| C extension safety | Process-isolated | Must be subinterpreter-safe |
Subinterpreters can be lighter than processes, but they provide weaker fault isolation.
A crash in native code can still bring down the whole process.
94.21 Comparison With Threads
Normal Python threads share one interpreter.
one interpreter
  many threads
  one GIL

Subinterpreter workers use multiple interpreters:
many interpreters
  one or more threads each
  one GIL each

Threads are easier for shared-memory programming.
Subinterpreters are better for isolated parallel execution.
The programming model shifts from shared objects to explicit communication.
94.22 Advantages
Per-interpreter GIL offers several advantages:
true parallel bytecode execution across interpreters
less radical than full free-threading
clearer isolation boundary
better fit for plugin systems
lower overhead than multiprocessing in some cases
keeps many traditional GIL assumptions inside one interpreter

It can support workloads such as:
CPU-bound task pools
server plugin isolation
parallel data processing
embedded scripting runtimes
independent user code execution
94.23 Costs and Limitations
The model also has costs:
extension modules must support subinterpreters correctly
objects cannot be freely shared
global runtime state must be removed or protected
debugging becomes more complex
memory use may increase due to duplicated interpreter state
some libraries assume one interpreter per process

A subinterpreter pool may duplicate:
module imports
module globals
caches
class objects
runtime metadata

This can consume more memory than a thread pool.
94.24 Common Misunderstandings
The per-interpreter GIL does not mean every Python thread runs in parallel.
Threads inside the same interpreter still share that interpreter’s GIL.
It also does not mean Python objects are automatically thread-safe across interpreters.
The correct model is:
parallelism comes from multiple interpreters
safety comes from isolation
communication must be explicit
94.25 Design Pressure on CPython
Per-interpreter GIL forces CPython to become more modular internally.
Old style:
static PyObject *global_cache;

New style:
state belongs to runtime, interpreter, module, or thread

Every piece of state needs a clear owner.
This improves architecture even outside subinterpreters.
It makes CPython less dependent on hidden global variables and easier to reason about in concurrent settings.
94.26 Mental Model
Use this model:
A CPython process may contain many interpreters.
Each interpreter has:
  its own GIL
  its own module table
  its own import state
  its own garbage collector state
  its own thread states
  its own object heap boundaries
Threads can run Python bytecode in parallel when they execute in different interpreters.
Objects should stay inside the interpreter that owns them.
Communication should use explicit transfer, copying, serialization, or safe shareable values.

This model explains why per-interpreter GIL is useful and why it requires substantial runtime refactoring.
94.27 Chapter Summary
The per-interpreter GIL moves CPython from one process-wide execution lock to one lock per interpreter.
This enables parallel bytecode execution across subinterpreters while preserving the familiar GIL model inside each interpreter.
The design depends on interpreter isolation:
interpreter-local module state
interpreter-local object ownership
reduced process-global mutable state
subinterpreter-safe extension modules
explicit communication between interpreters

Per-interpreter GIL is a major step toward scalable CPython concurrency. It provides a middle path between traditional single-GIL CPython and fully free-threaded CPython.