48. Subinterpreters

A subinterpreter is an interpreter instance inside the same CPython process. It has its own interpreter state, module table, builtins, import state, and execution context. It shares the same operating system process with other interpreters, but it is logically separated from them at the Python runtime level.

A normal Python program usually runs with one main interpreter:

process
    main interpreter
        modules
        builtins
        sys
        thread states
        frames

With subinterpreters, the same process can contain more than one interpreter:

process
    interpreter A
        modules
        builtins
        sys
        thread states

    interpreter B
        modules
        builtins
        sys
        thread states

Subinterpreters are an advanced CPython feature. They sit between threads and processes: lighter than separate processes, more isolated than ordinary threads, but still constrained by object sharing, extension module state, and runtime-global resources.

48.1 Interpreter State

At the C level, CPython keeps interpreter-specific state in an interpreter state object. A simplified conceptual model is:

PyInterpreterState
    modules
    builtins
    import state
    codec state
    runtime configuration
    garbage collector state
    thread states
    pending calls
    audit hooks
    interpreter-specific caches

Each interpreter has one or more thread states.

PyInterpreterState
    PyThreadState
        current frame
        exception state
        recursion depth
        tracing state
        context state

A thread state belongs to exactly one interpreter for its entire lifetime. Python code executes through a thread state attached to an interpreter.

48.2 Main Interpreter

The main interpreter is created during CPython startup.

It initializes core runtime objects, builtins, sys, import machinery, standard streams, and the execution environment needed to run user code.

Most Python programs only use this interpreter.

CPython startup
    initialize runtime
    create main interpreter
    initialize import system
    run startup configuration
    execute user script or module

Subinterpreters are additional interpreters created after the runtime has started.

48.3 Why Subinterpreters Exist

Subinterpreters provide isolation inside one process.

Useful goals include:

run independent Python execution contexts
isolate module globals
avoid some shared-state interference
host plugins with separate module imports
support embedded Python use cases
reduce overhead compared with processes
enable future parallel execution models

An embedding application may want to run multiple independent Python scripts inside one process without giving each script the same module dictionary and global state.

A server may want isolated plugin environments.

A runtime may want lower-overhead concurrency than multiprocessing.

48.4 Subinterpreters vs Threads

Threads share one interpreter by default.

one interpreter
    thread A
    thread B
    shared sys.modules
    shared module globals

Subinterpreters separate interpreter state.

interpreter A
    thread A
    sys.modules A

interpreter B
    thread B
    sys.modules B

Ordinary threads inside one interpreter share imported modules and module globals.

Subinterpreters have separate module imports. Importing json in interpreter A and importing json in interpreter B creates separate module objects for each interpreter.
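The thread half of this comparison can be observed from ordinary Python. The following is a minimal sketch, assuming a hypothetical module named `shared_config` that is registered by hand so the example needs no extra files. It shows only that threads in one interpreter see the same module object; it does not create a subinterpreter.

```python
import sys
import threading
import types

# Hypothetical module, registered by hand so the example is self-contained.
shared = types.ModuleType("shared_config")
shared.value = 0
sys.modules["shared_config"] = shared


def worker():
    # Same interpreter, so this import returns the module object above.
    import shared_config
    shared_config.value = 10


thread = threading.Thread(target=worker)
thread.start()
thread.join()

import shared_config

# The write made by the worker thread is visible in the main thread.
seen_by_main = shared_config.value
```

A subinterpreter would instead look the name up in its own sys.modules, so the object built here would not be found there.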

48.5 Subinterpreters vs Processes

Processes are isolated by the operating system.

Subinterpreters are isolated by CPython inside one process.

Feature	Subinterpreters	Processes
Address space	Shared process address space	Separate address spaces
Python module state	Separate per interpreter	Separate per process
Crash isolation	Weak	Strong
Memory sharing	Possible but constrained	Explicit shared memory or IPC
Startup cost	Lower	Higher
Native extension risk	Shared process risk	Process-local risk
OS-level isolation	No	Yes

Subinterpreters are not a security boundary. Native code, process-global state, file descriptors, environment variables, and memory corruption can cross interpreter boundaries.

48.6 Separate `sys.modules`

Each interpreter has its own sys.modules.

Conceptually:

interpreter A:
    sys.modules["config"] -> module object A

interpreter B:
    sys.modules["config"] -> module object B

This means module globals are separate.

If config.py contains:

value = 0

then interpreter A can set:

config.value = 10

while interpreter B has its own config.value.

This separation is one of the main benefits of subinterpreters.
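The effect can be imitated inside a single interpreter by executing the same source into two fresh module objects. This is only a simulation under stated assumptions: `load_config_copy` is a hypothetical helper, and the two objects stand in for what interpreter A and interpreter B would each hold in their own sys.modules.

```python
import types

CONFIG_SOURCE = "value = 0\n"


def load_config_copy(name="config"):
    """Execute the same source into a brand-new module object."""
    module = types.ModuleType(name)
    exec(compile(CONFIG_SOURCE, name + ".py", "exec"), module.__dict__)
    return module


# Stand-ins for "config as seen by interpreter A" and "by interpreter B".
config_a = load_config_copy()
config_b = load_config_copy()

# Changing one copy leaves the other copy at its initial value.
config_a.value = 10
```

Each copy has its own `__dict__`, which is the property that per-interpreter module tables provide for real.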

48.7 Separate Builtins

Each interpreter has its own builtins module.

This matters because modifying builtins in one interpreter should not affect another interpreter.

Example concept:

# interpreter A
import builtins
builtins.custom_name = 123

Interpreter B should not see that custom_name in its own builtins.

This supports better isolation for embedded execution environments.
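On runtimes that expose subinterpreters to Python code, this can be checked directly. The sketch below assumes the `concurrent.interpreters` module from CPython 3.14 (PEP 734); on older versions the import fails and the hypothetical helper `builtins_leak_check` reports None instead of a result.

```python
import builtins

try:
    # Available in CPython 3.14+; absent on older versions.
    from concurrent import interpreters
except ImportError:
    interpreters = None


def builtins_leak_check():
    """Return True/False if the name leaked, or None if the API is missing."""
    if interpreters is None:
        return None
    interp = interpreters.create()
    try:
        # Runs inside the subinterpreter, against its own builtins module.
        interp.exec("import builtins\nbuiltins.custom_name = 123")
    finally:
        interp.close()
    # Checked in the calling interpreter, which has a different builtins.
    return hasattr(builtins, "custom_name")


leaked = builtins_leak_check()
```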

48.8 Separate Import State

Each interpreter has its own import machinery state.

This includes:

sys.modules
sys.path
sys.meta_path
sys.path_hooks
path importer cache
import-related locks and state

Two interpreters can have different import paths.

Interpreter A may import modules from one plugin directory.

Interpreter B may import modules from another.

This allows an embedding host to create independent import environments inside one process.

48.9 Shared Runtime Resources

Not everything is per-interpreter.

Some resources are process-global or runtime-global.

Examples include:

operating system process
file descriptors
environment variables
native library global state
some memory allocators
some runtime-wide caches
some static C data
loaded shared libraries

This is why subinterpreters provide runtime isolation, not full process isolation.

If a C extension uses a global static variable, that variable may be shared across interpreters unless the extension is designed for per-interpreter state.

48.10 Extension Module State

Extension modules are one of the hardest parts of subinterpreter isolation.

A Python module written in Python naturally gets a separate module object per interpreter.

A C extension may store state in process-global C variables:

static PyObject *global_cache;

That state is shared across interpreters.

This can break isolation.

Better extension design stores state per module object.

Modern extension modules can use multi-phase initialization and per-module state to avoid process-global mutable state.

Conceptually:

bad:
    one C global cache shared by all interpreters

better:
    interpreter A module object -> module state A
    interpreter B module object -> module state B

48.11 Single-Phase Extension Initialization

Older extension modules commonly use single-phase initialization.

The module initialization function creates and returns a module object in one step.

This style often encourages global state.

Simplified shape:

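/* Process-global: one dict for the whole process, not one per interpreter. */
static PyObject *cache;

PyMODINIT_FUNC
PyInit_example(void)
{
    cache = PyDict_New();
    if (cache == NULL) {
        return NULL;
    }
    /* moduledef is the module's PyModuleDef, defined elsewhere. */
    return PyModule_Create(&moduledef);
}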

This may work in a single main interpreter, but it can behave badly when imported in multiple interpreters.

Problems include:

shared mutable state
incorrect object ownership across interpreters
shutdown order bugs
reload bugs
cross-interpreter reference leaks

48.12 Multi-Phase Extension Initialization

Multi-phase initialization separates module creation from module execution.

It allows a C extension to allocate module-specific state and behave more like Python modules.

A simplified conceptual shape:

create module object
allocate per-module state
execute module initialization
store state on module object

Benefits:

better subinterpreter support
cleaner module reload behavior
less process-global mutable state
more explicit lifetime management

For subinterpreters, multi-phase initialization is usually the preferred design.

48.13 Cross-Interpreter Object Sharing

Ordinary Python objects generally cannot be freely shared between interpreters.

An object belongs to an interpreter context. It may reference interpreter-specific state such as:

type objects
module globals
interned strings
allocation state
weakrefs
finalizers
thread state assumptions

Sharing such an object directly with another interpreter can violate runtime invariants.

Safe cross-interpreter communication usually requires copying, serialization, or specially supported shareable objects.

48.14 Shareable Data

Some data can be safely transferred between interpreters because it is immutable or has special support.

Examples of conceptually shareable data include:

None
booleans
integers in supported paths
strings in supported paths
bytes in supported paths
channels or explicit communication objects
serialized messages

The exact supported set depends on the API and CPython version.

The important design rule is:

do not assume ordinary Python objects can cross interpreter boundaries

Use explicit communication mechanisms.

48.15 Communication Between Subinterpreters

Subinterpreters need communication mechanisms because they do not share ordinary module globals.

Possible designs include:

message passing
channels
serialized bytes
queues implemented by the host
shared memory with explicit synchronization
files or sockets
embedding host callbacks

The safest model is message passing.

interpreter A
    serialize message
    send message

interpreter B
    receive message
    deserialize message

This avoids direct object sharing and preserves isolation.
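The serialize, send, receive, deserialize flow can be sketched in plain Python. The example assumes a `queue.Queue` of bytes as a stand-in for whatever transport the host provides, and both sides run in one interpreter here. `send_message` and `receive_message` are hypothetical helper names.

```python
import json
import queue

# Assumption: a plain queue of bytes stands in for the host's transport.
transport = queue.Queue()


def send_message(payload):
    """Sender side: serialize so that no live object crosses the boundary."""
    transport.put(json.dumps(payload).encode("utf-8"))


def receive_message():
    """Receiver side: rebuild a brand-new local object from the bytes."""
    return json.loads(transport.get().decode("utf-8"))


original = {"task": "resize", "width": 640}
send_message(original)
received = receive_message()
```

The receiver ends up with an equal but distinct object, so neither side ever holds a reference into the other side's object graph.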

48.16 Channels

A channel is a communication primitive for passing data between interpreters.

Conceptually:

channel
    send(value)
    receive() -> value

A channel can enforce that only supported shareable values cross interpreter boundaries.

This gives a more controlled model than exposing arbitrary object references.

The design resembles process IPC more than ordinary thread sharing.
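A minimal sketch of such a primitive follows. The `Channel` class and its allow-list are hypothetical and only illustrate the enforcement idea; they are not the channel or queue implementation of any CPython release.

```python
import queue


class Channel:
    """Hypothetical channel that accepts only an allow-list of value types."""

    # Assumption: an illustrative allow-list, not CPython's actual rules.
    ALLOWED = (type(None), bool, int, str, bytes)

    def __init__(self):
        self._items = queue.Queue()

    def send(self, value):
        # Exact type match, so subclasses with extra state are rejected too.
        if type(value) not in self.ALLOWED:
            raise TypeError(f"{type(value).__name__} cannot cross the channel")
        self._items.put(value)

    def receive(self, timeout=None):
        return self._items.get(timeout=timeout)


channel = Channel()
channel.send("ready")
message = channel.receive(timeout=1)

try:
    channel.send(["a", "list"])
    rejected = False
except TypeError:
    rejected = True
```

Rejecting values at `send` time keeps the failure on the sender's side, where the offending object still has a valid owner.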

48.17 Subinterpreters and the GIL

Historically, CPython’s GIL was effectively process-wide for normal execution, so subinterpreters did not provide true parallel Python bytecode execution in the usual build.

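Modern CPython adds two newer designs: a per-interpreter GIL, available through the C API since CPython 3.12, and an optional free-threaded build, available since CPython 3.13.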

The distinction matters:

single global GIL:
    subinterpreters isolate state but do not run Python bytecode in parallel

per-interpreter GIL:
    each interpreter can have its own GIL
    different interpreters may execute Python bytecode concurrently

free-threaded build:
    Python bytecode can execute in parallel without a traditional GIL

Subinterpreters are part of the path toward better in-process concurrency, but they require extension modules and runtime state to be isolated correctly.

48.18 Per-Interpreter GIL

A per-interpreter GIL means each interpreter has its own lock.

Conceptually:

interpreter A
    GIL A

interpreter B
    GIL B

Thread A running in interpreter A can hold GIL A.

Thread B running in interpreter B can hold GIL B.

This can allow parallel Python execution across interpreters, assuming no unsafe shared runtime state blocks it.

But per-interpreter GIL only works correctly if extension modules avoid shared mutable C globals or explicitly protect them.

48.19 Subinterpreters and Free-Threaded CPython

Free-threaded CPython is a build configuration that runs without the traditional GIL.

Subinterpreters remain useful in such a runtime because they provide isolation boundaries, not only parallelism.

In a free-threaded runtime, hard problems include:

safe reference counting
container synchronization
object ownership
cross-interpreter object rules
extension module compatibility
garbage collector safety
memory allocator behavior

Subinterpreters and free-threading solve related but different problems.

Subinterpreters isolate execution contexts.

Free-threading changes synchronization inside those contexts.
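Whether a given runtime is a free-threaded build can be inspected from Python. This is a minimal sketch, assuming CPython 3.13 or later for `sys._is_gil_enabled()`; on older versions the hypothetical helper `describe_gil` falls back to assuming the GIL is on. It does not detect a per-interpreter GIL, which is configured when an interpreter is created through the C API.

```python
import sys
import sysconfig


def describe_gil():
    """Report whether this build is free-threaded and whether the GIL is on."""
    # Py_GIL_DISABLED is 1 in free-threaded builds and 0 or None otherwise.
    free_threaded_build = bool(sysconfig.get_config_var("Py_GIL_DISABLED"))
    # sys._is_gil_enabled() exists on CPython 3.13+; assume True without it.
    probe = getattr(sys, "_is_gil_enabled", None)
    gil_enabled = bool(probe()) if probe is not None else True
    return {
        "free_threaded_build": free_threaded_build,
        "gil_enabled": gil_enabled,
    }


info = describe_gil()
```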

48.20 Creating Subinterpreters From C

The classic subinterpreter API is a C API.

Conceptual operations:

create new interpreter
get new thread state
run code inside that interpreter
switch back to previous thread state
destroy interpreter

A simplified C-level shape:

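/* Remember the caller's thread state; the GIL must be held here. */
PyThreadState *main_tstate = PyThreadState_Get();

/* On success, the new interpreter's thread state becomes current. */
PyThreadState *sub_tstate = Py_NewInterpreter();
if (sub_tstate == NULL) {
    /* creation failed: do not run code or call Py_EndInterpreter */
}

/* execute code in subinterpreter */

/* sub_tstate must be current; afterwards the current thread state is NULL. */
Py_EndInterpreter(sub_tstate);

/* Make the original thread state current again. */
PyThreadState_Swap(main_tstate);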

Actual embedding code must handle errors, GIL state, thread state transitions, and shutdown carefully.

48.21 Switching Thread State

A native thread executing Python code has a current thread state.

To run code in a subinterpreter, native embedding code must switch to a thread state associated with that interpreter.

Conceptually:

current thread state -> interpreter A

switch

current thread state -> interpreter B

Using the wrong thread state with the wrong objects can corrupt runtime assumptions.

This is why subinterpreters are mostly an embedding and advanced runtime feature rather than a normal everyday Python API.

48.22 Running Code in a Subinterpreter

An embedding host may run source code in a subinterpreter.

Conceptually:

create interpreter
initialize sys.path
run source string or file
collect result or side effects
destroy interpreter

The code runs with that interpreter’s modules and globals.

A simple embedding model:

host application
    create interpreter for tenant A
    run tenant A script
    destroy interpreter

    create interpreter for tenant B
    run tenant B script
    destroy interpreter

This avoids reusing one global module state for all tenants.

Again, this is isolation for organization and runtime state, not security isolation.

48.23 Subinterpreters Are Not Sandboxes

Subinterpreters should not be treated as security sandboxes.

Reasons:

same process memory
same native extension address space
same file descriptors unless restricted by host
same operating system identity
native crashes affect whole process
process-global C state can leak
resource exhaustion affects whole process

Untrusted code should run in a separate process, container, virtual machine, or another security boundary.

Subinterpreters are useful for isolation inside trusted or semi-trusted runtime designs.

48.24 Module Globals in Subinterpreters

Module globals are interpreter-local when the module is loaded separately in each interpreter.

For Python source modules, this is natural:

interpreter A:
    module object A
    module.__dict__ A

interpreter B:
    module object B
    module.__dict__ B

This means module-level caches, registries, and configuration can differ per interpreter.

But if the module is backed by a C extension with global state, the apparent Python module separation may hide shared C state.

48.25 Built-in Modules

Built-in modules must be designed carefully for subinterpreters.

Some built-in modules have process-wide behavior.

Others maintain per-interpreter state.

Runtime modules such as sys must be per-interpreter because each interpreter needs its own module table, import path, and standard stream references.

A good mental model:

sys is interpreter-local
builtins is interpreter-local
some low-level runtime resources may be process-global
extension module state depends on implementation

48.26 Object Finalization Across Interpreters

Objects should generally be finalized in the interpreter where they belong.

Finalizers may execute Python code, access module globals, call weakref callbacks, or interact with interpreter state.

Cross-interpreter references make finalization difficult because an object owned by interpreter A might be destroyed when interpreter A shuts down while code in interpreter B still holds a pointer to it, leaving that pointer dangling.

This is another reason arbitrary object sharing across interpreters is restricted.

48.27 Garbage Collection

Each interpreter can have its own garbage collector state for interpreter-local objects.

The collector must traverse object graphs that belong to the interpreter.

Cross-interpreter references would complicate collection because a graph could span interpreter boundaries.

Therefore, keeping object graphs interpreter-local makes garbage collection tractable.

Communication through serialized or explicitly shareable data avoids cross-interpreter GC cycles.

48.28 Exceptions

Exceptions are objects too.

An exception raised in one interpreter cannot simply be thrown as the same object into another interpreter.

A host that communicates errors between interpreters should transfer structured error information:

exception type name
message
traceback text or structured frames
error code
serialized context

The receiving interpreter can reconstruct an appropriate local exception if needed.

This is similar to process boundary error handling.
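One way to carry the fields listed above is a plain dictionary that is serialized on one side and turned back into a local exception on the other. This is a sketch under assumptions: `RemoteError`, `error_to_record`, and `record_to_error` are hypothetical names, and JSON is just one possible neutral format.

```python
import json
import traceback


class RemoteError(Exception):
    """Hypothetical local exception rebuilt from a transferred record."""


def error_to_record(exc):
    """Flatten an exception into plain strings that can be serialized."""
    return {
        "type": type(exc).__name__,
        "message": str(exc),
        "traceback": "".join(
            traceback.format_exception(type(exc), exc, exc.__traceback__)
        ),
    }


def record_to_error(record):
    """Build a new local exception; the original object never crosses over."""
    return RemoteError(f"{record['type']}: {record['message']}")


try:
    int("not a number")
except ValueError as exc:
    # Only text leaves the failing side.
    wire = json.dumps(error_to_record(exc))

rebuilt = record_to_error(json.loads(wire))
```

The receiving side can raise `rebuilt` or log the transferred traceback text without touching any frame or object from the failing interpreter.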

48.29 Tracebacks

Tracebacks reference frames, code objects, globals, and local variables.

These are deeply interpreter-specific.

Passing tracebacks directly across interpreters is unsafe as a general model.

Instead, convert tracebacks to text or a structured neutral format:

import traceback

try:
    run_code()
except Exception:
    text = traceback.format_exc()

Then pass the string or structured representation.

48.30 Standard Streams

Each interpreter can have its own sys.stdout, sys.stderr, and sys.stdin references.

But the underlying file descriptors may be process-global.

Interpreter A can assign:

sys.stdout = custom_writer_a

Interpreter B can assign:

sys.stdout = custom_writer_b

At the Python level these are separate. At the operating system level, both may still write to the same process output unless redirected by the host.
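The gap between the Python-level reference and the descriptor underneath can be seen in a single interpreter. In this minimal sketch, `buffer_a` is a stand-in for `custom_writer_a`.

```python
import contextlib
import io
import sys

buffer_a = io.StringIO()  # stands in for custom_writer_a

with contextlib.redirect_stdout(buffer_a):
    # print() looks up sys.stdout, so the text lands in buffer_a.
    print("hello from A")
    replaced = sys.stdout is buffer_a

captured = buffer_a.getvalue()

# File descriptor 1 was never touched. Code that writes to it directly,
# such as C code or os.write(1, ...), bypasses the replaced sys.stdout.
```

A host that needs real output separation has to redirect at the descriptor level or give each interpreter a writer that the host controls.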

48.31 Environment Variables

Environment variables are process-global.

If one interpreter calls:

import os
os.environ["MODE"] = "test"

another interpreter in the same process may observe the changed process environment.

This is a key difference from process isolation.

Do not use subinterpreters when process-global mutation must be isolated.

48.32 File Descriptors and Sockets

File descriptors and sockets belong to the process.

Two interpreters can potentially access the same descriptor if references or descriptor numbers are shared.

This can cause interference.

For robust designs, the host should assign resources explicitly:

interpreter A gets descriptor A
interpreter B gets descriptor B
shared descriptors are coordinated by host locks or protocol

Subinterpreters do not automatically virtualize operating system resources.
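Coordinating a shared descriptor with a host lock might look like the sketch below. It assumes threads as stand-ins for interpreters, since both share the process's descriptors, and `write_record` is a hypothetical host helper.

```python
import os
import threading

read_fd, write_fd = os.pipe()
host_lock = threading.Lock()  # owned by the host, not by either writer


def write_record(owner, text):
    """Write one whole line to the shared descriptor under the host lock."""
    line = f"{owner}: {text}\n".encode("utf-8")
    with host_lock:
        os.write(write_fd, line)


workers = [
    threading.Thread(target=write_record, args=("A", "first")),
    threading.Thread(target=write_record, args=("B", "second")),
]
for worker in workers:
    worker.start()
for worker in workers:
    worker.join()
os.close(write_fd)

# Read everything back; arrival order depends on thread scheduling.
with os.fdopen(read_fd, "rb") as reader:
    lines = sorted(reader.read().decode("utf-8").splitlines())
```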

48.33 Signals

Signals are process-level events.

In CPython, Python-level signal handlers always run in the main thread of the main interpreter, whichever thread or interpreter was executing when the signal arrived.

A subinterpreter should not be treated as an independent process with its own independent signal universe.

If code requires isolated signal behavior, use processes.

48.34 Auditing and Monitoring

Subinterpreters can be useful for hosts that want separate execution contexts but centralized monitoring.

The host can track:

interpreter creation
interpreter destruction
code execution
resource assignment
message passing
failure reports
execution time

But enforcement must be explicit. CPython does not automatically impose CPU, memory, filesystem, or network limits per interpreter.
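The bookkeeping side of such monitoring can be kept in a small host-side registry. Everything in this sketch is hypothetical (`InterpreterRecord`, `HostMonitor`, the tenant name). It only records events and enforces no limits.

```python
import time
from dataclasses import dataclass, field


@dataclass
class InterpreterRecord:
    name: str
    created_at: float = field(default_factory=time.monotonic)
    runs: int = 0
    failures: int = 0
    total_seconds: float = 0.0


class HostMonitor:
    """Hypothetical registry the host updates around each interpreter event."""

    def __init__(self):
        self.records = {}

    def created(self, name):
        self.records[name] = InterpreterRecord(name)

    def run(self, name, func):
        """Time one unit of work and count failures instead of hiding them."""
        record = self.records[name]
        start = time.monotonic()
        try:
            return func()
        except Exception:
            record.failures += 1
            raise
        finally:
            record.runs += 1
            record.total_seconds += time.monotonic() - start

    def destroyed(self, name):
        return self.records.pop(name)


monitor = HostMonitor()
monitor.created("tenant-a")
result = monitor.run("tenant-a", lambda: 2 + 2)
try:
    monitor.run("tenant-a", lambda: 1 / 0)
except ZeroDivisionError:
    pass
final_record = monitor.destroyed("tenant-a")
```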

48.35 Subinterpreters in Embedded Applications

Embedding is one of the natural uses for subinterpreters.

A host application written in C or C++ may embed Python and run scripts.

Example uses:

game scripting
database stored procedures
application plugins
simulation systems
data processing plugins
automation runtimes

The host can create an interpreter per plugin or per task.

This helps isolate module globals and plugin imports.

The host must still manage native extension safety, resource ownership, and shutdown order.

48.36 Subinterpreters in Servers

A server might use subinterpreters to isolate tenants, apps, or plugins.

Possible architecture:

server process
    interpreter for app A
    interpreter for app B
    interpreter for app C

Benefits:

separate sys.modules
separate app globals
lower overhead than processes
possible in-process message passing

Risks:

one crash can kill all apps
native extension state may leak
process-global environment is shared
resource limits are hard
debugging is more complex

For strong multi-tenant isolation, processes are safer.

48.37 Interpreter Shutdown

Destroying a subinterpreter must clean up its modules, objects, thread states, and interpreter-specific resources.

Shutdown is difficult because:

objects may have finalizers
daemon-like activity may still exist
extension modules may hold state
threads may still refer to interpreter objects
module globals may be cleared
weakref callbacks may run

A robust embedding host should stop all activity in the subinterpreter before destroying it.

48.38 Threads Inside Subinterpreters

An interpreter can have thread states for threads executing inside it.

In a traditional CPython build, all threads in the process contend for a single GIL, whichever interpreter they run in.

With per-interpreter GIL, threads in different interpreters can potentially execute Python bytecode concurrently.

But a single interpreter still needs internal synchronization.

The model is:

interpreter A
    thread state A1
    thread state A2

interpreter B
    thread state B1

Each thread state belongs to one interpreter.

48.39 Moving Threads Between Interpreters

A native OS thread can switch between interpreter thread states in embedding scenarios, but this must be done carefully.

Python-level threads are normally created to run in a specific interpreter context.

Do not design ordinary Python code around moving a thread freely between interpreters.

The C API gives power here, but incorrect use can corrupt state or crash the process.

48.40 Subinterpreters and `atexit`

atexit handlers are tied to interpreter shutdown behavior.

A handler registered in one interpreter should be considered local to that interpreter’s lifecycle.

But if the handler touches process-global resources, it can still affect other interpreters.

Example risk:

import atexit
import os

atexit.register(lambda: os.environ.clear())

This would mutate process-global environment state during shutdown.

48.41 Subinterpreters and Logging

The logging module imported in separate interpreters has separate Python module state.

But logging handlers may write to shared process resources:

same file
same stderr
same socket
same external logging service

If two interpreters write to the same file handler or descriptor, coordination may be needed outside the module state.

The module is separate. The resource may not be.

48.42 Subinterpreters and Randomness

Python module state for random generators may be separate if each interpreter imports its own module.

But operating system randomness sources are shared process or system resources.

This distinction appears often:

Python-level state can be interpreter-local
OS-level state is outside interpreter isolation

The same principle applies to time, locale, environment, current working directory, and process ID.

48.43 Current Working Directory

The current working directory is process-global.

If one interpreter calls:

import os
os.chdir("/tmp")

it changes the working directory for the whole process.

Another interpreter using relative paths will observe the change.

This is one reason embedded hosts should prefer absolute paths and avoid allowing arbitrary chdir in subinterpreters.
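The process-wide effect is easy to observe. The sketch below assumes a thread as a stand-in for a second interpreter, since both share the process's working directory, and it restores the original directory afterwards.

```python
import os
import tempfile
import threading

original_cwd = os.getcwd()
observed = {}

with tempfile.TemporaryDirectory() as plugin_dir:

    def change_directory():
        # Affects the whole process, not just this thread.
        os.chdir(plugin_dir)

    thread = threading.Thread(target=change_directory)
    thread.start()
    thread.join()

    # The main thread now sees the directory chosen by the other thread.
    observed["cwd_changed_for_main"] = os.path.samefile(
        os.getcwd(), plugin_dir
    )

    # Restore before the temporary directory is removed.
    os.chdir(original_cwd)
```

Building every path from a fixed absolute base directory avoids depending on this shared state at all.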

48.44 Locale

The C locale set through setlocale() is a process-wide property, so it is not interpreter-local.

Code that changes locale may affect other interpreters.

For isolated locale behavior, use explicit locale-aware APIs or separate processes.

48.45 Memory Limits

Subinterpreters do not automatically provide separate memory limits.

A memory allocation in one interpreter consumes memory from the same process.

If interpreter A allocates a huge list, interpreter B can be affected because the process may run out of memory.

A host that needs memory isolation must implement monitoring or use processes.

48.46 Failure Isolation

If pure Python code raises an exception in one interpreter, the host can catch and report it.

If native code segfaults, the entire process usually crashes.

Subinterpreters do not protect against memory corruption from C extensions.

This is a major difference from multiprocessing.

Use processes when crash isolation matters.

48.47 Practical Design Rules

Use subinterpreters when you need:

separate module state
lower overhead than processes
embedding support
plugin isolation inside a trusted process
structured in-process execution contexts
possible future parallelism through per-interpreter GIL

Avoid subinterpreters when you need:

security sandboxing
crash isolation
hard memory limits
independent environment variables
independent current working directories
untrusted native extensions
simple operational debugging

Subinterpreters are a runtime isolation mechanism, not an operating system isolation mechanism.

48.48 C Extension Rules for Subinterpreter Safety

C extensions should:

avoid mutable process-global state
use multi-phase initialization
store state per module object
avoid cross-interpreter object references
avoid static borrowed object caches
clear module state correctly
handle repeated initialization and finalization
support per-interpreter GIL assumptions
protect native shared state explicitly

Extensions that assume one global interpreter are harder to use safely with subinterpreters.

48.49 Minimal Mental Model

Use this model:

A CPython process can contain multiple interpreters.

Each interpreter has its own sys.modules, builtins, import state, and thread states.

Ordinary Python module globals are separated per interpreter.

The operating system process is still shared.

C extension globals may still be shared.

Ordinary Python objects should not be freely shared across interpreters.

Communication should use explicit message passing or supported shareable objects.

Subinterpreters isolate runtime state, not security or crashes.

48.50 Key Points

A subinterpreter is a separate CPython interpreter inside the same process.

Each interpreter has its own module table, builtins, import state, and execution context.

Subinterpreters are lighter than processes but provide weaker isolation.

They are stronger than ordinary threads for module-global isolation.

They are not security sandboxes.

C extensions are the hardest part of subinterpreter correctness because process-global native state can leak across interpreters.

Communication should use explicit channels, serialization, or supported shareable data.

Per-interpreter GIL and free-threaded CPython make subinterpreters increasingly important for CPython’s concurrency model.
