# 48. Subinterpreters

A subinterpreter is an interpreter instance inside the same CPython process. It has its own interpreter state, module table, builtins, import state, and execution context. It shares the same operating system process with other interpreters, but it is logically separated from them at the Python runtime level.

A normal Python program usually runs with one main interpreter:

```text
process
    main interpreter
        modules
        builtins
        sys
        thread states
        frames
```

With subinterpreters, the same process can contain more than one interpreter:

```text
process
    interpreter A
        modules
        builtins
        sys
        thread states

    interpreter B
        modules
        builtins
        sys
        thread states
```

Subinterpreters are an advanced CPython feature. They sit between threads and processes: lighter than separate processes, more isolated than ordinary threads, but still constrained by object sharing, extension module state, and runtime-global resources.

## 48.1 Interpreter State

At the C level, CPython keeps interpreter-specific state in an interpreter state object. A simplified conceptual model is:

```text
PyInterpreterState
    modules
    builtins
    import state
    codec state
    runtime configuration
    garbage collector state
    thread states
    pending calls
    audit hooks
    interpreter-specific caches
```

Each interpreter has one or more thread states.

```text
PyInterpreterState
    PyThreadState
        current frame
        exception state
        recursion depth
        tracing state
        context state
```

A thread state belongs to exactly one interpreter at a time. Python code executes through a thread state attached to an interpreter.

## 48.2 Main Interpreter

The main interpreter is created during CPython startup.

It initializes core runtime objects, builtins, `sys`, import machinery, standard streams, and the execution environment needed to run user code.

Most Python programs only use this interpreter.

```text
CPython startup
    initialize runtime
    create main interpreter
    initialize import system
    run startup configuration
    execute user script or module
```

Subinterpreters are additional interpreters created after the runtime has started.

## 48.3 Why Subinterpreters Exist

Subinterpreters provide isolation inside one process.

Useful goals include:

```text
run independent Python execution contexts
isolate module globals
avoid some shared-state interference
host plugins with separate module imports
support embedded Python use cases
reduce overhead compared with processes
enable future parallel execution models
```

An embedding application may want to run multiple independent Python scripts inside one process without giving each script the same module dictionary and global state.

A server may want isolated plugin environments.

A runtime may want lower-overhead concurrency than multiprocessing.

## 48.4 Subinterpreters vs Threads

Threads share one interpreter by default.

```text
one interpreter
    thread A
    thread B
    shared sys.modules
    shared module globals
```

Subinterpreters separate interpreter state.

```text
interpreter A
    thread A
    sys.modules A

interpreter B
    thread B
    sys.modules B
```

Ordinary threads inside one interpreter share imported modules and module globals.

Subinterpreters have separate module imports. Importing `json` in interpreter A and importing `json` in interpreter B creates separate module objects for each interpreter.

## 48.5 Subinterpreters vs Processes

Processes are isolated by the operating system.

Subinterpreters are isolated by CPython inside one process.

| Feature | Subinterpreters | Processes |
|---|---|---|
| Address space | Shared process address space | Separate address spaces |
| Python module state | Separate per interpreter | Separate per process |
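| Crash isolation | Weak: Python exceptions stay local, but a native crash takes down every interpreter | Strong: a crash is confined to one process |
| Memory sharing | Possible but constrained | Explicit shared memory or IPC |
| Startup cost | Lower | Higher |
| Native extension risk | A faulty extension can affect every interpreter | A faulty extension affects only its own process |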
| OS-level isolation | No | Yes |

Subinterpreters are not a security boundary. Native code, process-global state, file descriptors, environment variables, and memory corruption can cross interpreter boundaries.

## 48.6 Separate `sys.modules`

Each interpreter has its own `sys.modules`.

Conceptually:

```text
interpreter A:
    sys.modules["config"] -> module object A

interpreter B:
    sys.modules["config"] -> module object B
```

This means module globals are separate.

If `config.py` contains:

```python
value = 0
```

then interpreter A can set:

```python
config.value = 10
```

while interpreter B has its own `config.value`.

This separation is one of the main benefits of subinterpreters.
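
The effect can be imitated without real subinterpreters. The following is a minimal sketch, assuming two plain dictionaries stand in for each interpreter's `sys.modules`; `load_module`, `modules_a`, and `modules_b` are hypothetical names used only for illustration.

```python
import types

CONFIG_SOURCE = "value = 0\n"  # same contents as config.py above


def load_module(name, source, module_table):
    """Execute source in a fresh module object and register it in module_table."""
    module = types.ModuleType(name)
    exec(compile(source, f"<{name}>", "exec"), module.__dict__)
    module_table[name] = module
    return module


# Two dictionaries stand in for the sys.modules of interpreters A and B.
modules_a = {}
modules_b = {}

config_a = load_module("config", CONFIG_SOURCE, modules_a)
config_b = load_module("config", CONFIG_SOURCE, modules_b)

config_a.value = 10  # "interpreter A" changes only its own copy

print(config_a.value, config_b.value)
```

Because each table holds its own module object, the assignment through `config_a` never reaches `config_b`.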

## 48.7 Separate Builtins

Each interpreter has its own `builtins` module.

This matters because modifying builtins in one interpreter should not affect another interpreter.

Example concept:

```python
# interpreter A
import builtins
builtins.custom_name = 123
```

Interpreter B should not see that `custom_name` in its own builtins.

This supports better isolation for embedded execution environments.
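
Where the `concurrent.interpreters` module is available (it was added in Python 3.14), this behavior can be checked from Python. The sketch below is hedged: on older versions the import fails and the check is skipped, and `builtins_leak_check` is a hypothetical helper name.

```python
import builtins

try:
    from concurrent import interpreters  # available in Python 3.14+
except ImportError:
    interpreters = None


def builtins_leak_check():
    """Return True if a name set in a subinterpreter's builtins is visible here.

    Returns None when the interpreters module is not available.
    """
    if interpreters is None:
        return None
    interp = interpreters.create()
    try:
        # Runs inside the subinterpreter, against its own builtins module.
        interp.exec("import builtins\nbuiltins.custom_name = 123\n")
    finally:
        interp.close()
    # Inspect the builtins module of the calling interpreter.
    return hasattr(builtins, "custom_name")


leaked = builtins_leak_check()
print(leaked)
```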

## 48.8 Separate Import State

Each interpreter has import machinery state.

This includes:

```text
sys.modules
sys.path
sys.meta_path
sys.path_hooks
path importer cache
import-related locks and state
```

Two interpreters can have different import paths.

Interpreter A may import modules from one plugin directory.

Interpreter B may import modules from another.

This allows an embedding host to create independent import environments inside one process.

## 48.9 Shared Runtime Resources

Not everything is per-interpreter.

Some resources are process-global or runtime-global.

Examples include:

```text
operating system process
file descriptors
environment variables
native library global state
some memory allocators
some runtime-wide caches
some static C data
loaded shared libraries
```

This is why subinterpreters provide runtime isolation, not full process isolation.

If a C extension uses a global static variable, that variable may be shared across interpreters unless the extension is designed for per-interpreter state.

## 48.10 Extension Module State

Extension modules are one of the hardest parts of subinterpreter isolation.

A Python module written in Python naturally gets a separate module object per interpreter.

A C extension may store state in process-global C variables:

```c
static PyObject *global_cache;
```

That state is shared across interpreters.

This can break isolation.

Better extension design stores state per module object.

Modern extension modules can use multi-phase initialization and per-module state to avoid process-global mutable state.

Conceptually:

```text
bad:
    one C global cache shared by all interpreters

better:
    interpreter A module object -> module state A
    interpreter B module object -> module state B
```

## 48.11 Single-Phase Extension Initialization

Older extension modules commonly use single-phase initialization.

The module initialization function creates and returns a module object in one step.

This style often encourages global state.

Simplified shape:

```c
static PyObject *cache;

PyMODINIT_FUNC
PyInit_example(void)
{
    cache = PyDict_New();
    return PyModule_Create(&moduledef);
}
```

This may work in a single main interpreter, but it can behave badly when imported in multiple interpreters.

Problems include:

```text
shared mutable state
incorrect object ownership across interpreters
shutdown order bugs
reload bugs
cross-interpreter reference leaks
```

## 48.12 Multi-Phase Extension Initialization

Multi-phase initialization separates module creation from module execution.

It allows a C extension to allocate module-specific state and behave more like Python modules.

A simplified conceptual shape:

```text
create module object
allocate per-module state
execute module initialization
store state on module object
```

Benefits:

```text
better subinterpreter support
cleaner module reload behavior
less process-global mutable state
more explicit lifetime management
```

For subinterpreters, multi-phase initialization is usually the preferred design.

## 48.13 Cross-Interpreter Object Sharing

Ordinary Python objects generally cannot be freely shared between interpreters.

An object belongs to an interpreter context. It may reference interpreter-specific state such as:

```text
type objects
module globals
interned strings
allocation state
weakrefs
finalizers
thread state assumptions
```

Sharing such an object directly with another interpreter can violate runtime invariants.

Safe cross-interpreter communication usually requires copying, serialization, or specially supported shareable objects.

## 48.14 Shareable Data

Some data can be safely transferred between interpreters because it is immutable or has special support.

Examples of conceptually shareable data include:

```text
None
booleans
integers in supported paths
strings in supported paths
bytes in supported paths
channels or explicit communication objects
serialized messages
```

The exact supported set depends on the API and CPython version.

The important design rule is:

```text
do not assume ordinary Python objects can cross interpreter boundaries
```

Use explicit communication mechanisms.

## 48.15 Communication Between Subinterpreters

Subinterpreters need communication mechanisms because they do not share ordinary module globals.

Possible designs include:

```text
message passing
channels
serialized bytes
queues implemented by the host
shared memory with explicit synchronization
files or sockets
embedding host callbacks
```

The safest model is message passing.

```text
interpreter A
    serialize message
    send message

interpreter B
    receive message
    deserialize message
```

This avoids direct object sharing and preserves isolation.
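
The flow above can be sketched with standard-library pieces. This is an illustration, not a real subinterpreter setup: two threads play the roles of interpreters A and B, `queue.Queue` is an assumed stand-in for the transport, and only bytes cross the boundary.

```python
import json
import queue
import threading


def interpreter_a(outbox):
    """Sender: serialize a message to bytes before it leaves this side."""
    message = {"op": "add", "args": [1, 2]}
    outbox.put(json.dumps(message).encode("utf-8"))


def interpreter_b(inbox, results):
    """Receiver: rebuild local objects from the received bytes."""
    payload = inbox.get(timeout=5)
    message = json.loads(payload.decode("utf-8"))
    if message["op"] == "add":
        results.append(sum(message["args"]))


mailbox = queue.Queue()
results = []

receiver = threading.Thread(target=interpreter_b, args=(mailbox, results))
receiver.start()
interpreter_a(mailbox)
receiver.join()

print(results)
```

The receiver never sees the sender's dictionary, only a copy rebuilt from the serialized form.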

## 48.16 Channels

A channel is a communication primitive for passing data between interpreters.

Conceptually:

```text
channel
    send(value)
    receive() -> value
```

A channel can enforce that only supported shareable values cross interpreter boundaries.

This gives a more controlled model than exposing arbitrary object references.

The design resembles process IPC more than ordinary thread sharing.
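
A minimal sketch of that enforcement idea follows. The `Channel` class and the `SHAREABLE_TYPES` tuple are hypothetical; the set of types is an assumption made for this example and does not describe any particular CPython API.

```python
import queue

# Assumed set of shareable types for this sketch; the real set depends on
# the API and the CPython version.
SHAREABLE_TYPES = (type(None), bool, int, str, bytes)


class Channel:
    """Hypothetical channel that only accepts a fixed set of simple types."""

    def __init__(self):
        self._items = queue.Queue()

    def send(self, value):
        # Exact type check: subclasses may carry extra, non-shareable state.
        if type(value) not in SHAREABLE_TYPES:
            raise TypeError(f"{type(value).__name__} is not shareable")
        self._items.put(value)

    def receive(self, timeout=None):
        return self._items.get(timeout=timeout)


channel = Channel()
channel.send("hello")
channel.send(42)

try:
    channel.send(["a", "list"])  # a list is rejected at the boundary
    rejected = False
except TypeError:
    rejected = True

first = channel.receive(timeout=1)
second = channel.receive(timeout=1)
```

Rejecting unsupported values in `send` keeps the failure on the sending side, where the offending object still exists.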

## 48.17 Subinterpreters and the GIL

Historically, CPython’s GIL was effectively process-wide for normal execution, so subinterpreters did not provide true parallel Python bytecode execution in the usual build.

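Modern CPython adds two relevant designs: a per-interpreter GIL, available through the C API since CPython 3.12 (PEP 684), and a free-threaded build, introduced as experimental in CPython 3.13 (PEP 703).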

The distinction matters:

```text
single global GIL:
    subinterpreters isolate state but do not run Python bytecode in parallel

per-interpreter GIL:
    each interpreter can have its own GIL
    different interpreters may execute Python bytecode concurrently

free-threaded build:
    Python bytecode can execute in parallel without a traditional GIL
```

Subinterpreters are part of the path toward better in-process concurrency, but they require extension modules and runtime state to be isolated correctly.

## 48.18 Per-Interpreter GIL

A per-interpreter GIL means each interpreter has its own lock.

Conceptually:

```text
interpreter A
    GIL A

interpreter B
    GIL B
```

Thread A running in interpreter A can hold GIL A.

Thread B running in interpreter B can hold GIL B.

This can allow parallel Python execution across interpreters, assuming no unsafe shared runtime state blocks it.

But per-interpreter GIL only works correctly if extension modules avoid shared mutable C globals or explicitly protect them.

## 48.19 Subinterpreters and Free-Threaded CPython

Free-threaded CPython is a build configuration that removes the traditional GIL.

Subinterpreters remain useful in such a runtime because they provide isolation boundaries, not only parallelism.

In a free-threaded runtime, hard problems include:

```text
safe reference counting
container synchronization
object ownership
cross-interpreter object rules
extension module compatibility
garbage collector safety
memory allocator behavior
```

Subinterpreters and free-threading solve related but different problems.

Subinterpreters isolate execution contexts.

Free-threading changes synchronization inside those contexts.

## 48.20 Creating Subinterpreters From C

The classic subinterpreter API is a C API.

Conceptual operations:

```text
create new interpreter
get new thread state
run code inside that interpreter
switch back to previous thread state
destroy interpreter
```

A simplified C-level shape:

```c
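/* Remember the thread state that is current in this native thread. */
PyThreadState *main_tstate = PyThreadState_Get();

/* Returns NULL on failure; on success the new thread state becomes current. */
PyThreadState *sub_tstate = Py_NewInterpreter();

/* execute code in subinterpreter */

/* Requires sub_tstate to be current; afterwards no thread state is current. */
Py_EndInterpreter(sub_tstate);

/* Make the original thread state current again. */
PyThreadState_Swap(main_tstate);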
```

Actual embedding code must handle errors, GIL state, thread state transitions, and shutdown carefully.

## 48.21 Switching Thread State

A native thread executing Python code has a current thread state.

To run code in a subinterpreter, native embedding code must switch to a thread state associated with that interpreter.

Conceptually:

```text
current thread state -> interpreter A

switch

current thread state -> interpreter B
```

Using the wrong thread state with the wrong objects can corrupt runtime assumptions.

This is why subinterpreters are mostly an embedding and advanced runtime feature rather than a normal everyday Python API.

## 48.22 Running Code in a Subinterpreter

An embedding host may run source code in a subinterpreter.

Conceptually:

```text
create interpreter
initialize sys.path
run source string or file
collect result or side effects
destroy interpreter
```

The code runs with that interpreter’s modules and globals.

A simple embedding model:

```text
host application
    create interpreter for tenant A
    run tenant A script
    destroy interpreter

    create interpreter for tenant B
    run tenant B script
    destroy interpreter
```

This avoids reusing one global module state for all tenants.

Again, this is isolation for organization and runtime state, not security isolation.

## 48.23 Subinterpreters Are Not Sandboxes

Subinterpreters should not be treated as security sandboxes.

Reasons:

```text
same process memory
same native extension address space
same file descriptors unless restricted by host
same operating system identity
native crashes affect whole process
process-global C state can leak
resource exhaustion affects whole process
```

Untrusted code should run in a separate process, container, virtual machine, or another security boundary.

Subinterpreters are useful for isolation inside trusted or semi-trusted runtime designs.

## 48.24 Module Globals in Subinterpreters

Module globals are interpreter-local when the module is loaded separately in each interpreter.

For Python source modules, this is natural:

```text
interpreter A:
    module object A
    module.__dict__ A

interpreter B:
    module object B
    module.__dict__ B
```

This means module-level caches, registries, and configuration can differ per interpreter.

But if the module is backed by a C extension with global state, the apparent Python module separation may hide shared C state.

## 48.25 Built-in Modules

Built-in modules must be designed carefully for subinterpreters.

Some built-in modules have process-wide behavior.

Others maintain per-interpreter state.

Runtime modules such as `sys` must be per-interpreter because each interpreter needs its own module table, import path, and standard stream references.

A good mental model:

```text
sys is interpreter-local
builtins is interpreter-local
some low-level runtime resources may be process-global
extension module state depends on implementation
```

## 48.26 Object Finalization Across Interpreters

Objects should generally be finalized in the interpreter where they belong.

Finalizers may execute Python code, access module globals, call weakref callbacks, or interact with interpreter state.

Cross-interpreter references make finalization difficult: an object owned by interpreter A might be destroyed while A is shutting down, leaving code in interpreter B holding a dangling pointer to it.

This is another reason arbitrary object sharing across interpreters is restricted.

## 48.27 Garbage Collection

Each interpreter can have its own garbage collector state for interpreter-local objects.

The collector must traverse object graphs that belong to the interpreter.

Cross-interpreter references would complicate collection because a graph could span interpreter boundaries.

Therefore, keeping object graphs interpreter-local makes garbage collection tractable.

Communication through serialized or explicitly shareable data avoids cross-interpreter GC cycles.

## 48.28 Exceptions

Exceptions are objects too.

An exception raised in one interpreter cannot simply be thrown as the same object into another interpreter.

A host that communicates errors between interpreters should transfer structured error information:

```text
exception type name
message
traceback text or structured frames
error code
serialized context
```

The receiving interpreter can reconstruct an appropriate local exception if needed.

This is similar to process boundary error handling.
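
One way to sketch this: flatten the exception to strings on the sending side and raise a fresh local exception on the receiving side. `ErrorReport`, `RemoteError`, `capture`, and `rebuild` are hypothetical names for this illustration, and JSON is an assumed wire format.

```python
import json
import traceback
from dataclasses import asdict, dataclass


@dataclass
class ErrorReport:
    """Hypothetical neutral error record that contains only strings."""
    type_name: str
    message: str
    traceback_text: str


class RemoteError(RuntimeError):
    """Hypothetical local exception raised on the receiving side."""


def capture(exc):
    """Sender side: flatten an exception into a JSON string."""
    text = "".join(traceback.format_exception(type(exc), exc, exc.__traceback__))
    return json.dumps(asdict(ErrorReport(type(exc).__name__, str(exc), text)))


def rebuild(payload):
    """Receiver side: build a new local exception from the JSON string."""
    report = ErrorReport(**json.loads(payload))
    return RemoteError(f"{report.type_name}: {report.message}")


try:
    int("not a number")
except ValueError as exc:
    payload = capture(exc)

local_error = rebuild(payload)
```

The rebuilt exception shares no objects with the original; it only carries the original type name and message as text.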

## 48.29 Tracebacks

Tracebacks reference frames, code objects, globals, and local variables.

These are deeply interpreter-specific.

Passing tracebacks directly across interpreters is unsafe as a general model.

Instead, convert tracebacks to text or a structured neutral format:

```python
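import traceback

def run_code():
    # Placeholder for the code executed in this interpreter.
    raise ValueError("example failure")

try:
    run_code()
except Exception:
    # Plain text is safe to pass to another interpreter.
    text = traceback.format_exc()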
```

Then pass the string or structured representation.

## 48.30 Standard Streams

Each interpreter can have its own `sys.stdout`, `sys.stderr`, and `sys.stdin` references.

But the underlying file descriptors may be process-global.

Interpreter A can assign:

```python
sys.stdout = custom_writer_a
```

Interpreter B can assign:

```python
sys.stdout = custom_writer_b
```

At the Python level these are separate. At the operating system level, both may still write to the same process output unless redirected by the host.

## 48.31 Environment Variables

Environment variables are process-global.

If one interpreter calls:

```python
import os
os.environ["MODE"] = "test"
```

another interpreter in the same process may observe the changed process environment.

This is a key difference from process isolation.

Do not use subinterpreters when process-global mutation must be isolated.

## 48.32 File Descriptors and Sockets

File descriptors and sockets belong to the process.

Two interpreters can potentially access the same descriptor if references or descriptor numbers are shared.

This can cause interference.

For robust designs, the host should assign resources explicitly:

```text
interpreter A gets descriptor A
interpreter B gets descriptor B
shared descriptors are coordinated by host locks or protocol
```

Subinterpreters do not automatically virtualize operating system resources.

## 48.33 Signals

Signals are process-level events.

In CPython, Python-level signal handlers always run in the main thread of the main interpreter.

A subinterpreter should not be treated as an independent process with its own independent signal universe.

If code requires isolated signal behavior, use processes.

## 48.34 Auditing and Monitoring

Subinterpreters can be useful for hosts that want separate execution contexts but centralized monitoring.

The host can track:

```text
interpreter creation
interpreter destruction
code execution
resource assignment
message passing
failure reports
execution time
```

But enforcement must be explicit. CPython does not automatically impose CPU, memory, filesystem, or network limits per interpreter.

## 48.35 Subinterpreters in Embedded Applications

Embedding is one of the natural uses for subinterpreters.

A host application written in C or C++ may embed Python and run scripts.

Example uses:

```text
game scripting
database stored procedures
application plugins
simulation systems
data processing plugins
automation runtimes
```

The host can create an interpreter per plugin or per task.

This helps isolate module globals and plugin imports.

The host must still manage native extension safety, resource ownership, and shutdown order.

## 48.36 Subinterpreters in Servers

A server might use subinterpreters to isolate tenants, apps, or plugins.

Possible architecture:

```text
server process
    interpreter for app A
    interpreter for app B
    interpreter for app C
```

Benefits:

```text
separate sys.modules
separate app globals
lower overhead than processes
possible in-process message passing
```

Risks:

```text
one crash can kill all apps
native extension state may leak
process-global environment is shared
resource limits are hard
debugging is more complex
```

For strong multi-tenant isolation, processes are safer.

## 48.37 Interpreter Shutdown

Destroying a subinterpreter must clean up its modules, objects, thread states, and interpreter-specific resources.

Shutdown is difficult because:

```text
objects may have finalizers
daemon-like activity may still exist
extension modules may hold state
threads may still refer to interpreter objects
module globals may be cleared
weakref callbacks may run
```

A robust embedding host should stop all activity in the subinterpreter before destroying it.
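
A host-side sketch of that ordering is shown below. `InterpreterSlot` is a hypothetical bookkeeping class; the worker threads only poll a stop flag, and the actual interpreter teardown call is left as a comment because it depends on the embedding API in use.

```python
import threading


class InterpreterSlot:
    """Hypothetical host-side record for one subinterpreter and its workers."""

    def __init__(self):
        self.stop_requested = threading.Event()
        self.workers = []
        self.destroyed = False

    def start_worker(self):
        worker = threading.Thread(target=self._run)
        self.workers.append(worker)
        worker.start()

    def _run(self):
        # Stand-in for work done inside the interpreter; polls the stop flag.
        while not self.stop_requested.wait(timeout=0.01):
            pass

    def shutdown(self, timeout=5.0):
        """Stop all workers first; only then mark the interpreter destroyed."""
        self.stop_requested.set()
        for worker in self.workers:
            worker.join(timeout)
        if any(worker.is_alive() for worker in self.workers):
            raise RuntimeError("workers still running; refusing to destroy")
        # A real host would call its interpreter teardown here.
        self.destroyed = True


slot = InterpreterSlot()
slot.start_worker()
slot.start_worker()
slot.shutdown()
```

Refusing to tear down while a worker is still alive turns a potential crash into an ordinary, reportable error.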

## 48.38 Threads Inside Subinterpreters

An interpreter can have thread states for threads executing inside it.

In a build with a single process-wide GIL, threads in every interpreter contend for that one lock.

With per-interpreter GIL, threads in different interpreters can potentially execute Python bytecode concurrently.

But a single interpreter still needs internal synchronization.

The model is:

```text
interpreter A
    thread state A1
    thread state A2

interpreter B
    thread state B1
```

Each thread state belongs to one interpreter.

## 48.39 Moving Threads Between Interpreters

A native OS thread can switch between interpreter thread states in embedding scenarios, but this must be done carefully.

Python-level threads are normally created to run in a specific interpreter context.

Do not design ordinary Python code around moving a thread freely between interpreters.

The C API gives power here, but incorrect use can corrupt state or crash the process.

## 48.40 Subinterpreters and `atexit`

In CPython, functions registered with `atexit` are local to the interpreter in which they were registered.

Treat such a handler as part of that interpreter’s lifecycle, not the lifecycle of the whole process.

But if the handler touches process-global resources, it can still affect other interpreters.

Example risk:

```python
import atexit
import os

atexit.register(lambda: os.environ.clear())
```

This would mutate process-global environment state during shutdown.

## 48.41 Subinterpreters and Logging

The `logging` module imported in separate interpreters has separate Python module state.

But logging handlers may write to shared process resources:

```text
same file
same stderr
same socket
same external logging service
```

If two interpreters write to the same file handler or descriptor, coordination may be needed outside the module state.

The module is separate. The resource may not be.

## 48.42 Subinterpreters and Randomness

Python module state for random generators may be separate if each interpreter imports its own module.

But operating system randomness sources are shared process or system resources.

This distinction appears often:

```text
Python-level state can be interpreter-local
OS-level state is outside interpreter isolation
```

The same principle applies to time, locale, environment, current working directory, and process ID.

## 48.43 Current Working Directory

The current working directory is process-global.

If one interpreter calls:

```python
import os
os.chdir("/tmp")
```

it changes the working directory for the whole process.

Another interpreter using relative paths will observe the change.

This is one reason embedded hosts should prefer absolute paths and avoid allowing arbitrary `chdir` in subinterpreters.
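
A small sketch of the absolute-path approach follows. `resolve_under` is a hypothetical helper, and the temporary directory is only a stand-in for a per-interpreter plugin directory; the helper assumes the base directory is given as an absolute path.

```python
import tempfile
from pathlib import Path


def resolve_under(base_dir, relative_name):
    """Return an absolute path for relative_name inside base_dir.

    With an absolute base_dir, the result does not depend on the
    process-wide current working directory.
    """
    base = Path(base_dir).resolve()
    candidate = (base / relative_name).resolve()
    # Reject names such as "../x" that would escape the base directory.
    if base != candidate and base not in candidate.parents:
        raise ValueError(f"{relative_name!r} escapes {base}")
    return candidate


plugin_root = tempfile.gettempdir()  # stand-in for a per-interpreter directory
data_path = resolve_under(plugin_root, "data/input.txt")

try:
    resolve_under(plugin_root, "../outside.txt")
    escaped = True
except ValueError:
    escaped = False
```

Code that always goes through such a helper keeps working even if another interpreter calls `os.chdir`.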

## 48.44 Locale

The C library locale is process-global, not interpreter-local.

Code that changes it, for example through `locale.setlocale()`, may affect every other interpreter in the process.

For isolated locale behavior, use explicit locale-aware APIs or separate processes.

## 48.45 Memory Limits

Subinterpreters do not automatically provide separate memory limits.

A memory allocation in one interpreter consumes memory from the same process.

If interpreter A allocates a huge list, interpreter B can be affected because the process may run out of memory.

A host that needs memory isolation must implement monitoring or use processes.

## 48.46 Failure Isolation

If pure Python code raises an exception in one interpreter, the host can catch and report it.

If native code segfaults, the entire process usually crashes.

Subinterpreters do not protect against memory corruption from C extensions.

This is a major difference from multiprocessing.

Use processes when crash isolation matters.

## 48.47 Practical Design Rules

Use subinterpreters when you need:

```text
separate module state
lower overhead than processes
embedding support
plugin isolation inside trusted process
structured in-process execution contexts
possible future parallelism through per-interpreter GIL
```

Avoid subinterpreters when you need:

```text
security sandboxing
crash isolation
hard memory limits
independent environment variables
independent current working directories
untrusted native extensions
simple operational debugging
```

Subinterpreters are a runtime isolation mechanism, not an operating system isolation mechanism.

## 48.48 C Extension Rules for Subinterpreter Safety

C extensions should:

```text
avoid mutable process-global state
use multi-phase initialization
store state per module object
avoid cross-interpreter object references
avoid static borrowed object caches
clear module state correctly
handle repeated initialization and finalization
support per-interpreter GIL assumptions
protect native shared state explicitly
```

Extensions that assume one global interpreter are harder to use safely with subinterpreters.

## 48.49 Minimal Mental Model

Use this model:

```text
A CPython process can contain multiple interpreters.

Each interpreter has its own sys.modules, builtins, import state, and thread states.

Ordinary Python module globals are separated per interpreter.

The operating system process is still shared.

C extension globals may still be shared.

Ordinary Python objects should not be freely shared across interpreters.

Communication should use explicit message passing or supported shareable objects.

Subinterpreters isolate runtime state, not security or crashes.
```

## 48.50 Key Points

A subinterpreter is a separate CPython interpreter inside the same process.

Each interpreter has its own module table, builtins, import state, and execution context.

Subinterpreters are lighter than processes but provide weaker isolation.

They are stronger than ordinary threads for module-global isolation.

They are not security sandboxes.

C extensions are the hardest part of subinterpreter correctness because process-global native state can leak across interpreters.

Communication should use explicit channels, serialization, or supported shareable data.

Per-interpreter GIL and free-threaded CPython make subinterpreters increasingly important for CPython’s concurrency model.
