69. Embedding Python

Py_Initialize, PyConfig, embedding CPython in a host process, and managing interpreter lifetime.

Embedding Python means starting and controlling the CPython runtime from a native application. Instead of Python importing a C extension, a C or C++ program initializes Python, executes Python code, exchanges objects with it, then shuts the interpreter down.

The direction is reversed:

extension module:
    Python process
        imports native code

embedding:
    native process
        starts Python runtime

Embedding is used when Python is the scripting layer inside a larger native program.

Common examples:

| Host application | Python role |
| --- | --- |
| Game engine | Gameplay scripting |
| Simulation system | User-defined models |
| Database engine | Stored procedures |
| CAD application | Automation scripts |
| Editor or IDE | Plugin runtime |
| Scientific application | User scripting shell |
| Server appliance | Configuration and policy logic |

69.1 What Embedding Provides

Embedding allows a native program to:

initialize CPython
configure import paths
create Python objects
execute Python source
import Python modules
call Python functions
expose native objects to Python
handle Python exceptions
manage interpreter shutdown

A minimal embedded interpreter looks like this:

#define PY_SSIZE_T_CLEAN
#include <Python.h>

int
main(int argc, char **argv)
{
    Py_Initialize();

    PyRun_SimpleString("print('hello from embedded Python')");

    Py_Finalize();
    return 0;
}

This starts the runtime, executes Python code, and shuts it down.

69.2 Embedding vs Extending

Embedding and extending use the same C API, but they solve different problems.

| Mode | Control flow |
| --- | --- |
| Extending | Python calls C |
| Embedding | C calls Python |

Extension module:

Python code -> C extension function -> native library

Embedded application:

native main() -> CPython runtime -> Python module or script

Large systems often use both. A game engine may embed Python as its scripting layer while also exposing engine APIs through extension types.

69.3 Runtime Initialization

The old simple API is:

Py_Initialize();

It initializes the CPython runtime with mostly default configuration.

Modern embedding (Python 3.8 and later) should prefer PyConfig for precise startup control.

Conceptually:

create config
set program name, paths, flags
initialize from config
run Python code
clear config

The modern API gives the host application control over:

module search paths
isolated mode
environment variable usage
site initialization
command-line arguments
filesystem encoding
standard streams
parser/runtime flags

This is important because embedded Python should usually behave predictably, independent of the user’s shell environment.

69.4 Modern Initialization with PyConfig

Example:

#define PY_SSIZE_T_CLEAN
#include <Python.h>

int
main(int argc, char **argv)
{
    PyStatus status;
    PyConfig config;

    PyConfig_InitPythonConfig(&config);

    status = PyConfig_SetBytesString(
        &config,
        &config.program_name,
        argv[0]
    );
    if (PyStatus_Exception(status)) {
        PyConfig_Clear(&config);
        Py_ExitStatusException(status);
    }

    status = Py_InitializeFromConfig(&config);
    if (PyStatus_Exception(status)) {
        PyConfig_Clear(&config);
        Py_ExitStatusException(status);
    }

    PyConfig_Clear(&config);

    PyRun_SimpleString("print('embedded runtime ready')");

    if (Py_FinalizeEx() < 0) {
        return 120;
    }

    return 0;
}

PyStatus is used because initialization can fail before normal Python exception machinery is fully available.

69.5 Isolated Mode

An embedded application often should not read user environment variables, user site packages, or ambient shell configuration.

Use isolated configuration when you want a controlled runtime:

PyConfig config;
PyConfig_InitIsolatedConfig(&config);

Isolated mode helps avoid surprises from:

PYTHONPATH
PYTHONHOME
user site-packages
current working directory effects
environment-specific startup files

This is useful for applications that ship their own Python runtime and library set.

69.6 Setting Program Arguments

Embedded Python can receive sys.argv.

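PyStatus status;
PyConfig config;
PyConfig_InitPythonConfig(&config);

/* Without this, CPython parses argv like `python` command-line options. */
config.parse_argv = 0;
status = PyWideStringList_Append(&config.argv, L"myapp");
if (!PyStatus_Exception(status)) {
    status = PyWideStringList_Append(&config.argv, L"--script-mode");
}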

After initialization:

import sys
print(sys.argv)

sees the configured values.

This matters when embedded scripts or imported libraries inspect command-line arguments.

69.7 Setting Import Paths

The import system needs a module search path.

For application-controlled embedding, set paths explicitly.

Conceptual example:

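PyStatus status;
PyConfig config;
PyConfig_InitIsolatedConfig(&config);

/* Use exactly the paths appended below instead of computed defaults. */
config.module_search_paths_set = 1;

status = PyWideStringList_Append(
    &config.module_search_paths,
    L"/opt/myapp/python"
);
if (PyStatus_Exception(status)) {
    goto fail;  /* PyConfig_Clear(&config), then report status */
}

status = PyWideStringList_Append(
    &config.module_search_paths,
    L"/opt/myapp/python/lib-dynload"
);
if (PyStatus_Exception(status)) {
    goto fail;
}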

This controls sys.path.

For embedded runtimes, import path management is one of the most common failure points.

Symptoms of bad path configuration:

cannot import encodings
cannot import site
cannot import application modules
extension modules not found
wrong Python standard library loaded

69.8 Running Python Source

The simplest execution helper is:

PyRun_SimpleString("print(1 + 2)");

It is convenient but coarse. It returns 0 on success and -1 if an exception was raised, and it prints the traceback to stderr, so the host can neither inspect the error nor retrieve a result.

For more control, compile and evaluate code manually.

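PyObject *code = NULL;
PyObject *globals = NULL;
PyObject *result = NULL;

/* Py_eval_input accepts a single expression; statements need Py_file_input. */
code = Py_CompileString("1 + 2", "<embedded>", Py_eval_input);
if (code == NULL) {
    goto error;
}
globals = PyDict_New();
if (globals == NULL) {
    goto error;
}
result = PyEval_EvalCode(code, globals, globals);
if (result == NULL) {
    goto error;
}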

Common input modes:

| Mode | Meaning |
| --- | --- |
| Py_eval_input | Single expression |
| Py_file_input | Module-like statement block |
| Py_single_input | Interactive statement |

For production embedding, prefer APIs that let you inspect exceptions and return values.

69.9 Executing a Script File

A host application may run a Python script from disk.

Simplified pattern:

FILE *fp = fopen("script.py", "r");
if (fp == NULL) {
    perror("script.py");
    return 1;
}

int rc = PyRun_SimpleFile(fp, "script.py");
fclose(fp);

if (rc != 0) {
    return 1;
}

This resembles running:

python script.py

inside the host process.

For better error reporting, compile file contents manually and inspect exceptions rather than relying only on PyRun_SimpleFile.

69.10 Importing a Module

Embedding code can import Python modules directly.

PyObject *module;

module = PyImport_ImportModule("math");
if (module == NULL) {
    PyErr_Print();
    return 1;
}

Py_DECREF(module);

For application scripts:

PyObject *module = PyImport_ImportModule("plugins.startup");

Import depends on sys.path, import hooks, extension module loading, and package layout. Bad startup configuration usually appears first as import failure.

69.11 Calling a Python Function

A host program can import a module, get a function, build arguments, and call it.

PyObject *module = NULL;
PyObject *func = NULL;
PyObject *args = NULL;
PyObject *result = NULL;

module = PyImport_ImportModule("plugin");
if (module == NULL) {
    goto error;
}

func = PyObject_GetAttrString(module, "run");
if (func == NULL) {
    goto error;
}

if (!PyCallable_Check(func)) {
    PyErr_SetString(PyExc_TypeError, "plugin.run is not callable");
    goto error;
}

args = PyTuple_Pack(1, PyUnicode_FromString("input.txt"));
if (args == NULL) {
    goto error;
}

result = PyObject_CallObject(func, args);
if (result == NULL) {
    goto error;
}

/* use result */

Py_DECREF(result);
Py_DECREF(args);
Py_DECREF(func);
Py_DECREF(module);
return 0;

error:
PyErr_Print();
Py_XDECREF(result);
Py_XDECREF(args);
Py_XDECREF(func);
Py_XDECREF(module);
return 1;

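The code above has a subtle leak: PyUnicode_FromString("input.txt") returns a new reference, and PyTuple_Pack takes its own reference to each item instead of stealing the caller's, so the string's original reference is never released. The result of PyUnicode_FromString is also passed on without a NULL check.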

Safer version:

PyObject *arg0 = NULL;
PyObject *args = NULL;

arg0 = PyUnicode_FromString("input.txt");
if (arg0 == NULL) {
    goto error;
}

args = PyTuple_Pack(1, arg0);
Py_DECREF(arg0);

if (args == NULL) {
    goto error;
}

Embedding code follows the same ownership rules as extension code.

69.12 Converting Python Results to C Values

After a call, convert the returned object.

long value = PyLong_AsLong(result);
if (value == -1 && PyErr_Occurred()) {
    goto error;
}

For strings:

PyObject *utf8 = PyUnicode_AsUTF8String(result);
if (utf8 == NULL) {
    goto error;
}

char *s = PyBytes_AsString(utf8);
if (s == NULL) {
    Py_DECREF(utf8);
    goto error;
}

/* use s before DECREF */

Py_DECREF(utf8);

The pointer returned by PyBytes_AsString is valid only while the bytes object remains alive.

69.13 Exposing Host Functions to Python

Embedding becomes much more useful when Python code can call back into the host application.

You can define a built-in module and append it to the inittab before initialization.

static PyObject *
host_log(PyObject *self, PyObject *args)
{
    const char *msg;

    if (!PyArg_ParseTuple(args, "s", &msg)) {
        return NULL;
    }

    fprintf(stderr, "[host] %s\n", msg);
    Py_RETURN_NONE;
}

static PyMethodDef HostMethods[] = {
    {"log", host_log, METH_VARARGS, "Write a host log message"},
    {NULL, NULL, 0, NULL}
};

static struct PyModuleDef HostModule = {
    PyModuleDef_HEAD_INIT,
    "host",
    "Host application API",
    -1,
    HostMethods
};

PyMODINIT_FUNC
PyInit_host(void)
{
    return PyModule_Create(&HostModule);
}

Register before Py_Initialize:

if (PyImport_AppendInittab("host", PyInit_host) < 0) {
    return 1;
}

Py_Initialize();

Python code can then call:

import host
host.log("hello from script")

This is the standard pattern for exposing application services into embedded Python.

69.14 Exposing Host Objects

For richer integration, define extension types that wrap native host objects.

Example:

C++ EngineObject * -> PyObject wrapper -> Python script sees EngineObject

The wrapper type handles:

method dispatch
lifetime
ownership
thread checks
attribute access
error translation

Important design question:

| Ownership model | Meaning |
| --- | --- |
| Python owns native object | Dealloc frees native resource |
| Host owns native object | Python wrapper borrows or references it |
| Shared ownership | Reference-counted native handle |
| Weak handle | Python wrapper validates object still exists |

Most embedding bugs come from unclear ownership between the host object model and the Python object model.

69.15 Handling Python Exceptions

If a Python call fails, the C API usually returns NULL and leaves exception state set.

For simple applications:

if (result == NULL) {
    PyErr_Print();
}

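For structured handling (on Python 3.12 and later, PyErr_GetRaisedException replaces the deprecated PyErr_Fetch and PyErr_NormalizeException pair):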

PyObject *type = NULL;
PyObject *value = NULL;
PyObject *traceback = NULL;

PyErr_Fetch(&type, &value, &traceback);
PyErr_NormalizeException(&type, &value, &traceback);

/* inspect, log, or convert */

Py_XDECREF(type);
Py_XDECREF(value);
Py_XDECREF(traceback);

A host application should usually convert Python exceptions into its own diagnostic system rather than printing directly to stderr.

69.16 Threading and the GIL

Most Python C API calls require the GIL.

If the host application calls Python from multiple native threads, each thread must acquire the GIL before interacting with Python objects.

Pattern:

PyGILState_STATE state;

state = PyGILState_Ensure();

/* call Python APIs */

PyGILState_Release(state);

This is the usual interface for calling into Python from native threads managed by the host application.

69.17 Long-Running Native Work

When Python calls into host functions that perform long native work, release the GIL around code that does not touch Python objects.

static PyObject *
host_compute(PyObject *self, PyObject *args)
{
    Py_BEGIN_ALLOW_THREADS

    run_long_native_compute();

    Py_END_ALLOW_THREADS

    Py_RETURN_NONE;
}

While the GIL is released, do not:

create Python objects
inspect Python objects
call Python functions
raise Python exceptions
touch reference counts

Only operate on native memory that is safe independently of Python object mutation.

69.18 Subinterpreters in Embedded Applications

An embedding application may create subinterpreters to isolate scripts.

Conceptually:

process
    runtime
        interpreter A
        interpreter B

Each interpreter has separate module dictionaries and runtime state, but they still share the same process and many native resources.

Subinterpreters are useful for isolation, but they are not process sandboxes.

Shared hazards:

native global variables
extension module static state
file descriptors
process environment
allocator state
C library globals
OS process privileges

If untrusted code must be isolated, use an operating system process boundary.

69.19 Finalization

Shutting down Python is harder than starting it.

Use:

int rc = Py_FinalizeEx();

instead of the older Py_Finalize() when you need an error status.

During finalization:

modules are torn down
object destructors may run
weakrefs may fire
atexit handlers may run
daemon threads may be stopped
imports may no longer work normally

Host applications should avoid calling arbitrary Python code late in shutdown unless the runtime is still known to be valid.

69.20 Reinitialization

Historically, code sometimes called:

Py_Initialize
Py_Finalize
Py_Initialize again

This pattern is fragile. Not all extension modules and process-global runtime state behave correctly after complete finalization and reinitialization.

Long-running host applications should usually initialize Python once and keep it alive until process shutdown.

Preferred model:

application start
    initialize Python once
application lifetime
    run scripts and plugins
application shutdown
    finalize Python once

69.21 Security Boundaries

Embedded Python is not a secure sandbox by default.

Python code can often access:

filesystem
network
process environment
loaded modules
native extension modules
introspection
resource exhaustion paths

Removing names from builtins or limiting sys.path improves policy control but does not create a complete sandbox.

For hostile or untrusted code, use:

separate processes
OS permissions
containers
seccomp or sandboxing primitives
resource limits
restricted filesystem views

Embedding is an integration mechanism, not a security boundary.

69.22 Packaging an Embedded Runtime

Shipping embedded Python requires bundling the correct runtime pieces.

Typical components:

Python shared library or linked runtime
standard library
encodings package
extension modules
application Python modules
native dependency libraries
site configuration
import path setup

The encodings package is especially important. If CPython cannot import encodings during startup, initialization fails early.

Packaging failures usually appear as startup or import errors, not compiler errors.

69.23 Embedding with C++

The CPython API is C. C++ applications can use it directly, but must respect C API rules.

Important concerns:

do not let C++ exceptions cross C API boundaries
wrap PyObject * in RAII carefully
release references deterministically
translate Python errors to C++ errors explicitly
manage GIL with RAII guards

Example RAII wrapper idea:

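class PyRef {
public:
    // Takes ownership of one strong (new) reference; p may be null.
    explicit PyRef(PyObject *p = nullptr) : p_(p) {}
    ~PyRef() { Py_XDECREF(p_); }

    // Copying would release the same reference twice, so allow moves only.
    PyRef(const PyRef &) = delete;
    PyRef &operator=(const PyRef &) = delete;
    PyRef(PyRef &&other) noexcept : p_(other.release()) {}

    PyObject *get() const { return p_; }
    // Gives up ownership; the caller becomes responsible for the reference.
    PyObject *release() {
        PyObject *p = p_;
        p_ = nullptr;
        return p;
    }

private:
    PyObject *p_;
};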

This can reduce leaks, but it must be designed around CPython ownership rules.

69.24 Common Embedding Bugs

| Bug | Cause |
| --- | --- |
| ModuleNotFoundError: encodings | Bad standard library path |
| Importing wrong modules | Uncontrolled sys.path |
| Crashes from native threads | Missing GIL acquisition |
| Leaked objects | Missing Py_DECREF |
| Shutdown crashes | Python called during finalization |
| Reinitialization failure | Extension global state not reset |
| Plugin state leaks | Static globals instead of module state |
| Security bypass | Treating embedding as sandboxing |
| Deadlocks | Host locks combined with GIL acquisition |
| Broken callbacks | Host object destroyed before Python wrapper |

Most embedding problems are lifecycle problems: startup, import configuration, ownership, threading, and shutdown.

69.25 Practical Design Guidelines

For robust embedding:

| Area | Guideline |
| --- | --- |
| Startup | Use PyConfig rather than ambient defaults |
| Paths | Set import paths explicitly |
| Environment | Use isolated mode when shipping an app runtime |
| Lifetime | Initialize once, finalize once |
| Threads | Acquire GIL before every Python API call |
| Host APIs | Expose a small built-in module |
| Objects | Define clear native ownership rules |
| Errors | Convert Python exceptions into host diagnostics |
| Security | Use process isolation for untrusted code |
| Shutdown | Avoid late Python calls during teardown |

69.26 Chapter Summary

Embedding Python places the CPython runtime inside a native host application. The host initializes the interpreter, configures paths and runtime options, imports modules, calls Python functions, exposes native services, handles exceptions, and shuts the runtime down.

Embedding uses the same C API as extension modules, but control flow is reversed. C calls Python instead of Python calling C.

A reliable embedded runtime requires careful startup configuration, explicit import paths, strict reference ownership, GIL management, clear host object lifetime, and conservative shutdown behavior.