57. `importlib`

importlib.machinery finders and loaders, importlib.util helpers, and the Python-level import bootstrap.

The importlib module exposes Python’s import system as ordinary Python APIs. It is both a library for importing modules programmatically and a reference implementation for much of the import machinery.

The import system is one of CPython’s central runtime subsystems. Every import statement passes through machinery that checks module caches, resolves names, searches import paths, selects finders, builds module specifications, invokes loaders, initializes module objects, and records the result in sys.modules.

57.1 The Role of importlib

importlib provides the programmable interface to imports.

Example:

import importlib

math = importlib.import_module("math")
print(math.sqrt(9))

This is roughly equivalent to:

import math

but the module name can be computed dynamically.
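As a minimal sketch of that dynamic case, the helper below imports a module whose name is assembled at run time. The helper name `load_by_name` is made up for this illustration and is not part of importlib.

```python
import importlib


def load_by_name(module_name, attribute):
    """Import module_name and return one attribute from it (illustrative helper)."""
    module = importlib.import_module(module_name)
    return getattr(module, attribute)


# The module name is built from strings instead of appearing in an import statement.
prefix, suffix = "ma", "th"
sqrt = load_by_name(prefix + suffix, "sqrt")
print(sqrt(9))
```

The same pattern is what plugin loaders typically build on: the name comes from configuration or discovery rather than from source code.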

Common uses include:

| Use case | Example |
| --- | --- |
| Dynamic plugin loading | Load modules by string name |
| Framework discovery | Import views, handlers, commands, models |
| Test tooling | Reload modified modules |
| Custom import systems | Add finders and loaders |
| Package metadata tools | Locate module specs and resources |
| Embedding | Initialize imports under controlled configuration |

The import system is not just file loading. It is a protocol.

57.2 Import Statement to Runtime Machinery

A statement such as:

import json

starts a multi-step runtime operation.

Conceptually:

import statement
__import__ builtin
importlib machinery
sys.modules cache check
sys.meta_path finders
module spec
loader
module object
sys.modules["json"]

importlib exposes several of these layers directly.

The high-level entry point is:

importlib.import_module(name, package=None)

The lower-level machinery is implemented across modules such as:

importlib
importlib.util
importlib.machinery
importlib.abc
importlib.resources

57.3 sys.modules

The first major import structure is sys.modules.

It is the module cache.

import sys
import json

print(sys.modules["json"])

Before loading a module, the import system checks whether it already exists in sys.modules.

Simplified:

if fullname in sys.modules:
    return sys.modules[fullname]

This cache has three major purposes.

First, it prevents duplicate module execution.

Second, it preserves module identity.

Third, it makes circular imports possible.

During import, CPython inserts a module into sys.modules before executing its body. If another module imports it during that execution, it receives the partially initialized module.

That behavior explains circular import errors such as:

cannot import name 'x' from partially initialized module

The module object exists, but its top-level code has not finished.
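The ordering described above can be observed from inside a module body. The sketch below writes a throwaway module (the name `partial_demo` is hypothetical) into a temporary directory and lets it inspect sys.modules while it is still executing.

```python
import importlib
import pathlib
import sys
import tempfile

# Source of a throwaway module that looks itself up in sys.modules while running.
SOURCE = (
    "import sys\n"
    "registered_early = __name__ in sys.modules\n"
    "saw_late_name = hasattr(sys.modules[__name__], 'late')\n"
    "late = 'defined at the end'\n"
)

with tempfile.TemporaryDirectory() as tmp:
    pathlib.Path(tmp, "partial_demo.py").write_text(SOURCE)
    sys.path.insert(0, tmp)
    try:
        importlib.invalidate_caches()
        partial_demo = importlib.import_module("partial_demo")
    finally:
        sys.path.remove(tmp)

print(partial_demo.registered_early)  # the module was cached before its body ran
print(partial_demo.saw_late_name)     # 'late' was not yet defined at that point
```

A second module importing `partial_demo` at that moment would see exactly this half-filled object, which is the situation behind the circular import error.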

57.4 Module Objects

An imported module is a normal Python object.

import types
import json

print(isinstance(json, types.ModuleType))
print(json.__dict__)

A module stores its global variables in __dict__.

Common module attributes include:

| Attribute | Meaning |
| --- | --- |
| __name__ | Fully qualified module name |
| __dict__ | Module global namespace |
| __package__ | Package context for relative imports |
| __spec__ | Module specification |
| __loader__ | Loader that loaded the module |
| __file__ | Source or binary path, when available |
| __cached__ | Bytecode cache path, when available |
| __path__ | Package search path, for packages |

Module execution means executing code with the module dictionary as the global namespace.

Conceptually:

module source code
compile to code object
execute code object with module.__dict__
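One way to see that the module dictionary really is the global namespace of the module's code is to compare it with the `__globals__` of a function defined in that module. This is a small sketch using json; the attribute name `demo_marker` is invented for the example.

```python
import json

# vars() on a module returns its __dict__, not a copy.
print(vars(json) is json.__dict__)

# Functions defined in the module use that same dictionary as their globals.
print(json.dumps.__globals__ is json.__dict__)

# Writing to the dictionary creates a module attribute (hypothetical name).
json.__dict__["demo_marker"] = "set through __dict__"
print(json.demo_marker)
del json.demo_marker  # leave the real module unchanged
```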

57.5 Module Specifications

The import system uses module specs to describe how a module should be loaded.

Specs are represented by importlib.machinery.ModuleSpec.

Example:

import importlib.util

spec = importlib.util.find_spec("json")

print(spec.name)
print(spec.loader)
print(spec.origin)
print(spec.submodule_search_locations)

A module spec contains:

| Field | Meaning |
| --- | --- |
| name | Fully qualified module name |
| loader | Loader object |
| origin | Source of the module |
| submodule_search_locations | Package search locations |
| cached | Bytecode cache path |
| parent | Parent package name |
| has_location | Whether the module has a filesystem location |

The spec separates discovery from execution.

finder
returns spec
loader uses spec
module is created and executed

This design lets the same import protocol support many sources:

source files
bytecode files
extension modules
built-in modules
frozen modules
zip archives
namespace packages
custom virtual modules
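To make the "one protocol, many sources" point concrete, the sketch below compares the specs of a package, a submodule, and a built-in module. The `describe` helper is an illustrative name, and the exact origin paths depend on the installation.

```python
import importlib.util


def describe(fullname):
    """Summarize a module spec as a plain dict (illustrative helper)."""
    spec = importlib.util.find_spec(fullname)
    return {
        "name": spec.name,
        "origin": spec.origin,
        "parent": spec.parent,
        # Only packages carry submodule search locations.
        "is_package": spec.submodule_search_locations is not None,
    }


for fullname in ("json", "json.decoder", "sys"):
    print(describe(fullname))
```

On a typical CPython build, `sys` reports the origin string "built-in" rather than a file path, while the two json entries point at files.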

57.6 Finders

A finder locates a module.

Finders live mainly on sys.meta_path.

import sys

for finder in sys.meta_path:
    print(finder)

Each meta path finder is expected to implement:

find_spec(fullname, path=None, target=None)

The import system asks each finder whether it can find the requested module.

Conceptually:

for finder in sys.meta_path:
    spec = finder.find_spec(fullname, path, target)
    if spec is not None:
        use spec

Common finder types include:

| Finder | Role |
| --- | --- |
| Built-in importer | Finds built-in modules |
| Frozen importer | Finds frozen modules |
| Path finder | Searches sys.path and package paths |
| Custom finder | User-provided import behavior |

Finders answer the question: “Can this module be found, and how should it be loaded?”
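The finder loop can be approximated in a few lines. This is only a sketch of the discovery step: the real machinery also consults sys.modules, takes import locks, and handles parent packages. The function name `first_spec` is made up here.

```python
import sys


def first_spec(fullname, path=None):
    """Return (finder, spec) from the first meta path finder that answers."""
    for finder in sys.meta_path:
        find_spec = getattr(finder, "find_spec", None)
        if find_spec is None:
            continue  # skip objects that do not implement the protocol
        spec = find_spec(fullname, path)
        if spec is not None:
            return finder, spec
    return None, None


finder, spec = first_spec("json")
print(finder)
print(spec.name, spec.origin)
```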

57.7 Loaders

A loader creates and executes a module.

Loader responsibilities may include:

create module object
read source or binary data
compile source to code object
execute code object
initialize extension module
set module attributes

Modern loaders generally implement:

create_module(spec)
exec_module(module)

Conceptually:

spec.loader.create_module(spec)
module object

spec.loader.exec_module(module)
initialized module

If create_module() returns None, the import machinery creates a default module object.

exec_module() performs the actual initialization.

For a source module, this usually means:

read .py file
compile source
execute code in module namespace

For an extension module, this means invoking native initialization code.
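The two loader steps can be driven by hand. The sketch below builds a second, private copy of the standard library module colorsys without registering it in sys.modules; colorsys is chosen only because it is small and has no import-time side effects.

```python
import importlib.util
import sys

spec = importlib.util.find_spec("colorsys")

# module_from_spec() calls spec.loader.create_module(spec) and falls back to a
# default module object when that returns None; import attributes are then set.
fresh = importlib.util.module_from_spec(spec)
empty_before_exec = not hasattr(fresh, "rgb_to_hsv")

# exec_module() runs the module body inside fresh.__dict__.
spec.loader.exec_module(fresh)

print(empty_before_exec)
print(fresh.rgb_to_hsv(1, 0, 0))
print(fresh is sys.modules.get("colorsys"))
```

Because the copy is never stored in sys.modules, a later `import colorsys` is unaffected by it.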

57.8 PathFinder

PathFinder is the main finder for ordinary filesystem imports.

It searches path entries from either:

sys.path

or a package’s:

__path__

Example:

import importlib.machinery

print(importlib.machinery.PathFinder)

Resolution differs by context.

For top-level import:

import json

the search path is:

sys.path

For submodule import:

import package.module

the search path is:

package.__path__

This is why packages control where their submodules are found.
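The difference between the two search paths can be shown by calling PathFinder directly. This is a sketch: normal imports reach PathFinder through sys.meta_path rather than by calling it by hand.

```python
import importlib.machinery
import json

PathFinder = importlib.machinery.PathFinder

# Top-level lookup: with no path argument, PathFinder searches sys.path.
top = PathFinder.find_spec("json")

# Submodule lookup: the parent package's __path__ is passed as the search path.
sub = PathFinder.find_spec("json.decoder", json.__path__)

print(top.name, top.origin)
print(sub.name, sub.origin)
```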

57.9 Path Hooks and Importer Cache

Filesystem path entries are not interpreted directly every time.

CPython uses:

sys.path_hooks
sys.path_importer_cache

sys.path_hooks contains callables that know how to build importers for path entries.

sys.path_importer_cache caches the result.

Conceptually:

path entry
path hook creates path entry finder
cache finder in sys.path_importer_cache
reuse finder for later imports

This supports path entries such as:

directories
zip files
custom virtual paths

The cache avoids rebuilding path entry finders repeatedly.
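The cache can be watched filling in. The sketch below adds a fresh temporary directory to sys.path, imports a throwaway module from it (the name `hook_demo` is hypothetical), and then looks the directory up in sys.path_importer_cache.

```python
import importlib
import importlib.machinery
import pathlib
import sys
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    pathlib.Path(tmp, "hook_demo.py").write_text("value = 1\n")
    sys.path.insert(0, tmp)
    try:
        cached_before = tmp in sys.path_importer_cache
        importlib.import_module("hook_demo")
        # A path hook has now built a finder for this entry and cached it.
        entry_finder = sys.path_importer_cache.get(tmp)
    finally:
        sys.path.remove(tmp)
        sys.path_importer_cache.pop(tmp, None)

print(cached_before)
print(type(entry_finder).__name__)
```

For an ordinary directory the cached object is expected to be a FileFinder; a zip archive on sys.path would get a different finder type from its own hook.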

57.10 Source File Loading

A normal .py import uses a source file loader.

Conceptual flow:

find module.py
create module spec
create module object
read source
compile source to code object
execute code object in module namespace
store module in sys.modules

Example file:

# config.py
host = "localhost"
port = 5432

After import:

import config

print(config.host)
print(config.port)

The top-level assignments ran during module execution and populated config.__dict__.

57.11 Bytecode Caches

CPython may cache compiled bytecode in __pycache__.

Example:

module.py
__pycache__/module.cpython-313.pyc

A .pyc file stores compiled code plus validation metadata.

The cache avoids recompiling unchanged source on later imports.

Important point: .pyc is a cache, not the semantic source of truth for ordinary source imports.

Conceptually:

if valid pyc exists:
    load code object from pyc
else:
    read source
    compile source
    write pyc if allowed

importlib contains the machinery for cache path calculation, validation, and bytecode loading.
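The cache path calculation is available through importlib.util. The sketch below uses a made-up relative path, `pkg/module.py`, which does not need to exist, and assumes the default layout (no `sys.pycache_prefix` configured).

```python
import importlib.util
import os
import sys

source_path = os.path.join("pkg", "module.py")  # hypothetical path, not read from disk

# Source path to bytecode cache path.
cache_path = importlib.util.cache_from_source(source_path)
print(cache_path)

# The cache tag in the file name identifies the interpreter version.
print(sys.implementation.cache_tag)

# Bytecode cache path back to the source path.
print(importlib.util.source_from_cache(cache_path))
```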

57.12 Packages

A package is a module that can contain submodules.

Traditionally, a package is a directory with:

package/
    __init__.py
    module.py

Importing the package executes __init__.py.

import package

Importing a submodule searches the package path:

import package.module

Package objects have:

package.__path__

This attribute tells the import system where to look for submodules.

Conceptually:

package.__path__
submodule search locations

57.13 Namespace Packages

Namespace packages allow one package name to span multiple directories.

They do not require an __init__.py.

Example:

dir1/acme/plugins/a.py
dir2/acme/plugins/b.py

If both dir1 and dir2 are on sys.path, acme.plugins can include both locations.

A namespace package’s __path__ contains multiple entries.

This feature is useful for plugin systems and separately distributed package portions.
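A small experiment reproduces the layout above in a temporary directory. The package name `acme_demo` is used instead of `acme` purely to avoid clashing with anything installed; everything else follows the example.

```python
import importlib
import pathlib
import sys
import tempfile

with tempfile.TemporaryDirectory() as root:
    roots = []
    for dirname, modname in (("dir1", "a"), ("dir2", "b")):
        # No __init__.py is written anywhere, so these are namespace portions.
        pkg_dir = pathlib.Path(root, dirname, "acme_demo", "plugins")
        pkg_dir.mkdir(parents=True)
        (pkg_dir / f"{modname}.py").write_text(f"NAME = {modname!r}\n")
        roots.append(str(pathlib.Path(root, dirname)))

    sys.path[:0] = roots
    try:
        importlib.invalidate_caches()
        mod_a = importlib.import_module("acme_demo.plugins.a")
        mod_b = importlib.import_module("acme_demo.plugins.b")
        portions = list(sys.modules["acme_demo.plugins"].__path__)
    finally:
        for entry in roots:
            sys.path.remove(entry)

print(mod_a.NAME, mod_b.NAME)
print(len(portions))
```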

57.14 Relative Imports

Relative imports depend on package context.

Example:

from . import util
from .models import User
from ..core import config

The import system uses __package__ and __spec__.parent to resolve these names.

A relative import cannot be resolved from just the text .models. The import system needs to know the current package.

Conceptually:

current package: app.views
relative import: .models
resolved name: app.views.models

This is why running package files directly can break relative imports. The module may lack correct package context.
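The name resolution step is exposed as importlib.util.resolve_name(), which makes the dependence on package context easy to check. The package names below are the illustrative ones from the text, and no such package needs to exist for the resolution itself.

```python
import importlib.util

# One leading dot: resolve inside the current package.
print(importlib.util.resolve_name(".models", "app.views"))

# Two leading dots: go up one package level first.
print(importlib.util.resolve_name("..core", "app.views"))

# Without a package there is nothing to resolve against.
try:
    importlib.util.resolve_name(".models", "")
except ImportError as exc:
    error = exc
    print(type(exc).__name__)
```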

57.15 importlib.import_module()

importlib.import_module() is the public dynamic import API.

import importlib

name = "json.decoder"
mod = importlib.import_module(name)

print(mod)

It handles dotted names and package-relative imports.

Example:

importlib.import_module(".decoder", package="json")

This resolves relative to json.

Unlike __import__(), import_module() returns the requested module rather than the top-level package.

mod = importlib.import_module("json.decoder")
print(mod.__name__)

Output:

json.decoder
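The contrast with `__import__()` is easy to check side by side:

```python
import importlib

# With an empty fromlist, __import__ returns the top-level package.
via_builtin = __import__("json.decoder")

# import_module returns the module that was actually named.
via_importlib = importlib.import_module("json.decoder")

print(via_builtin.__name__)
print(via_importlib.__name__)
print(via_builtin.decoder is via_importlib)
```

Both calls import the same submodule; they differ only in which object they hand back.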

57.16 Reloading Modules

importlib.reload(module) re-executes a module.

import importlib
import config

importlib.reload(config)

Reloading keeps the same module object but re-executes its code.

Conceptually:

existing module object
reuse module.__dict__
execute module code again

This can produce surprising behavior.

Old names may remain if the new source no longer defines them. Existing references held elsewhere still point to old objects.

Example:

import importlib
import config
from config import SETTINGS

importlib.reload(config)

The name SETTINGS in the importing module is not automatically rebound.

Reload is useful for development tools and REPL workflows, but it is rarely clean enough for production hot reload without careful design.
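Both surprises can be reproduced with a throwaway module. The sketch below uses a hypothetical module name, `reload_demo`, written to a temporary directory; the second version of the source deliberately has a different size so that any cached bytecode is treated as stale.

```python
import importlib
import pathlib
import sys
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    source = pathlib.Path(tmp, "reload_demo.py")
    source.write_text("SETTINGS = {'debug': False}\nLEGACY = 'old name'\n")
    sys.path.insert(0, tmp)
    try:
        importlib.invalidate_caches()
        reload_demo = importlib.import_module("reload_demo")
        stale = reload_demo.SETTINGS  # plays the role of "from ... import SETTINGS"

        # The new source drops LEGACY and rebinds SETTINGS.
        source.write_text("SETTINGS = {'debug': True, 'verbose': True}\n")
        same_object = importlib.reload(reload_demo) is reload_demo
    finally:
        sys.path.remove(tmp)

print(same_object)           # reload reuses the module object
print(reload_demo.SETTINGS)  # the module attribute sees the new value
print(stale)                 # the earlier reference still holds the old dict
print(reload_demo.LEGACY)    # a name removed from the source survives reload
```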

57.17 Custom Importers

The import system is extensible.

A custom finder and loader can load modules from unusual sources.

Minimal sketch:

import importlib.abc
import importlib.util
import sys

class MemoryLoader(importlib.abc.Loader):
    def create_module(self, spec):
        return None

    def exec_module(self, module):
        module.answer = 42

class MemoryFinder(importlib.abc.MetaPathFinder):
    def find_spec(self, fullname, path, target=None):
        if fullname == "virtual_config":
            return importlib.util.spec_from_loader(fullname, MemoryLoader())
        return None

sys.meta_path.insert(0, MemoryFinder())

import virtual_config
print(virtual_config.answer)

Output:

42

This shows that import does not require a file.

A module can come from memory, a database, a network service, a generated source string, or an embedded resource, provided a finder and loader implement the protocol.
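As a sketch of the "generated source string" case, the variant below serves modules from an in-memory dictionary and executes real source text. The names `SOURCES`, `StringLoader`, `StringFinder`, and `virtual_greeting` are all invented for this example.

```python
import importlib
import importlib.abc
import importlib.util
import sys

# Hypothetical in-memory "filesystem": module name -> source text.
SOURCES = {
    "virtual_greeting": "def greet(name):\n    return f'hello, {name}'\n",
}


class StringLoader(importlib.abc.Loader):
    def __init__(self, source):
        self.source = source

    def create_module(self, spec):
        return None  # accept the default module object

    def exec_module(self, module):
        # Compile the stored text and run it in the module's namespace.
        code = compile(self.source, f"<memory:{module.__name__}>", "exec")
        exec(code, module.__dict__)


class StringFinder(importlib.abc.MetaPathFinder):
    def find_spec(self, fullname, path, target=None):
        source = SOURCES.get(fullname)
        if source is None:
            return None  # let the other finders try
        return importlib.util.spec_from_loader(fullname, StringLoader(source))


finder = StringFinder()
sys.meta_path.insert(0, finder)
try:
    greeting = importlib.import_module("virtual_greeting")
    message = greeting.greet("importlib")
finally:
    sys.meta_path.remove(finder)

print(message)
```

Removing the finder afterwards keeps the experiment from affecting later imports; the already imported module stays in sys.modules.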

57.18 Import Locks

Imports are synchronized.

CPython uses per-module import locks to prevent multiple threads from initializing the same module at the same time.

Without locking, two threads could both observe a missing module, both create module objects, and both execute module code.

The practical guarantee is that, within one interpreter, a given module's initialization runs in only one thread at a time; other threads importing the same module wait until it finishes.

This matters for multi-threaded programs that import modules lazily.

Import-time side effects should still be minimized, because import execution can block other imports and create ordering hazards.
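The coordination can be observed with a deliberately slow virtual module. In this sketch, `SlowLoader`, `SlowFinder`, and `slow_virtual_module` are hypothetical names; several threads import the same module while its initialization is sleeping.

```python
import importlib
import importlib.abc
import importlib.util
import sys
import threading
import time

executions = []  # one entry per time the module body actually runs
results = []     # (module id, ready flag) as seen by each thread


class SlowLoader(importlib.abc.Loader):
    def create_module(self, spec):
        return None

    def exec_module(self, module):
        executions.append(threading.get_ident())
        time.sleep(0.2)  # simulate expensive import-time work
        module.ready = True


class SlowFinder(importlib.abc.MetaPathFinder):
    def find_spec(self, fullname, path, target=None):
        if fullname == "slow_virtual_module":
            return importlib.util.spec_from_loader(fullname, SlowLoader())
        return None


def worker():
    module = importlib.import_module("slow_virtual_module")
    results.append((id(module), module.ready))


finder = SlowFinder()
sys.meta_path.insert(0, finder)
try:
    threads = [threading.Thread(target=worker) for _ in range(4)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
finally:
    sys.meta_path.remove(finder)

print(len(executions))
print(len({module_id for module_id, _ in results}))
```

Every thread ends up with the same fully initialized module, but three of them spent the sleep blocked on the import, which is the cost the text warns about.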

57.19 Import-Time Execution

Importing a module executes its top-level code.

Example:

# app.py
print("loading app")

value = 42

Then:

import app

prints:

loading app

This is why module top-level code should usually define names rather than perform expensive or irreversible actions.

Prefer:

def main():
    ...

if __name__ == "__main__":
    main()

The import system is an execution system, not just a name lookup system.

57.20 Built-in and Frozen Modules

Not all modules come from files.

Built-in modules are compiled into CPython.

Frozen modules are stored as frozen code data inside the interpreter.

Examples:

import sys
import importlib.machinery

print(sys.builtin_module_names)
print(importlib.machinery.BuiltinImporter)
print(importlib.machinery.FrozenImporter)

Built-in and frozen importers are usually present on sys.meta_path.

They allow the interpreter to import critical modules before filesystem import is fully available.

This is important during startup.

57.21 Extension Modules

Native extension modules are shared libraries loaded by CPython.

Examples:

_module.cpython-313-x86_64-linux-gnu.so
_module.pyd

The import system finds the shared library, loads it, and calls its initialization function.

Conceptually:

find shared library
dynamic loader opens binary
CPython calls PyInit_modulename
module object returned or initialized

Extension module loading connects importlib to the C API and platform dynamic linking.
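The file name suffixes that the running interpreter accepts for extension modules are listed in importlib.machinery.EXTENSION_SUFFIXES. The helper `looks_like_extension` below is an illustrative name, and the printed suffixes vary by platform and build.

```python
import importlib.machinery

suffixes = importlib.machinery.EXTENSION_SUFFIXES

# Platform-specific: typically ABI-tagged .so names on Linux, .pyd on Windows.
for suffix in suffixes:
    print(suffix)


def looks_like_extension(filename):
    """Return True if filename ends with a recognized extension suffix."""
    return any(filename.endswith(suffix) for suffix in suffixes)


print(looks_like_extension("notes.txt"))
```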

57.22 importlib.util

importlib.util provides helper functions.

Common APIs include:

| API | Purpose |
| --- | --- |
| find_spec() | Find a module spec |
| module_from_spec() | Create module from spec |
| spec_from_file_location() | Build spec for a file |
| cache_from_source() | Compute .pyc path |
| source_from_cache() | Recover source path from cache path |

Manual file import example:

import importlib.util
import sys

path = "/tmp/plugin.py"
name = "plugin"

spec = importlib.util.spec_from_file_location(name, path)
module = importlib.util.module_from_spec(spec)

sys.modules[name] = module
spec.loader.exec_module(module)

This is the lower-level form of import. It gives the caller control over the module name, path, cache behavior, and registration.
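A slightly more defensive version of the same steps might look like the sketch below. The helper name `load_from_path` and the module name `demo_plugin` are assumptions for illustration; the temporary file stands in for a real plugin.

```python
import importlib.util
import pathlib
import sys
import tempfile


def load_from_path(name, path):
    """Load the file at path as module `name` (illustrative helper)."""
    spec = importlib.util.spec_from_file_location(name, path)
    if spec is None or spec.loader is None:
        raise ImportError(f"cannot build an import spec for {path!r}", name=name)
    module = importlib.util.module_from_spec(spec)
    sys.modules[name] = module  # register first, as the normal import path does
    try:
        spec.loader.exec_module(module)
    except BaseException:
        sys.modules.pop(name, None)  # do not leave a half-initialized module behind
        raise
    return module


with tempfile.TemporaryDirectory() as tmp:
    plugin_path = pathlib.Path(tmp, "plugin.py")
    plugin_path.write_text("name = 'demo plugin'\n")
    plugin = load_from_path("demo_plugin", str(plugin_path))

print(plugin.name)
```

Removing the entry from sys.modules on failure mirrors what the standard import machinery does when a module body raises.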

57.23 importlib.machinery

importlib.machinery exposes concrete importer machinery.

Important objects include:

| Object | Role |
| --- | --- |
| PathFinder | Main path-based finder |
| FileFinder | Finder for filesystem directories |
| SourceFileLoader | Loads .py files |
| SourcelessFileLoader | Loads .pyc files |
| ExtensionFileLoader | Loads native extension modules |
| BuiltinImporter | Loads built-in modules |
| FrozenImporter | Loads frozen modules |
| ModuleSpec | Import specification object |

This module is useful when building custom import behavior that still wants to reuse CPython’s standard components.

57.24 importlib.abc

importlib.abc defines abstract base classes for import protocols.

Important classes include:

| ABC | Meaning |
| --- | --- |
| MetaPathFinder | Finder on sys.meta_path |
| PathEntryFinder | Finder for one path entry |
| Loader | Base loader protocol |
| ResourceReader | Legacy resource reading protocol |
| InspectLoader | Loader that can inspect code |
| ExecutionLoader | Loader that can execute code |
| SourceLoader | Loader for source code |

These ABCs document expected methods and support structured custom importers.

57.25 Resources

importlib.resources provides access to package data.

Example:

from importlib import resources

text = resources.files("my_package").joinpath("data.txt").read_text()

This works for packages stored in places other than normal directories, such as zip files, as long as the loader supports the resource interface.

This avoids fragile code like:

open(os.path.join(os.path.dirname(__file__), "data.txt"))

Package resources should be accessed through import-aware APIs when possible.

57.26 Invalidating Caches

Finders may cache directory listings or module discovery information.

importlib.invalidate_caches() asks import finders to clear those caches.

import importlib

importlib.invalidate_caches()

This is useful when a program creates new modules on disk at runtime and then wants to import them.

Example:

write plugin.py
invalidate import caches
import plugin

Without invalidation, a path finder may not notice the new file immediately.
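The write, invalidate, import sequence can be sketched as follows. The module names `first_plugin` and `second_plugin` are hypothetical; the first import exists only so that the directory's finder already holds a cached listing when the second file is created.

```python
import importlib
import pathlib
import sys
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    sys.path.insert(0, tmp)
    try:
        pathlib.Path(tmp, "first_plugin.py").write_text("name = 'first'\n")
        first = importlib.import_module("first_plugin")

        # A new file appears after the finder has already listed the directory.
        pathlib.Path(tmp, "second_plugin.py").write_text("name = 'second'\n")
        importlib.invalidate_caches()
        second = importlib.import_module("second_plugin")
    finally:
        sys.path.remove(tmp)

print(first.name, second.name)
```

Whether the second import would fail without the invalidation depends on timing and filesystem timestamp resolution, which is why calling invalidate_caches() is the reliable choice.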

57.27 Common Import Failure Modes

Import failures often come from path, naming, or initialization problems.

| Symptom | Common cause |
| --- | --- |
| ModuleNotFoundError | No finder found a spec |
| ImportError | Module was found, but loading it or importing the requested name failed |
| Partially initialized module | Circular import |
| Relative import error | Missing package context |
| Wrong module imported | Unexpected earlier sys.path entry |
| Reload does not update references | Existing names still point to old objects |
| Package data missing | Files accessed outside import resource APIs |

A useful diagnostic sequence:

import importlib.util
import sys

print(sys.path)
print(importlib.util.find_spec("some_module"))

If find_spec() returns None, discovery failed. If it returns a spec but import fails, loading or execution failed.
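That two-stage check can be wrapped in a small helper. This is a sketch; `diagnose` and the missing module names are made up, and the returned strings are only one possible way to report the outcome.

```python
import importlib
import importlib.util


def diagnose(name):
    """Report whether discovery or loading is the failing stage for `name`."""
    try:
        spec = importlib.util.find_spec(name)
    except (ImportError, ValueError) as exc:
        # For dotted names, find_spec imports the parent package first.
        return f"discovery error: {exc!r}"
    if spec is None:
        return "not found: no finder returned a spec"
    try:
        importlib.import_module(name)
    except Exception as exc:
        return f"found at {spec.origin}, but loading failed: {exc!r}"
    return f"found and loaded from {spec.origin}"


print(diagnose("json"))
print(diagnose("no_such_module_for_this_demo"))
print(diagnose("no_such_pkg_for_this_demo.child"))
```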

57.28 Relationship to sys

The import system depends heavily on sys.

Important sys objects include:

| Object | Import role |
| --- | --- |
| sys.modules | Module cache |
| sys.path | Top-level search path |
| sys.meta_path | Meta path finders |
| sys.path_hooks | Path entry importer factories |
| sys.path_importer_cache | Cached path entry finders |
| sys.builtin_module_names | Built-in module names |

importlib provides machinery. sys stores global interpreter import state.

57.29 Relationship to Code Objects

For source imports, importlib ultimately creates and executes code objects.

read source
compile(source, filename, "exec")
code object
exec(code, module.__dict__)

This connects import machinery to the compiler and interpreter.

A source module import is equivalent in broad shape to:

import types
name, filename, source = "demo", "demo.py", "value = 42\n"  # illustrative inputs
module = types.ModuleType(name)
code = compile(source, filename, "exec")
exec(code, module.__dict__)

The real implementation includes specs, loaders, caches, locks, errors, packages, and edge cases.

57.30 Why importlib Matters for CPython Internals

importlib matters because imports are central to Python execution.

Almost every Python program starts by importing modules. CPython itself depends on imports during startup. Tooling, packaging, virtual environments, plugins, test runners, and application frameworks all rely on import behavior.

Understanding importlib explains:

why sys.modules exists
why circular imports happen
how packages find submodules
how .py files become module objects
how bytecode caches are used
how extension modules are initialized
how custom import systems work
why import-time side effects matter

It also shows CPython’s design preference for exposing internal protocols as Python-level machinery. Imports are not hard-coded as one filesystem algorithm. They are a layered protocol built from finders, loaders, specs, caches, and module objects.

57.31 Chapter Summary

The importlib module is the Python-level interface to CPython’s import system. It exposes dynamic imports, module specifications, finders, loaders, path-based search, bytecode caches, package resources, reload behavior, and custom importer protocols.

For CPython internals, importlib is important because it connects source files, module objects, code objects, sys.modules, sys.path, extension modules, built-in modules, packages, and interpreter startup into one coherent subsystem.