importlib.machinery finders and loaders, importlib.util helpers, and the Python-level import bootstrap.
The importlib module exposes Python’s import system as ordinary Python APIs. It is both a library for importing modules programmatically and a reference implementation for much of the import machinery.
The import system is one of CPython’s central runtime subsystems. Every import statement passes through machinery that checks module caches, resolves names, searches import paths, selects finders, builds module specifications, invokes loaders, initializes module objects, and records the result in sys.modules.
57.1 The Role of importlib
importlib provides the programmable interface to imports.
Example:
import importlib
math = importlib.import_module("math")
print(math.sqrt(9))
This is roughly equivalent to:
import math
but the module name can be computed dynamically.
Common uses include:
| Use case | Example |
|---|---|
| Dynamic plugin loading | Load modules by string name |
| Framework discovery | Import views, handlers, commands, models |
| Test tooling | Reload modified modules |
| Custom import systems | Add finders and loaders |
| Package metadata tools | Locate module specs and resources |
| Embedding | Initialize imports under controlled configuration |
The import system is not just file loading. It is a protocol.
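As a minimal sketch of the "load by string name" use case in the table above, the helper below resolves a dotted path to an attribute. The name `load_attr` and the `module.attr` path convention are illustrative assumptions, not part of importlib.

```python
import importlib


def load_attr(dotted_path):
    """Import 'package.module.attr' and return the attribute (hypothetical helper)."""
    module_name, _, attr_name = dotted_path.rpartition(".")
    if not module_name:
        raise ValueError(f"expected 'module.attr', got {dotted_path!r}")
    module = importlib.import_module(module_name)
    return getattr(module, attr_name)


sqrt = load_attr("math.sqrt")
print(sqrt(9))  # 3.0
```

A plugin system could store such dotted paths in configuration and resolve them at startup.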
57.2 Import Statement to Runtime Machinery
A statement such as:
import json
starts a multi-step runtime operation.
Conceptually:
import statement
↓
__import__ builtin
↓
importlib machinery
↓
sys.modules cache check
↓
sys.meta_path finders
↓
module spec
↓
loader
↓
module object
↓
sys.modules["json"]
importlib exposes several of these layers directly.
The high-level entry point is:
importlib.import_module(name, package=None)
The lower-level machinery is implemented across modules such as:
importlib
importlib.util
importlib.machinery
importlib.abc
importlib.resources
57.3 sys.modules
The first major import structure is sys.modules.
It is the module cache.
import sys
import json
print(sys.modules["json"])
Before loading a module, the import system checks whether it already exists in sys.modules.
Simplified:
if fullname in sys.modules:
    return sys.modules[fullname]
This cache has three major purposes.
First, it prevents duplicate module execution.
Second, it preserves module identity.
Third, it makes circular imports possible.
During import, CPython inserts a module into sys.modules before executing its body. If another module imports it during that execution, it receives the partially initialized module.
That behavior explains circular import errors such as:
cannot import name 'x' from partially initialized module
The module object exists, but its top-level code has not finished.
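The early registration can be observed directly. This sketch writes a throwaway module (the name `cache_probe_demo` and its contents are hypothetical) that inspects sys.modules while its own body is still running.

```python
import importlib
import os
import sys
import tempfile

# Hypothetical module that looks itself up in sys.modules during execution.
SOURCE = (
    "import sys\n"
    "registered_during_exec = __name__ in sys.modules\n"
    "saw_late_name = hasattr(sys.modules[__name__], 'late_name')\n"
    "late_name = 1\n"
)

with tempfile.TemporaryDirectory() as tmp:
    with open(os.path.join(tmp, "cache_probe_demo.py"), "w") as f:
        f.write(SOURCE)
    sys.path.insert(0, tmp)
    try:
        importlib.invalidate_caches()
        probe = importlib.import_module("cache_probe_demo")
    finally:
        sys.path.remove(tmp)

print(probe.registered_during_exec)  # True: cached before the body finished
print(probe.saw_late_name)           # False: the module was only partially initialized
print(importlib.import_module("cache_probe_demo") is probe)  # True: served from the cache
```

The last line succeeds even though the temporary directory is gone, because the second import never reaches a finder.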
57.4 Module Objects
An imported module is a normal Python object.
import types
import json
print(isinstance(json, types.ModuleType))
print(json.__dict__)
A module stores its global variables in __dict__.
Common module attributes include:
| Attribute | Meaning |
|---|---|
| __name__ | Fully qualified module name |
| __dict__ | Module global namespace |
| __package__ | Package context for relative imports |
| __spec__ | Module specification |
| __loader__ | Loader that loaded the module |
| __file__ | Source or binary path, when available |
| __cached__ | Bytecode cache path, when available |
| __path__ | Package search path, for packages |
Module execution means executing code with the module dictionary as the global namespace.
Conceptually:
module source code
↓
compile to code object
↓
execute code object with module.__dict__
57.5 Module Specifications
The import system uses module specs to describe how a module should be loaded.
Specs are represented by importlib.machinery.ModuleSpec.
Example:
import importlib.util
spec = importlib.util.find_spec("json")
print(spec.name)
print(spec.loader)
print(spec.origin)
print(spec.submodule_search_locations)
A module spec contains:
| Field | Meaning |
|---|---|
| name | Fully qualified module name |
| loader | Loader object |
| origin | Source of the module |
| submodule_search_locations | Package search locations |
| cached | Bytecode cache path |
| parent | Parent package name |
| has_location | Whether the module has a filesystem location |
The spec separates discovery from execution.
finder
↓
returns spec
↓
loader uses spec
↓
module is created and executed
This design lets the same import protocol support many sources:
source files
bytecode files
extension modules
built-in modules
frozen modules
zip archives
namespace packages
custom virtual modules
57.6 Finders
A finder locates a module.
Finders live mainly on sys.meta_path.
import sys
for finder in sys.meta_path:
    print(finder)
Each finder may implement:
find_spec(fullname, path=None, target=None)
The import system asks each finder whether it can find the requested module.
Conceptually:
for finder in sys.meta_path:
    spec = finder.find_spec(fullname, path, target)
    if spec is not None:
        use spec
Common finder types include:
| Finder | Role |
|---|---|
| Built-in importer | Finds built-in modules |
| Frozen importer | Finds frozen modules |
| Path finder | Searches sys.path and package paths |
| Custom finder | User-provided import behavior |
Finders answer the question: “Can this module be found, and how should it be loaded?”
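The meta path loop can be written as ordinary Python. This is a simplified sketch: the helper name `find_spec_via_meta_path` is hypothetical, and the real machinery also handles locking, parent packages, and the sys.modules cache.

```python
import sys


def find_spec_via_meta_path(fullname, path=None):
    """Ask each meta path finder in order; return (finder, spec) or (None, None)."""
    for finder in sys.meta_path:
        find_spec = getattr(finder, "find_spec", None)
        if find_spec is None:
            continue  # skip entries that do not implement the protocol
        spec = find_spec(fullname, path, None)
        if spec is not None:
            return finder, spec
    return None, None


finder, spec = find_spec_via_meta_path("json")
print(finder)
print(spec.name, spec.origin)

print(find_spec_via_meta_path("no_such_module_xyz_123"))  # (None, None)
```

The first finder that returns a spec wins, which is why the order of sys.meta_path matters.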
57.7 Loaders
A loader creates and executes a module.
Loader responsibilities may include:
create module object
read source or binary data
compile source to code object
execute code object
initialize extension module
set module attributes
Modern loaders generally implement:
create_module(spec)
exec_module(module)
Conceptually:
spec.loader.create_module(spec)
↓
module object
spec.loader.exec_module(module)
↓
initialized module
If create_module() returns None, the import machinery creates a default module object.
exec_module() performs the actual initialization.
For a source module, this usually means:
read .py file
compile source
execute code in module namespace
For an extension module, this means invoking native initialization code.
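The two loader steps can be driven by hand. This sketch assumes the standard library module colorsys is available as an ordinary source module, and uses importlib.util.module_from_spec() for the creation step.

```python
import importlib.util
import sys

spec = importlib.util.find_spec("colorsys")

# module_from_spec() calls spec.loader.create_module(spec) and falls back to a
# default module object when that returns None.
module = importlib.util.module_from_spec(spec)
initialized_before = hasattr(module, "rgb_to_hsv")

# exec_module() runs the module body in module.__dict__.
spec.loader.exec_module(module)
initialized_after = hasattr(module, "rgb_to_hsv")

print(initialized_before, initialized_after)  # False True
print(module is sys.modules.get("colorsys"))  # False: this copy was never registered
```

Creation yields an empty module carrying only import metadata; the functions appear once exec_module() has run.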
57.8 PathFinder
PathFinder is the main finder for ordinary filesystem imports.
It searches path entries from either:
sys.path
or a package’s:
__path__
Example:
import importlib.machinery
print(importlib.machinery.PathFinder)
Resolution differs by context.
For top-level import:
import json
the search path is:
sys.path
For submodule import:
import package.module
the search path is:
package.__path__
This is why packages control where their submodules are found.
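The two contexts can be compared by calling PathFinder directly. A small sketch, using the standard library json package as the example:

```python
import importlib.machinery
import json

PathFinder = importlib.machinery.PathFinder

# Top-level name: path=None means "search sys.path".
top = PathFinder.find_spec("json")

# Submodule: the parent package's __path__ is the search path.
sub = PathFinder.find_spec("json.decoder", json.__path__)

print(top.name, top.submodule_search_locations)
print(sub.name, sub.parent, sub.origin)
```

In a normal import the machinery supplies the parent's __path__ automatically; passing it explicitly here only makes the dependency visible.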
57.9 Path Hooks and Importer Cache
Filesystem path entries are not interpreted directly every time.
CPython uses:
sys.path_hooks
sys.path_importer_cache
sys.path_hooks contains callables that know how to build importers for path entries.
sys.path_importer_cache caches the result.
Conceptually:
path entry
↓
path hook creates path entry finder
↓
cache finder in sys.path_importer_cache
↓
reuse finder for later imports
This supports path entries such as:
directories
zip files
custom virtual paths
The cache avoids rebuilding path entry finders repeatedly.
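The hook-and-cache lookup can be sketched in a few lines. The helper `get_path_entry_finder` is a hypothetical simplification of what PathFinder does internally, and the example assumes the default hooks are installed.

```python
import importlib.machinery
import sys
import tempfile


def get_path_entry_finder(entry):
    """Simplified sketch of the hook-and-cache lookup (hypothetical helper)."""
    if entry in sys.path_importer_cache:
        return sys.path_importer_cache[entry]
    finder = None
    for hook in sys.path_hooks:
        try:
            finder = hook(entry)  # a hook raises ImportError if it cannot handle the entry
            break
        except ImportError:
            continue
    sys.path_importer_cache[entry] = finder
    return finder


with tempfile.TemporaryDirectory() as tmp:
    try:
        first = get_path_entry_finder(tmp)
        second = get_path_entry_finder(tmp)
    finally:
        sys.path_importer_cache.pop(tmp, None)  # leave global state as we found it

print(type(first).__name__)
print(first is second)  # True: the second lookup came from the cache
```

For a plain directory the winning hook is normally the one that builds a FileFinder.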
57.10 Source File Loading
A normal .py import uses a source file loader.
Conceptual flow:
find module.py
↓
create module spec
↓
create module object
↓
store module in sys.modules
↓
read source
↓
compile source to code object
↓
execute code object in module namespace
Example file:
# config.py
host = "localhost"
port = 5432
After import:
import config
print(config.host)
print(config.port)
The top-level assignments ran during module execution and populated config.__dict__.
57.11 Bytecode Caches
CPython may cache compiled bytecode in __pycache__.
Example:
module.py
__pycache__/module.cpython-313.pyc
A .pyc file stores compiled code plus validation metadata.
The cache avoids recompiling unchanged source on later imports.
Important point: .pyc is a cache, not the semantic source of truth for ordinary source imports.
Conceptually:
if valid pyc exists:
    load code object from pyc
else:
    read source
    compile source
    write pyc if allowed
importlib contains the machinery for cache path calculation, validation, and bytecode loading.
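The cache path calculation is exposed through importlib.util. This sketch uses a hypothetical relative path (no file needs to exist) and assumes default settings, meaning sys.pycache_prefix is unset and the interpreter defines a cache tag.

```python
import importlib.util
import os
import sys

source_path = os.path.join("project", "module.py")  # hypothetical path; no file is needed

pyc_path = importlib.util.cache_from_source(source_path)
print(pyc_path)  # e.g. project/__pycache__/module.cpython-313.pyc on CPython 3.13

# The interpreter-specific tag comes from sys.implementation.cache_tag.
print(sys.implementation.cache_tag in pyc_path)

# The mapping is reversible.
print(importlib.util.source_from_cache(pyc_path) == source_path)
```

Because the tag is part of the file name, several interpreter versions can share one __pycache__ directory without overwriting each other's files.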
57.12 Packages
A package is a module that can contain submodules.
Traditionally, a package is a directory with:
package/
    __init__.py
    module.py
Importing the package executes __init__.py.
import package
Importing a submodule searches the package path:
import package.module
Package objects have:
package.__path__
This attribute tells the import system where to look for submodules.
Conceptually:
package.__path__
↓
submodule search locations
57.13 Namespace Packages
Namespace packages allow one package name to span multiple directories.
They do not require an __init__.py.
Example:
dir1/acme/plugins/a.py
dir2/acme/plugins/b.py
If both dir1 and dir2 are on sys.path, acme.plugins can include both locations.
A namespace package’s __path__ contains multiple entries.
This feature is useful for plugin systems and separately distributed package portions.
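The layout above can be reproduced in a temporary directory. The package name `acme_ns_demo` is a hypothetical stand-in for acme, chosen to avoid clashing with any installed package.

```python
import importlib
import os
import sys
import tempfile

with tempfile.TemporaryDirectory() as root:
    roots = []
    for dirname, modname in (("dir1", "a"), ("dir2", "b")):
        pkg_dir = os.path.join(root, dirname, "acme_ns_demo", "plugins")
        os.makedirs(pkg_dir)  # note: no __init__.py anywhere
        with open(os.path.join(pkg_dir, modname + ".py"), "w") as f:
            f.write(f"name = {modname!r}\n")
        roots.append(os.path.join(root, dirname))

    sys.path[:0] = roots
    try:
        importlib.invalidate_caches()
        plugins = importlib.import_module("acme_ns_demo.plugins")
        a = importlib.import_module("acme_ns_demo.plugins.a")
        b = importlib.import_module("acme_ns_demo.plugins.b")
        search_path = list(plugins.__path__)
    finally:
        for entry in roots:
            sys.path.remove(entry)

print(len(search_path))  # 2: one portion per directory
print(a.name, b.name)    # a b
```

Both submodules import under one package name even though they live in separate directory trees.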
57.14 Relative Imports
Relative imports depend on package context.
Example:
from . import util
from .models import User
from ..core import config
The import system uses __package__ and __spec__.parent to resolve these names.
A relative import cannot be resolved from just the text .models. The import system needs to know the current package.
Conceptually:
current package: app.views
relative import: .models
resolved name: app.views.models
This is why running package files directly can break relative imports. The module may lack correct package context.
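The resolution step is available as importlib.util.resolve_name(), which turns a relative name plus a package into an absolute name without importing anything. The package names below are hypothetical.

```python
import importlib.util

print(importlib.util.resolve_name(".models", "app.views"))  # app.views.models
print(importlib.util.resolve_name("..core", "app.views"))   # app.core
print(importlib.util.resolve_name("json", "app.views"))     # json (absolute names pass through)
```

Each leading dot after the first strips one trailing component from the package name before the remainder is appended.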
57.15 importlib.import_module()
importlib.import_module() is the public dynamic import API.
import importlib
name = "json.decoder"
mod = importlib.import_module(name)
print(mod)
It handles dotted names and package-relative imports.
Example:
importlib.import_module(".decoder", package="json")
This resolves relative to json.
Unlike __import__(), import_module() returns the requested module rather than the top-level package.
mod = importlib.import_module("json.decoder")
print(mod.__name__)
Output:
json.decoder
57.16 Reloading Modules
importlib.reload(module) re-executes a module.
import importlib
import config
importlib.reload(config)
Reloading keeps the same module object but re-executes its code.
Conceptually:
existing module object
↓
reuse module.__dict__
↓
execute module code again
This can produce surprising behavior.
Old names may remain if the new source no longer defines them. Existing references held elsewhere still point to old objects.
Example:
import importlib
import config
from config import SETTINGS
importlib.reload(config)
The name SETTINGS in the importing module is not automatically rebound.
Reload is useful for development tools and REPL workflows, but it is rarely clean enough for production hot reload without careful design.
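Both surprises can be reproduced with a throwaway module. The module name `reload_demo_cfg` and its contents are hypothetical; the sketch rewrites the source between import and reload.

```python
import importlib
import os
import sys
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "reload_demo_cfg.py")
    with open(path, "w") as f:
        f.write("SETTINGS = {'debug': False}\nOLD_ONLY = 1\n")
    sys.path.insert(0, tmp)
    try:
        importlib.invalidate_caches()
        cfg = importlib.import_module("reload_demo_cfg")
        held = cfg.SETTINGS  # a reference held elsewhere, as `from ... import` would create

        # New source: SETTINGS changes and OLD_ONLY is no longer defined.
        with open(path, "w") as f:
            f.write("SETTINGS = {'debug': True}\n")
        reloaded = importlib.reload(cfg)
    finally:
        sys.path.remove(tmp)

print(reloaded is cfg)            # True: same module object
print(cfg.SETTINGS)               # {'debug': True}
print(held)                       # {'debug': False}: the old object is untouched
print(hasattr(cfg, "OLD_ONLY"))   # True: the stale name survives the reload
```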
57.17 Custom Importers
The import system is extensible.
A custom finder and loader can load modules from unusual sources.
Minimal sketch:
import importlib.abc
import importlib.util
import sys
import types
class MemoryLoader(importlib.abc.Loader):
    def create_module(self, spec):
        return None  # use the default module object
    def exec_module(self, module):
        module.answer = 42
class MemoryFinder(importlib.abc.MetaPathFinder):
    def find_spec(self, fullname, path, target=None):
        if fullname == "virtual_config":
            return importlib.util.spec_from_loader(fullname, MemoryLoader())
        return None
sys.meta_path.insert(0, MemoryFinder())
import virtual_config
print(virtual_config.answer)
Output:
42
This shows that import does not require a file.
A module can come from memory, a database, a network service, a generated source string, or an embedded resource, provided a finder and loader implement the protocol.
57.18 Import Locks
Imports are synchronized.
CPython uses import locks to prevent multiple threads from initializing the same module at the same time.
Without locking, two threads could both observe a missing module, both create module objects, and both execute module code.
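The practical guarantee is that, within one interpreter, a module's initialization is not run by two threads at once: a thread that imports a module another thread is still initializing waits until that initialization finishes.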
This matters for multi-threaded programs that import modules lazily.
Import-time side effects should still be minimized, because import execution can block other imports and create ordering hazards.
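The coordination can be observed with a deliberately slow in-memory module imported from several threads. Everything named here (`SlowLoader`, `SlowFinder`, `lock_demo_virtual`) is hypothetical scaffolding for the demonstration.

```python
import importlib
import importlib.abc
import importlib.util
import sys
import threading
import time

exec_calls = []  # records each execution of the module body


class SlowLoader(importlib.abc.Loader):
    def create_module(self, spec):
        return None  # default module object

    def exec_module(self, module):
        exec_calls.append(threading.get_ident())
        time.sleep(0.05)  # widen the window in which other threads arrive
        module.ready = True


class SlowFinder(importlib.abc.MetaPathFinder):
    def find_spec(self, fullname, path, target=None):
        if fullname == "lock_demo_virtual":
            return importlib.util.spec_from_loader(fullname, SlowLoader())
        return None


results = []


def worker():
    results.append(importlib.import_module("lock_demo_virtual"))


finder = SlowFinder()
sys.meta_path.insert(0, finder)
threads = [threading.Thread(target=worker) for _ in range(4)]
try:
    for t in threads:
        t.start()
    for t in threads:
        t.join()
finally:
    sys.meta_path.remove(finder)

print(len(exec_calls))                  # 1: the body ran once
print(len({id(m) for m in results}))    # 1: every thread got the same module
```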
57.19 Import-Time Execution
Importing a module executes its top-level code.
Example:
# app.py
print("loading app")
value = 42
Then:
import app
prints:
loading app
This is why module top-level code should usually define names rather than perform expensive or irreversible actions.
Prefer:
def main():
    ...
if __name__ == "__main__":
    main()
The import system is an execution system, not just a name lookup system.
57.20 Built-in and Frozen Modules
Not all modules come from files.
Built-in modules are compiled into CPython.
Frozen modules are stored as frozen code data inside the interpreter.
Examples:
import sys
import importlib.machinery
print(sys.builtin_module_names)
print(importlib.machinery.BuiltinImporter)
print(importlib.machinery.FrozenImporter)
Built-in and frozen importers are usually present on sys.meta_path.
They allow the interpreter to import critical modules before filesystem import is fully available.
This is important during startup.
57.21 Extension Modules
Native extension modules are shared libraries loaded by CPython.
Examples:
_module.cpython-313-x86_64-linux-gnu.so
_module.pyd
The import system finds the shared library, loads it, and calls its initialization function.
Conceptually:
find shared library
↓
dynamic loader opens binary
↓
CPython calls PyInit_modulename
↓
module object returned or initialized
Extension module loading connects importlib to the C API and platform dynamic linking.
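The file name suffixes the running interpreter recognizes in the "find shared library" step are listed in importlib.machinery. A short sketch; the printed values depend on platform and Python version.

```python
import importlib.machinery

# Suffixes that mark a file as a loadable extension module on this platform.
print(importlib.machinery.EXTENSION_SUFFIXES)

# For comparison: suffixes for source and bytecode files.
print(importlib.machinery.SOURCE_SUFFIXES)
print(importlib.machinery.BYTECODE_SUFFIXES)
```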
57.22 importlib.util
importlib.util provides helper functions.
Common APIs include:
| API | Purpose |
|---|---|
| find_spec() | Find a module spec |
| module_from_spec() | Create module from spec |
| spec_from_file_location() | Build spec for a file |
| cache_from_source() | Compute .pyc path |
| source_from_cache() | Recover source path from cache path |
Manual file import example:
import importlib.util
import sys
path = "/tmp/plugin.py"
name = "plugin"
spec = importlib.util.spec_from_file_location(name, path)
module = importlib.util.module_from_spec(spec)
sys.modules[name] = module
spec.loader.exec_module(module)
This is the lower-level form of import. It gives the caller control over the module name, path, cache behavior, and registration.
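When wrapping the manual sequence in a reusable function, it may help to handle a missing spec and to undo the sys.modules registration if execution fails. The helper `load_from_path` and the demo file names are hypothetical.

```python
import importlib.util
import os
import sys
import tempfile


def load_from_path(name, path):
    """Load a source file as module `name` (hypothetical helper)."""
    spec = importlib.util.spec_from_file_location(name, path)
    if spec is None or spec.loader is None:
        raise ImportError(f"cannot build a spec for {path!r}", name=name)
    module = importlib.util.module_from_spec(spec)
    sys.modules[name] = module
    try:
        spec.loader.exec_module(module)
    except BaseException:
        sys.modules.pop(name, None)  # do not leave a half-initialized module cached
        raise
    return module


with tempfile.TemporaryDirectory() as tmp:
    good = os.path.join(tmp, "plugin_ok.py")
    bad = os.path.join(tmp, "plugin_bad.py")
    with open(good, "w") as f:
        f.write("answer = 42\n")
    with open(bad, "w") as f:
        f.write("raise RuntimeError('boom')\n")

    plugin = load_from_path("plugin_ok_demo", good)
    try:
        load_from_path("plugin_bad_demo", bad)
    except RuntimeError:
        pass  # expected: the module body raised

print(plugin.answer)                     # 42
print("plugin_bad_demo" in sys.modules)  # False
```

Removing the entry on failure mirrors what the regular import statement does when a module body raises.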
57.23 importlib.machinery
importlib.machinery exposes concrete importer machinery.
Important objects include:
| Object | Role |
|---|---|
| PathFinder | Main path-based finder |
| FileFinder | Finder for filesystem directories |
| SourceFileLoader | Loads .py files |
| SourcelessFileLoader | Loads .pyc files |
| ExtensionFileLoader | Loads native extension modules |
| BuiltinImporter | Loads built-in modules |
| FrozenImporter | Loads frozen modules |
| ModuleSpec | Import specification object |
This module is useful when building custom import behavior that still wants to reuse CPython’s standard components.
57.24 importlib.abc
importlib.abc defines abstract base classes for import protocols.
Important classes include:
| ABC | Meaning |
|---|---|
| MetaPathFinder | Finder on sys.meta_path |
| PathEntryFinder | Finder for one path entry |
| Loader | Base loader protocol |
| ResourceReader | Legacy resource reading protocol |
| InspectLoader | Loader that can return a module's code and source |
| ExecutionLoader | InspectLoader that also reports the module's file name, so the module can be run as a script |
| SourceLoader | Loader for source code |
These ABCs document expected methods and support structured custom importers.
57.25 Resources
importlib.resources provides access to package data.
Example:
from importlib import resources
text = resources.files("my_package").joinpath("data.txt").read_text()
This works for packages stored in places other than normal directories, such as zip files, as long as the loader supports the resource interface.
This avoids fragile code like:
open(os.path.join(os.path.dirname(__file__), "data.txt"))
Package resources should be accessed through import-aware APIs when possible.
57.26 Invalidation Caches
Finders may cache directory listings or module discovery information.
importlib.invalidate_caches() asks import finders to clear those caches.
import importlib
importlib.invalidate_caches()
This is useful when a program creates new modules on disk at runtime and then wants to import them.
Example:
write plugin.py
↓
invalidate import caches
↓
import plugin
Without invalidation, a path finder may not notice the new file immediately.
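The write, invalidate, import sequence looks like this in practice. The module names are hypothetical; the first import exists only so that the directory's finder has already cached a listing before the second file appears.

```python
import importlib
import os
import sys
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    sys.path.insert(0, tmp)
    try:
        # Import one module first so the directory's finder has cached a listing.
        with open(os.path.join(tmp, "early_demo_mod.py"), "w") as f:
            f.write("value = 'early'\n")
        importlib.invalidate_caches()
        early = importlib.import_module("early_demo_mod")

        # Create a new module at runtime, then invalidate before importing it.
        with open(os.path.join(tmp, "late_demo_plugin.py"), "w") as f:
            f.write("value = 'late'\n")
        importlib.invalidate_caches()
        late = importlib.import_module("late_demo_plugin")
    finally:
        sys.path.remove(tmp)

print(early.value, late.value)  # early late
```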
57.27 Common Import Failure Modes
Import failures often come from path, naming, or initialization problems.
| Symptom | Common cause |
|---|---|
| ModuleNotFoundError | No finder found a spec |
| ImportError | The module was found, but loading it or importing a requested name from it failed |
| Partially initialized module | Circular import |
| Relative import error | Missing package context |
| Wrong module imported | Unexpected earlier sys.path entry |
| Reload does not update references | Existing names still point to old objects |
| Package data missing | Files accessed outside import resource APIs |
A useful diagnostic sequence:
import importlib.util
import sys
print(sys.path)
print(importlib.util.find_spec("some_module"))
If find_spec() returns None, discovery failed. If it returns a spec but import fails, loading or execution failed.
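The same two-stage check can be folded into a small helper. `diagnose` and its message strings are hypothetical; note that find_spec() imports parent packages, so a dotted name with a missing parent raises instead of returning None.

```python
import importlib
import importlib.util


def diagnose(name):
    """Report whether discovery or loading is the failing stage (hypothetical helper)."""
    try:
        spec = importlib.util.find_spec(name)
    except (ImportError, ValueError) as exc:
        # e.g. the parent package of a dotted name could not be imported
        return f"discovery error: {exc!r}"
    if spec is None:
        return "not found"
    try:
        importlib.import_module(name)
    except Exception as exc:
        return f"found at {spec.origin}, but loading failed: {exc!r}"
    return f"ok: {spec.origin}"


print(diagnose("json"))
print(diagnose("no_such_module_xyz_123"))       # not found
print(diagnose("no_such_pkg_xyz_123.sub"))
```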
57.28 Relationship to sys
The import system depends heavily on sys.
Important sys objects include:
| Object | Import role |
|---|---|
| sys.modules | Module cache |
| sys.path | Top-level search path |
| sys.meta_path | Meta path finders |
| sys.path_hooks | Path entry importer factories |
| sys.path_importer_cache | Cached path entry finders |
| sys.builtin_module_names | Built-in module names |
importlib provides machinery. sys stores global interpreter import state.
57.29 Relationship to Code Objects
For source imports, importlib ultimately creates and executes code objects.
read source
↓
compile(source, filename, "exec")
↓
code object
↓
exec(code, module.__dict__)
This connects import machinery to the compiler and interpreter.
A source module import is equivalent in broad shape to:
module = types.ModuleType(name)
code = compile(source, filename, "exec")
exec(code, module.__dict__)
The real implementation includes specs, loaders, caches, locks, errors, packages, and edge cases.
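A self-contained version of that broad shape, with a hypothetical module name, file label, and source string filled in:

```python
import sys
import types

name = "inline_demo"                # hypothetical module name
filename = "<inline_demo source>"   # label used in tracebacks; not a real file
source = "greeting = 'hello'\ndef shout():\n    return greeting.upper()\n"

module = types.ModuleType(name)
code = compile(source, filename, "exec")  # source text -> code object
exec(code, module.__dict__)               # run it with the module dict as globals

print(module.greeting)       # hello
print(module.shout())        # HELLO
print(name in sys.modules)   # False: nothing registered the module
```

The function shout() finds greeting because its globals are module.__dict__, the same dictionary the body executed in.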
57.30 Why importlib Matters for CPython Internals
importlib matters because imports are central to Python execution.
Almost every Python program starts by importing modules. CPython itself depends on imports during startup. Tooling, packaging, virtual environments, plugins, test runners, and application frameworks all rely on import behavior.
Understanding importlib explains:
why sys.modules exists
why circular imports happen
how packages find submodules
how .py files become module objects
how bytecode caches are used
how extension modules are initialized
how custom import systems work
why import-time side effects matter
It also shows CPython’s design preference for exposing internal protocols as Python-level machinery. Imports are not hard-coded as one filesystem algorithm. They are a layered protocol built from finders, loaders, specs, caches, and module objects.
57.31 Chapter Summary
The importlib module is the Python-level interface to CPython’s import system. It exposes dynamic imports, module specifications, finders, loaders, path-based search, bytecode caches, package resources, reload behavior, and custom importer protocols.
For CPython internals, importlib is important because it connects source files, module objects, code objects, sys.modules, sys.path, extension modules, built-in modules, packages, and interpreter startup into one coherent subsystem.