# 57. `importlib`

The `importlib` module exposes Python’s import system as ordinary Python APIs. It is both a library for importing modules programmatically and, since Python 3.3, the implementation of most of the import machinery itself: the `import` statement runs a frozen copy of `importlib._bootstrap`.

The import system is one of CPython’s central runtime subsystems. Every `import` statement passes through machinery that checks module caches, resolves names, searches import paths, selects finders, builds module specifications, invokes loaders, initializes module objects, and records the result in `sys.modules`.

## 57.1 The Role of `importlib`

`importlib` provides the programmable interface to imports.

Example:

```python
import importlib

math = importlib.import_module("math")
print(math.sqrt(9))
```

This is roughly equivalent to:

```python
import math
```

but the module name can be computed dynamically.

Common uses include:

| Use case | Example |
|---|---|
| Dynamic plugin loading | Load modules by string name |
| Framework discovery | Import views, handlers, commands, models |
| Test tooling | Reload modified modules |
| Custom import systems | Add finders and loaders |
| Package metadata tools | Locate module specs and resources |
| Embedding | Initialize imports under controlled configuration |

The import system is not just file loading. It is a protocol.
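
As a minimal sketch of the plugin use case from the table above, the helper below resolves a string to an object at runtime. The `load_object` name and the `module:attribute` string convention are assumptions made for illustration; they are not part of `importlib`.

```python
import importlib


def load_object(target):
    """Resolve 'package.module:attribute' (hypothetical convention) to an object."""
    module_name, _, attr = target.partition(":")
    module = importlib.import_module(module_name)
    # Without an attribute part, return the module itself.
    return getattr(module, attr) if attr else module


dumps = load_object("json:dumps")
print(dumps({"a": 1}))
```

A framework would typically read such strings from configuration, which is why the module name cannot be written as a literal `import` statement.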

## 57.2 Import Statement to Runtime Machinery

A statement such as:

```python
import json
```

starts a multi-step runtime operation.

Conceptually:

```text
import statement
    ↓
__import__ builtin
    ↓
importlib machinery
    ↓
sys.modules cache check
    ↓
sys.meta_path finders
    ↓
module spec
    ↓
loader
    ↓
module object
    ↓
sys.modules["json"]
```

`importlib` exposes several of these layers directly.

The high-level entry point is:

```python
importlib.import_module(name, package=None)
```

The lower-level machinery is implemented across modules such as:

```text
importlib
importlib.util
importlib.machinery
importlib.abc
importlib.resources
```

## 57.3 `sys.modules`

The first major import structure is `sys.modules`.

It is the module cache.

```python
import sys
import json

print(sys.modules["json"])
```

Before loading a module, the import system checks whether it already exists in `sys.modules`.

Simplified:

```python
if fullname in sys.modules:
    return sys.modules[fullname]
```

This cache has three major purposes.

First, it prevents duplicate module execution.

Second, it preserves module identity.

Third, it makes circular imports possible.

During import, CPython inserts a module into `sys.modules` before executing its body. If another module imports it during that execution, it receives the partially initialized module.

That behavior explains circular import errors such as:

```text
cannot import name 'x' from partially initialized module
```

The module object exists, but its top-level code has not finished.
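
The partially initialized state can be observed directly. The sketch below writes two hypothetical modules, `circ_demo_a` and `circ_demo_b`, into a temporary directory; the second one inspects the first while the first is still executing.

```python
import importlib
import os
import sys
import tempfile

# Hypothetical module names and sources, used only for this demonstration.
SOURCES = {
    "circ_demo_a.py": "import circ_demo_b\nx = 1\n",
    "circ_demo_b.py": "import circ_demo_a\nseen_x = hasattr(circ_demo_a, 'x')\n",
}

with tempfile.TemporaryDirectory() as tmp:
    for filename, source in SOURCES.items():
        with open(os.path.join(tmp, filename), "w", encoding="utf-8") as f:
            f.write(source)
    sys.path.insert(0, tmp)
    importlib.invalidate_caches()
    try:
        mod_a = importlib.import_module("circ_demo_a")
        mod_b = sys.modules["circ_demo_b"]
        cached_is_same = sys.modules["circ_demo_a"] is mod_a
    finally:
        sys.path.remove(tmp)
        sys.modules.pop("circ_demo_a", None)
        sys.modules.pop("circ_demo_b", None)

print(mod_b.seen_x)  # False: circ_demo_a was cached but had not defined x yet
print(mod_a.x)       # 1: the body of circ_demo_a finished afterwards
```

Here the circular import succeeds because `circ_demo_b` only needs the module object. A `from circ_demo_a import x` at the same point would raise the error quoted above.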

## 57.4 Module Objects

An imported module is a normal Python object.

```python
import types
import json

print(isinstance(json, types.ModuleType))
print(json.__dict__)
```

A module stores its global variables in `__dict__`.

Common module attributes include:

| Attribute | Meaning |
|---|---|
| `__name__` | Fully qualified module name |
| `__dict__` | Module global namespace |
| `__package__` | Package context for relative imports |
| `__spec__` | Module specification |
| `__loader__` | Loader that loaded the module |
| `__file__` | Source or binary path, when available |
| `__cached__` | Bytecode cache path, when available |
| `__path__` | Package search path, for packages |

Module execution means executing code with the module dictionary as the global namespace.

Conceptually:

```text
module source code
    ↓
compile to code object
    ↓
execute code object with module.__dict__
```

## 57.5 Module Specifications

The import system uses module specs to describe how a module should be loaded.

Specs are represented by `importlib.machinery.ModuleSpec`.

Example:

```python
import importlib.util

spec = importlib.util.find_spec("json")

print(spec.name)
print(spec.loader)
print(spec.origin)
print(spec.submodule_search_locations)
```

A module spec contains:

| Field | Meaning |
|---|---|
| `name` | Fully qualified module name |
| `loader` | Loader object |
| `origin` | Source of the module |
| `submodule_search_locations` | Package search locations |
| `cached` | Bytecode cache path |
| `parent` | Parent package name |
| `has_location` | Whether the module has a filesystem location |

The spec separates discovery from execution.

```text
finder
    ↓
returns spec
    ↓
loader uses spec
    ↓
module is created and executed
```

This design lets the same import protocol support many sources:

```text
source files
bytecode files
extension modules
built-in modules
frozen modules
zip archives
namespace packages
custom virtual modules
```
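
The fields listed above can be compared across module kinds. This is a small sketch using standard library modules; `no_such_module_for_demo` is a made-up name that is assumed not to exist.

```python
import importlib.util

json_spec = importlib.util.find_spec("json")             # a package
decoder_spec = importlib.util.find_spec("json.decoder")  # a submodule
sys_spec = importlib.util.find_spec("sys")               # a built-in module
missing = importlib.util.find_spec("no_such_module_for_demo")

for spec in (json_spec, decoder_spec, sys_spec):
    print(spec.name, repr(spec.parent), spec.origin, spec.has_location)

print(missing)  # None: no finder produced a spec
```

Note that a package is its own `parent`, while a submodule reports the enclosing package.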

## 57.6 Finders

A finder locates a module.

Finders live mainly on `sys.meta_path`.

```python
import sys

for finder in sys.meta_path:
    print(finder)
```

Each meta path finder implements:

```python
find_spec(fullname, path=None, target=None)
```

The import system asks each finder whether it can find the requested module.

Conceptually:

```text
for finder in sys.meta_path:
    spec = finder.find_spec(fullname, path, target)
    if spec is not None:
        use spec
```

Common finder types include:

| Finder | Role |
|---|---|
| Built-in importer | Finds built-in modules |
| Frozen importer | Finds frozen modules |
| Path finder | Searches `sys.path` and package paths |
| Custom finder | User-provided import behavior |

Finders answer the question: “Can this module be found, and how should it be loaded?”
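
The loop over `sys.meta_path` can be reproduced by hand. The `first_spec` helper below is a hypothetical name, and the sketch ignores the caching and locking that the real import system performs.

```python
import sys


def first_spec(fullname, path=None):
    """Return (finder, spec) for the first meta path finder that answers."""
    for finder in sys.meta_path:
        find_spec = getattr(finder, "find_spec", None)
        if find_spec is None:
            continue  # skip objects that do not implement the modern protocol
        spec = find_spec(fullname, path, None)
        if spec is not None:
            return finder, spec
    return None, None


finder, spec = first_spec("json")
print(finder)
print(spec.origin)
```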

## 57.7 Loaders

A loader creates and executes a module.

Loader responsibilities may include:

```text
create module object
read source or binary data
compile source to code object
execute code object
initialize extension module
set module attributes
```

Modern loaders generally implement:

```python
create_module(spec)
exec_module(module)
```

Conceptually:

```text
spec.loader.create_module(spec)
    ↓
module object

spec.loader.exec_module(module)
    ↓
initialized module
```

If `create_module()` returns `None`, the import machinery creates a default module object.

`exec_module()` performs the actual initialization.

For a source module, this usually means:

```text
read .py file
compile source
execute code in module namespace
```

For an extension module, this means invoking native initialization code.
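
The two-phase protocol can be traced with a loader that records its calls. `RecordingLoader` and the module name `recorded_demo` are hypothetical; `importlib.util.module_from_spec()` stands in for the creation step that the import system normally performs.

```python
import importlib.abc
import importlib.util


class RecordingLoader(importlib.abc.Loader):
    """Toy loader that logs which protocol methods were called."""

    def __init__(self):
        self.calls = []

    def create_module(self, spec):
        self.calls.append("create_module")
        return None  # ask the machinery for a default module object

    def exec_module(self, module):
        self.calls.append("exec_module")
        module.ready = True


loader = RecordingLoader()
spec = importlib.util.spec_from_loader("recorded_demo", loader)

module = importlib.util.module_from_spec(spec)  # phase 1: create
ready_before_exec = hasattr(module, "ready")
spec.loader.exec_module(module)                 # phase 2: execute

print(loader.calls)
print(ready_before_exec, module.ready)
```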

## 57.8 `PathFinder`

`PathFinder` is the main finder for ordinary filesystem imports.

It searches path entries from either:

```text
sys.path
```

or a package’s:

```text
__path__
```

Example:

```python
import importlib.machinery

print(importlib.machinery.PathFinder)
```

Resolution differs by context.

For top-level import:

```python
import json
```

the search path is:

```python
sys.path
```

For submodule import:

```python
import package.module
```

the search path is:

```python
package.__path__
```

This is why packages control where their submodules are found.
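
Both cases can be tried by calling `PathFinder.find_spec()` directly. This sketch bypasses `sys.modules` and the other meta path finders, so it only shows the path-based search.

```python
import importlib.machinery
import json

# Top-level lookup: no path argument, so sys.path is searched.
top_level = importlib.machinery.PathFinder.find_spec("json")

# Submodule lookup: the parent package supplies the search path.
submodule = importlib.machinery.PathFinder.find_spec("json.decoder", json.__path__)

print(top_level.origin)
print(submodule.origin)
```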

## 57.9 Path Hooks and Importer Cache

Filesystem path entries are not interpreted directly every time.

CPython uses:

```python
sys.path_hooks
sys.path_importer_cache
```

`sys.path_hooks` contains callables that know how to build importers for path entries.

`sys.path_importer_cache` caches the result.

Conceptually:

```text
path entry
    ↓
path hook creates path entry finder
    ↓
cache finder in sys.path_importer_cache
    ↓
reuse finder for later imports
```

This supports path entries such as:

```text
directories
zip files
custom virtual paths
```

The cache avoids rebuilding path entry finders repeatedly.
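
The cache can be watched filling up. The sketch below searches a temporary directory for a hypothetical module named `hooked_demo` and then looks for the finder that the path hooks created for that directory.

```python
import importlib.machinery
import os
import sys
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    with open(os.path.join(tmp, "hooked_demo.py"), "w", encoding="utf-8") as f:
        f.write("value = 1\n")

    # Searching the entry runs sys.path_hooks once and caches the result.
    spec = importlib.machinery.PathFinder.find_spec("hooked_demo", [tmp])
    entry_finder = sys.path_importer_cache.get(tmp)
    is_file_finder = isinstance(entry_finder, importlib.machinery.FileFinder)

    # Drop the entry again, since the directory is about to disappear.
    sys.path_importer_cache.pop(tmp, None)

print(type(entry_finder).__name__)
```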

## 57.10 Source File Loading

A normal `.py` import uses a source file loader.

Conceptual flow:

```text
find module.py
    ↓
create module spec
    ↓
create module object
    ↓
read source
    ↓
compile source to code object
    ↓
execute code object in module namespace
    ↓
store module in sys.modules
```

Example file:

```python
# config.py
host = "localhost"
port = 5432
```

After import:

```python
import config

print(config.host)
print(config.port)
```

The top-level assignments ran during module execution and populated `config.__dict__`.

## 57.11 Bytecode Caches

CPython may cache compiled bytecode in `__pycache__`.

Example:

```text
module.py
__pycache__/module.cpython-313.pyc
```

A `.pyc` file stores compiled code plus validation metadata.

The cache avoids recompiling unchanged source on later imports.

Important point: `.pyc` is a cache, not the semantic source of truth for ordinary source imports.

Conceptually:

```text
if valid pyc exists:
    load code object from pyc
else:
    read source
    compile source
    write pyc if allowed
```

`importlib` contains the machinery for cache path calculation, validation, and bytecode loading.
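
The cache path calculation is available through `importlib.util`. In this sketch the source path is hypothetical and does not need to exist, because both functions only manipulate path strings.

```python
import importlib.util
import os
import sys

source_path = os.path.join("pkg", "module.py")  # hypothetical path
cache_path = importlib.util.cache_from_source(source_path)

print(sys.implementation.cache_tag)  # interpreter tag embedded in the file name
print(cache_path)
print(importlib.util.source_from_cache(cache_path))
```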

## 57.12 Packages

A package is a module that can contain submodules.

Traditionally, a package is a directory with:

```text
package/
    __init__.py
    module.py
```

Importing the package executes `__init__.py`.

```python
import package
```

Importing a submodule searches the package path:

```python
import package.module
```

Package objects have:

```python
package.__path__
```

This attribute tells the import system where to look for submodules.

Conceptually:

```text
package.__path__
    ↓
submodule search locations
```

## 57.13 Namespace Packages

Namespace packages allow one package name to span multiple directories.

They have no `__init__.py`; a directory that contains one is a regular package.

Example:

```text
dir1/acme/plugins/a.py
dir2/acme/plugins/b.py
```

If both `dir1` and `dir2` are on `sys.path`, `acme.plugins` can include both locations.

A namespace package’s `__path__` can contain multiple entries, one per contributing directory.

This feature is useful for plugin systems and separately distributed package portions.
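
The layout above can be reproduced with two temporary directories. The sketch uses the package name `acme_demo` rather than `acme` to reduce the chance of clashing with an installed package; all names are illustrative.

```python
import importlib
import os
import sys
import tempfile

with tempfile.TemporaryDirectory() as dir1, tempfile.TemporaryDirectory() as dir2:
    # Each directory contributes one portion; no __init__.py is written.
    for root, filename in ((dir1, "a.py"), (dir2, "b.py")):
        pkg_dir = os.path.join(root, "acme_demo", "plugins")
        os.makedirs(pkg_dir)
        with open(os.path.join(pkg_dir, filename), "w", encoding="utf-8") as f:
            f.write(f"NAME = {filename[0]!r}\n")

    sys.path[:0] = [dir1, dir2]
    importlib.invalidate_caches()
    try:
        mod_a = importlib.import_module("acme_demo.plugins.a")
        mod_b = importlib.import_module("acme_demo.plugins.b")
        locations = list(sys.modules["acme_demo.plugins"].__path__)
    finally:
        sys.path.remove(dir1)
        sys.path.remove(dir2)
        for name in [n for n in sys.modules if n.split(".")[0] == "acme_demo"]:
            del sys.modules[name]

print(len(locations))  # 2: one entry per contributing directory
```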

## 57.14 Relative Imports

Relative imports depend on package context.

Example:

```python
from . import util
from .models import User
from ..core import config
```

The import system uses `__package__` and `__spec__.parent` to resolve these names.

A relative import cannot be resolved from just the text `.models`. The import system needs to know the current package.

Conceptually:

```text
current package: app.views
relative import: .models
resolved name: app.views.models
```

This is why running package files directly can break relative imports. The module may lack correct package context.
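
The resolution step is exposed as `importlib.util.resolve_name()`, which makes the dependence on package context easy to see. The package names below are the hypothetical ones from the example above.

```python
import importlib.util

print(importlib.util.resolve_name(".models", "app.views"))  # app.views.models
print(importlib.util.resolve_name("..core", "app.views"))   # app.core

# Too many leading dots escape the top-level package and are rejected.
try:
    importlib.util.resolve_name("...core", "app.views")
    beyond_top_level = False
except (ImportError, ValueError):  # ValueError on older Python versions
    beyond_top_level = True

print(beyond_top_level)
```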

## 57.15 `importlib.import_module()`

`importlib.import_module()` is the public dynamic import API.

```python
import importlib

name = "json.decoder"
mod = importlib.import_module(name)

print(mod)
```

It handles dotted names and package-relative imports.

Example:

```python
importlib.import_module(".decoder", package="json")
```

This resolves relative to `json`.

Unlike `__import__()`, `import_module()` returns the requested module rather than the top-level package.

```python
mod = importlib.import_module("json.decoder")
print(mod.__name__)
```

Output:

```text
json.decoder
```
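
For comparison, this short sketch calls `__import__()` with the same dotted name. It mirrors what the `import json.decoder` statement needs, which is a binding for the top-level name `json`.

```python
import importlib

top = __import__("json.decoder")
leaf = importlib.import_module("json.decoder")
with_fromlist = __import__("json.decoder", fromlist=["JSONDecoder"])

print(top.__name__)            # json
print(leaf.__name__)           # json.decoder
print(with_fromlist.__name__)  # json.decoder
```

A non-empty `fromlist` makes `__import__()` return the leaf module, which is how `from json.decoder import JSONDecoder` obtains it.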

## 57.16 Reloading Modules

`importlib.reload(module)` re-executes a module.

```python
import importlib
import config

importlib.reload(config)
```

Reloading keeps the same module object but re-executes its code.

Conceptually:

```text
existing module object
    ↓
reuse module.__dict__
    ↓
execute module code again
```

This can produce surprising behavior.

Old names may remain if the new source no longer defines them. Existing references held elsewhere still point to old objects.

Example:

```python
import importlib
import config
from config import SETTINGS

importlib.reload(config)
```

The name `SETTINGS` in the importing module is not automatically rebound.

Reload is useful for development tools and REPL workflows, but it is rarely clean enough for production hot reload without careful design.
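
Both pitfalls can be reproduced with a temporary module. `reload_demo_cfg` is a hypothetical name, and the two source versions deliberately differ in size so that a cached `.pyc` from the first import is not reused.

```python
import importlib
import os
import sys
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "reload_demo_cfg.py")
    with open(path, "w", encoding="utf-8") as f:
        f.write("SETTINGS = {'debug': False}\nOLD_NAME = 1\n")
    sys.path.insert(0, tmp)
    importlib.invalidate_caches()
    try:
        cfg = importlib.import_module("reload_demo_cfg")
        settings_ref = cfg.SETTINGS  # plays the role of `from cfg import SETTINGS`

        # New version: different SETTINGS, and OLD_NAME is no longer defined.
        with open(path, "w", encoding="utf-8") as f:
            f.write("SETTINGS = {'debug': True, 'verbose': True}\n")

        reloaded = importlib.reload(cfg)
        same_object = reloaded is cfg
        stale_name_kept = hasattr(cfg, "OLD_NAME")
        old_ref_debug = settings_ref["debug"]
        new_debug = cfg.SETTINGS["debug"]
    finally:
        sys.path.remove(tmp)
        sys.modules.pop("reload_demo_cfg", None)

print(same_object, stale_name_kept, old_ref_debug, new_debug)
```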

## 57.17 Custom Importers

The import system is extensible.

A custom finder and loader can load modules from unusual sources.

Minimal sketch:

```python
import importlib.abc
import importlib.util
import sys
import types

class MemoryLoader(importlib.abc.Loader):
    def create_module(self, spec):
        return None

    def exec_module(self, module):
        module.answer = 42

class MemoryFinder(importlib.abc.MetaPathFinder):
    def find_spec(self, fullname, path, target=None):
        if fullname == "virtual_config":
            return importlib.util.spec_from_loader(fullname, MemoryLoader())
        return None

sys.meta_path.insert(0, MemoryFinder())

import virtual_config
print(virtual_config.answer)
```

Output:

```text
42
```

This shows that import does not require a file.

A module can come from memory, a database, a network service, a generated source string, or an embedded resource, provided a finder and loader implement the protocol.
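
As one illustration of the generated-source case, the variation below compiles module bodies stored in a dictionary. `SOURCES`, `DictLoader`, `DictFinder`, and `virtual_greeting` are hypothetical names, and the finder is removed again after use.

```python
import importlib
import importlib.abc
import importlib.util
import sys

# Hypothetical in-memory "filesystem": module name -> source text.
SOURCES = {
    "virtual_greeting": "def greet(name):\n    return f'hello, {name}'\n",
}


class DictLoader(importlib.abc.Loader):
    def __init__(self, source):
        self.source = source

    def create_module(self, spec):
        return None  # use the default module object

    def exec_module(self, module):
        code = compile(self.source, f"<memory:{module.__name__}>", "exec")
        exec(code, module.__dict__)


class DictFinder(importlib.abc.MetaPathFinder):
    def find_spec(self, fullname, path, target=None):
        source = SOURCES.get(fullname)
        if source is None:
            return None  # let the remaining finders try
        return importlib.util.spec_from_loader(fullname, DictLoader(source))


finder = DictFinder()
sys.meta_path.insert(0, finder)
try:
    mod = importlib.import_module("virtual_greeting")
    greeting = mod.greet("world")
finally:
    sys.meta_path.remove(finder)
    sys.modules.pop("virtual_greeting", None)

print(greeting)
```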

## 57.18 Import Locks

Imports are synchronized.

CPython uses import locks to prevent multiple threads from initializing the same module at the same time.

Without locking, two threads could both observe a missing module, both create module objects, and both execute module code.

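The practical guarantee is that a module’s body is executed by one thread at a time within an interpreter, and other threads importing the same module normally wait until it has finished.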

This matters for multi-threaded programs that import modules lazily.

Import-time side effects should still be minimized, because import execution can block other imports and create ordering hazards.
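
The effect of the lock can be observed with a deliberately slow loader. This is a sketch with hypothetical names (`SlowLoader`, `SlowFinder`, `slow_lock_demo`); it counts how often the module body runs when several threads import the module at once.

```python
import importlib
import importlib.abc
import importlib.util
import sys
import threading
import time


class SlowLoader(importlib.abc.Loader):
    runs = 0  # number of times a module body was executed

    def create_module(self, spec):
        return None

    def exec_module(self, module):
        SlowLoader.runs += 1
        time.sleep(0.05)  # widen the window for competing imports
        module.ready = True


class SlowFinder(importlib.abc.MetaPathFinder):
    def find_spec(self, fullname, path, target=None):
        if fullname == "slow_lock_demo":
            return importlib.util.spec_from_loader(fullname, SlowLoader())
        return None


results = []


def worker():
    results.append(importlib.import_module("slow_lock_demo"))


finder = SlowFinder()
sys.meta_path.insert(0, finder)
threads = [threading.Thread(target=worker) for _ in range(4)]
try:
    for t in threads:
        t.start()
    for t in threads:
        t.join()
finally:
    sys.meta_path.remove(finder)
    sys.modules.pop("slow_lock_demo", None)

print(SlowLoader.runs, len(results))
```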

## 57.19 Import-Time Execution

Importing a module executes its top-level code.

Example:

```python
# app.py
print("loading app")

value = 42
```

Then:

```python
import app
```

prints:

```text
loading app
```

This is why module top-level code should usually define names rather than perform expensive or irreversible actions.

Prefer:

```python
def main():
    ...

if __name__ == "__main__":
    main()
```

The import system is an execution system, not just a name lookup system.

## 57.20 Built-in and Frozen Modules

Not all modules come from files.

Built-in modules are compiled into CPython.

Frozen modules are stored as frozen code data inside the interpreter.

Examples:

```python
import sys
import importlib.machinery

print(sys.builtin_module_names)
print(importlib.machinery.BuiltinImporter)
print(importlib.machinery.FrozenImporter)
```

Built-in and frozen importers are usually present on `sys.meta_path`.

They allow the interpreter to import critical modules before filesystem import is fully available.

This is important during startup.
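
Both importers can be queried directly. The sketch assumes a standard CPython build, where `sys` is built in and the import bootstrap module `_frozen_importlib` is frozen.

```python
import importlib.machinery
import sys

builtin_spec = importlib.machinery.BuiltinImporter.find_spec("sys")
frozen_spec = importlib.machinery.FrozenImporter.find_spec("_frozen_importlib")

print("sys" in sys.builtin_module_names)
print(builtin_spec.origin)
print(frozen_spec.origin if frozen_spec is not None else None)
```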

## 57.21 Extension Modules

Native extension modules are shared libraries loaded by CPython.

Examples:

```text
_module.cpython-313-x86_64-linux-gnu.so
_module.pyd
```

The import system finds the shared library, loads it, and calls its initialization function.

Conceptually:

```text
find shared library
    ↓
dynamic loader opens binary
    ↓
CPython calls PyInit_modulename
    ↓
module object returned or initialized
```

Extension module loading connects `importlib` to the C API and platform dynamic linking.

## 57.22 `importlib.util`

`importlib.util` provides helper functions.

Common APIs include:

| API | Purpose |
|---|---|
| `find_spec()` | Find a module spec |
| `module_from_spec()` | Create module from spec |
| `spec_from_file_location()` | Build spec for a file |
| `cache_from_source()` | Compute `.pyc` path |
| `source_from_cache()` | Recover source path from cache path |

Manual file import example:

```python
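import importlib.util
import sys

path = "/tmp/plugin.py"   # example path; the file must exist
name = "plugin"           # name the module is registered under

# Describe the module: build a spec bound to a source file loader.
spec = importlib.util.spec_from_file_location(name, path)
# Create an empty module object with import attributes such as __spec__ set.
module = importlib.util.module_from_spec(spec)

# Register before executing, as the regular import system does.
sys.modules[name] = module
# Run the module body in module.__dict__.
spec.loader.exec_module(module)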
```

This is the lower-level form of import. It gives the caller control over the module name, path, cache behavior, and registration.

## 57.23 `importlib.machinery`

`importlib.machinery` exposes concrete importer machinery.

Important objects include:

| Object | Role |
|---|---|
| `PathFinder` | Main path-based finder |
| `FileFinder` | Finder for filesystem directories |
| `SourceFileLoader` | Loads `.py` files |
| `SourcelessFileLoader` | Loads `.pyc` files |
| `ExtensionFileLoader` | Loads native extension modules |
| `BuiltinImporter` | Loads built-in modules |
| `FrozenImporter` | Loads frozen modules |
| `ModuleSpec` | Import specification object |

This module is useful when building custom import behavior that still wants to reuse CPython’s standard components.
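
As a sketch of that kind of reuse, the code below combines the stock `FileFinder` and `SourceFileLoader` to load Python source from files with a non-standard suffix. The `.plug` suffix and the module name `tool` are assumptions made for illustration.

```python
import importlib.machinery
import importlib.util
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    with open(os.path.join(tmp, "tool.plug"), "w", encoding="utf-8") as f:
        f.write("value = 7\n")

    # Tell a standard FileFinder to treat *.plug files as Python source.
    loader_details = (importlib.machinery.SourceFileLoader, [".plug"])
    finder = importlib.machinery.FileFinder(tmp, loader_details)

    spec = finder.find_spec("tool")
    module = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(module)

print(module.value)
```

The module is not registered in `sys.modules` here; a real integration would more likely install the finder through `sys.path_hooks`.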

## 57.24 `importlib.abc`

`importlib.abc` defines abstract base classes for import protocols.

Important classes include:

| ABC | Meaning |
|---|---|
| `MetaPathFinder` | Finder on `sys.meta_path` |
| `PathEntryFinder` | Finder for one path entry |
| `Loader` | Base loader protocol |
| `ResourceReader` | Legacy resource reading protocol |
| `InspectLoader` | Loader that can report a module’s code and source |
| `ExecutionLoader` | `InspectLoader` that also provides `get_filename()` |
| `SourceLoader` | Loader for source code |

These ABCs document expected methods and support structured custom importers.

## 57.25 Resources

`importlib.resources` provides access to package data.

Example:

```python
from importlib import resources

text = resources.files("my_package").joinpath("data.txt").read_text()
```

This works for packages stored in places other than normal directories, such as zip files, as long as the loader supports the resource interface.

This avoids fragile code like:

```python
import os
open(os.path.join(os.path.dirname(__file__), "data.txt"))
```

Package resources should be accessed through import-aware APIs when possible.
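
A self-contained version of the `resources.files()` call is sketched below. It builds a throwaway package named `res_demo_pkg` (a hypothetical name) in a temporary directory and assumes Python 3.9 or later, where `files()` is available.

```python
import importlib
import os
import sys
import tempfile
from importlib import resources

with tempfile.TemporaryDirectory() as tmp:
    pkg_dir = os.path.join(tmp, "res_demo_pkg")
    os.makedirs(pkg_dir)
    with open(os.path.join(pkg_dir, "__init__.py"), "w", encoding="utf-8"):
        pass  # empty package initializer
    with open(os.path.join(pkg_dir, "data.txt"), "w", encoding="utf-8") as f:
        f.write("hello resources\n")

    sys.path.insert(0, tmp)
    importlib.invalidate_caches()
    try:
        data_file = resources.files("res_demo_pkg").joinpath("data.txt")
        text = data_file.read_text(encoding="utf-8")
    finally:
        sys.path.remove(tmp)
        sys.modules.pop("res_demo_pkg", None)

print(text, end="")
```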

## 57.26 Invalidating Caches

Finders may cache directory listings or module discovery information.

`importlib.invalidate_caches()` asks import finders to clear those caches.

```python
import importlib

importlib.invalidate_caches()
```

This is useful when a program creates new modules on disk at runtime and then wants to import them.

Example:

```text
write plugin.py
    ↓
invalidate import caches
    ↓
import plugin
```

Without invalidation, a path finder may not notice the new file immediately.
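
The sequence can be written out as follows. `late_plugin_demo` is a hypothetical module name; the initial probe exists only so that the directory finder has a cached listing before the file is created.

```python
import importlib
import importlib.util
import os
import sys
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    sys.path.insert(0, tmp)
    try:
        # Probe first: the finder for tmp now holds a cached directory listing.
        before = importlib.util.find_spec("late_plugin_demo")

        with open(os.path.join(tmp, "late_plugin_demo.py"), "w", encoding="utf-8") as f:
            f.write("VALUE = 42\n")

        importlib.invalidate_caches()  # make finders rescan their directories
        plugin = importlib.import_module("late_plugin_demo")
        value = plugin.VALUE
    finally:
        sys.path.remove(tmp)
        sys.modules.pop("late_plugin_demo", None)

print(before, value)
```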

## 57.27 Common Import Failure Modes

Import failures often come from path, naming, or initialization problems.

| Symptom | Common cause |
|---|---|
| `ModuleNotFoundError` | No finder returned a spec for the name |
| `ImportError` | Module was found but loading failed, or `from module import name` could not find `name` |
| Partially initialized module | Circular import |
| Relative import error | Missing package context |
| Wrong module imported | Unexpected earlier `sys.path` entry |
| Reload does not update references | Existing names still point to old objects |
| Package data missing | Files accessed outside import resource APIs |

A useful diagnostic sequence:

```python
import importlib.util
import sys

print(sys.path)
print(importlib.util.find_spec("some_module"))
```

If `find_spec()` returns `None`, discovery failed. If it returns a spec but import fails, loading or execution failed.
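
That distinction can be wrapped in a small helper. `diagnose` is a hypothetical function sketched for illustration; note that it really imports the module, so import-time side effects will run.

```python
import importlib
import importlib.util


def diagnose(name):
    """Report whether discovery or loading is the failing stage for `name`."""
    try:
        spec = importlib.util.find_spec(name)
    except (ImportError, ValueError) as exc:
        # For dotted names, find_spec() imports the parent package first.
        return f"{name}: discovery error ({exc!r})"
    if spec is None:
        return f"{name}: not found by any finder"
    try:
        importlib.import_module(name)
    except Exception as exc:
        return f"{name}: found at {spec.origin}, but loading failed ({exc!r})"
    return f"{name}: found at {spec.origin} and imported"


print(diagnose("json"))
print(diagnose("no_such_module_for_demo"))
```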

## 57.28 Relationship to `sys`

The import system depends heavily on `sys`.

Important `sys` objects include:

| Object | Import role |
|---|---|
| `sys.modules` | Module cache |
| `sys.path` | Top-level search path |
| `sys.meta_path` | Meta path finders |
| `sys.path_hooks` | Path entry importer factories |
| `sys.path_importer_cache` | Cached path entry finders |
| `sys.builtin_module_names` | Built-in module names |

`importlib` provides machinery. `sys` stores global interpreter import state.

## 57.29 Relationship to Code Objects

For source imports, `importlib` ultimately creates and executes code objects.

```text
read source
    ↓
compile(source, filename, "exec")
    ↓
code object
    ↓
exec(code, module.__dict__)
```

This connects import machinery to the compiler and interpreter.

A source module import is equivalent in broad shape to:

```python
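import types

module = types.ModuleType(name)           # empty module object
code = compile(source, filename, "exec")  # source text to code object
exec(code, module.__dict__)               # run the body in the module namespace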
```

The real implementation includes specs, loaders, caches, locks, errors, packages, and edge cases.
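
A runnable variant of that shape is sketched below, with a hypothetical module name and an inline source string standing in for a file. It skips everything the real machinery adds, which is why the result has no spec and never reaches `sys.modules`.

```python
import sys
import types

name = "inline_demo"        # hypothetical module name
filename = "<inline_demo>"  # pseudo-filename shown in tracebacks
source = "greeting = 'hi'\ndef shout():\n    return greeting.upper()\n"

module = types.ModuleType(name)
code = compile(source, filename, "exec")
exec(code, module.__dict__)

print(module.shout())        # HI
print(module.__spec__)       # None: no finder or loader was involved
print(name in sys.modules)   # False: nothing registered the module
```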

## 57.30 Why `importlib` Matters for CPython Internals

`importlib` matters because imports are central to Python execution.

Almost every Python program starts by importing modules. CPython itself depends on imports during startup. Tooling, packaging, virtual environments, plugins, test runners, and application frameworks all rely on import behavior.

Understanding `importlib` explains:

```text
why sys.modules exists
why circular imports happen
how packages find submodules
how .py files become module objects
how bytecode caches are used
how extension modules are initialized
how custom import systems work
why import-time side effects matter
```

It also shows CPython’s design preference for exposing internal protocols as Python-level machinery. Imports are not hard-coded as one filesystem algorithm. They are a layered protocol built from finders, loaders, specs, caches, and module objects.

## 57.31 Chapter Summary

The `importlib` module is the Python-level interface to CPython’s import system. It exposes dynamic imports, module specifications, finders, loaders, path-based search, bytecode caches, package resources, reload behavior, and custom importer protocols.

For CPython internals, `importlib` is important because it connects source files, module objects, code objects, `sys.modules`, `sys.path`, extension modules, built-in modules, packages, and interpreter startup into one coherent subsystem.
