# 40. Modules and Imports

A module is Python’s basic unit of code loading, namespace isolation, and reuse. In CPython, a module is both a language-level object and a runtime record in the import system.

At the Python level, a module is what you get after executing:

```python
import math
import os
import json
```

Each imported name is bound to a module object, package object, function, class, or other exported object. At the CPython level, import is a coordinated process involving bytecode instructions, import hooks, module specifications, loaders, finders, `sys.modules`, package paths, file-system lookup, bytecode caches, import locks, and module execution.

The import system is not a simple file include mechanism. It is a runtime protocol.

## 40.1 What a Module Is

A module is an object of type `module`.

```python
import sys

print(type(sys))
print(sys.__name__)
```

Output:

```text
<class 'module'>
sys
```

A module object owns a dictionary. That dictionary is the module’s global namespace.

```python
import math

print(math.__dict__["pi"])
print(math.__dict__["sqrt"])
```

When CPython executes a module file, top-level assignments write into this dictionary.

For a file named `config.py`:

```python
debug = True
port = 8080

def connect():
    return port
```

CPython creates a module object, prepares its namespace, executes the compiled code object inside that namespace, and leaves the resulting bindings in `config.__dict__`.

Conceptually:

```text
module object
    __dict__
        "__name__"      -> "config"
        "__file__"      -> ".../config.py"
        "__spec__"      -> ModuleSpec(...)
        "debug"         -> True
        "port"          -> 8080
        "connect"       -> function object
```

A module is therefore a mutable namespace object.

## 40.2 Module Objects in CPython

In CPython, module objects are implemented by the `PyModuleObject` type.

A simplified mental model is:

```c
typedef struct {
    PyObject_HEAD
    PyObject *md_dict;      /* the module namespace */
    PyModuleDef *md_def;    /* C-level definition, NULL for pure Python modules */
    void *md_state;         /* per-module C state, if any */
    PyObject *md_weaklist;
    PyObject *md_name;
} PyModuleObject;
```

The exact fields can change, but the important point is stable: a module has an associated dictionary and optional C-level module definition/state.

The dictionary stores ordinary Python names. When Python code evaluates a global name inside a module, CPython usually looks in that module dictionary first.

For example:

```python
x = 10

def f():
    return x
```

The function `f` does not copy `x`. It stores a reference to the module globals dictionary through its function object. When `f` executes `return x`, CPython resolves `x` using global lookup against that dictionary.
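A small sketch of this lookup behavior, assuming a hypothetical in-memory module named `demo` built with `types.ModuleType` rather than loaded from a file:

```python
import types

# Hypothetical module created in memory; its body mirrors the example above.
demo = types.ModuleType("demo")
exec("x = 10\ndef f():\n    return x\n", demo.__dict__)

# The function refers to the module dictionary itself, not to a copy of x.
shares_namespace = demo.f.__globals__ is demo.__dict__

before = demo.f()
demo.x = 20            # rebind the global after f was defined
after = demo.f()       # the lookup happens at call time, so f sees 20

print(shares_namespace, before, after)
```

Because the name is resolved on each call, rebinding `demo.x` changes what `f` returns without redefining `f`.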

## 40.3 Import Is Execution

Importing a Python module executes its top-level code.

For `example.py`:

```python
print("loading example")

value = 42

def get_value():
    return value
```

The first import executes the file:

```python
import example
```

Output:

```text
loading example
```

A second import usually does not execute the file again:

```python
import example
```

No output appears because CPython finds the existing module in `sys.modules`.

This behavior is fundamental. Import has side effects because module top-level code runs.
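The run-once behavior can be observed without leaving files behind. The following is a sketch that writes a hypothetical module, `once_demo`, into a temporary directory and captures what it prints:

```python
import contextlib
import importlib
import io
import sys
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    # Hypothetical module name, chosen to avoid clashing with real modules.
    Path(tmp, "once_demo.py").write_text('print("loading once_demo")\nvalue = 42\n')
    sys.path.insert(0, tmp)
    captured = io.StringIO()
    try:
        with contextlib.redirect_stdout(captured):
            first = importlib.import_module("once_demo")   # executes the file
            second = importlib.import_module("once_demo")  # served from sys.modules
    finally:
        sys.path.remove(tmp)
        sys.modules.pop("once_demo", None)

output = captured.getvalue()
print(repr(output), first is second)
```

The message appears once even though the module is requested twice, and both requests return the same module object.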

Good module top-level code usually contains definitions and cheap initialization:

```python
CONSTANT = 100

def parse(text):
    ...
```

Risky module top-level code performs expensive or externally visible work:

```python
connect_to_database()
delete_old_files()
start_threads()
make_network_request()
```

Such code runs during import, sometimes before the application is fully initialized.

## 40.4 The Import Statement

The statement:

```python
import package.module
```

does not directly mean “open this file.”

It means:

```text
resolve a module name
find a module specification
create or reuse a module object
initialize import-related attributes
execute the module if needed
bind a name in the caller's namespace
```

The statement:

```python
import os.path
```

binds `os`, not `os.path`, in the current namespace:

```python
import os.path

print(os)
print(os.path)
```

The statement:

```python
from os import path
```

binds `path` directly:

```python
from os import path

print(path)
```

The statement:

```python
from math import sqrt
```

imports the module if needed, then retrieves `sqrt` from that module and binds it in the caller’s namespace.

## 40.5 Bytecode for Import

CPython compiles import statements into bytecode.

Example:

```python
import math
```

The compiled code uses import-related bytecode instructions. The exact instruction sequence varies by Python version, but conceptually it does this:

```text
load import machinery
import module named "math"
bind result to name "math"
```

For:

```python
from math import sqrt
```

the bytecode conceptually does this:

```text
import module named "math"
load attribute "sqrt"
bind local/global name "sqrt"
```

You can inspect this with `dis`:

```python
import dis

def f():
    import math
    return math.sqrt(9)

dis.dis(f)
```

The import statement is part of normal bytecode execution. There is no separate preprocessor step.
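To see only the import-related instructions, one option is to filter the disassembly by opcode name. This is a sketch; the instructions surrounding the imports differ between CPython versions, so only the import opcodes are collected here:

```python
import dis

def f():
    import math
    from math import sqrt
    return sqrt(9) + math.pi

# Keep only opcodes whose name mentions IMPORT.
import_ops = [ins.opname for ins in dis.get_instructions(f) if "IMPORT" in ins.opname]
print(import_ops)
```

`IMPORT_NAME` performs the module import, and `IMPORT_FROM` retrieves an attribute from the imported module for the `from ... import ...` form.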

## 40.6 `__import__`

At the language level, import statements eventually route through import machinery exposed through `builtins.__import__`.

```python
import builtins

print(builtins.__import__)
```

You can call it directly:

```python
math_module = __import__("math")
print(math_module.sqrt(9))
```

But most code should not call `__import__` directly. Use `importlib.import_module` for dynamic imports:

```python
import importlib

mod = importlib.import_module("math")
print(mod.sqrt(9))
```

The function `__import__` exists because import is dynamic. Python code can import modules by string name at runtime.
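One practical difference between the two functions shows up with dotted names. The sketch below uses `os.path` only because it is always available:

```python
import importlib
import os

top = __import__("os.path")                 # returns the top-level module, os
sub = importlib.import_module("os.path")    # returns the module named by the full string
via_fromlist = __import__("os.path", fromlist=["join"])  # non-empty fromlist returns the submodule

print(top is os, sub is os.path, via_fromlist is os.path)
```

`__import__` mirrors what the `import` statement needs (the top-level name to bind), which is one reason `importlib.import_module` is the more convenient choice for dynamic imports.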

## 40.7 `sys.modules`

`sys.modules` is the central module cache.

It is a dictionary mapping fully qualified module names to module objects.

```python
import sys
import math

print(sys.modules["math"] is math)
```

Output:

```text
True
```

Before loading a module, the import system checks `sys.modules`.

Conceptually:

```python
if fullname in sys.modules:
    return sys.modules[fullname]
else:
    module = load_module(fullname)
    sys.modules[fullname] = module
    return module
```

The real process is more careful because it must handle packages, circular imports, failed imports, locks, and loader protocols.

The key property remains: imports are cached by module name.
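Because the cache is consulted first, an entry placed in `sys.modules` by hand satisfies a later import with no file search at all. A minimal sketch, assuming a hypothetical module name `cache_demo_module` for which no file exists:

```python
import importlib
import sys
import types

# Hypothetical module created in memory and registered under its name.
fake = types.ModuleType("cache_demo_module")
fake.answer = 42
sys.modules["cache_demo_module"] = fake

try:
    imported = importlib.import_module("cache_demo_module")  # found in the cache
    found_in_cache = imported is fake
finally:
    del sys.modules["cache_demo_module"]

print(found_in_cache, imported.answer)
```

Test suites sometimes use this technique to substitute a stand-in module, which also shows why stale entries in `sys.modules` can be confusing.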

## 40.8 Why Modules Are Inserted Before Execution

CPython usually inserts a module into `sys.modules` before executing its code.

This is necessary for circular imports.

Suppose `a.py` contains:

```python
import b

x = 1
```

and `b.py` contains:

```python
import a

y = 2
```

When `a` imports `b`, and `b` imports `a`, the import system must avoid infinite recursion. It does this by placing the partially initialized module object into `sys.modules`.

The tradeoff is that circular imports can observe incomplete modules.

Example:

```python
# a.py
import b

x = 1
```

```python
# b.py
import a

print(a.x)
```

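Whether this fails depends on which module is imported first. If the program imports `a` first, `b` runs while `a` is still partially initialized, and reading `a.x` raises `AttributeError` because `x` has not been assigned yet. If the program imports `b` first, `a` finishes executing before `b` reads `a.x`, and the code works.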

Circular imports are not forbidden, but they require care. The usual fix is to move imports inside functions, move shared definitions into a third module, or avoid top-level cross-dependencies.
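The failing order can be reproduced in isolation. This sketch uses hypothetical module names `circ_a` and `circ_b`, standing in for `a.py` and `b.py`, written to a temporary directory:

```python
import importlib
import sys
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    Path(tmp, "circ_a.py").write_text("import circ_b\nx = 1\n")
    Path(tmp, "circ_b.py").write_text("import circ_a\ny = circ_a.x\n")
    sys.path.insert(0, tmp)
    try:
        importlib.import_module("circ_a")
        error = None
    except AttributeError as exc:
        # circ_b ran while circ_a was still partially initialized.
        error = exc
    finally:
        sys.path.remove(tmp)
        sys.modules.pop("circ_a", None)
        sys.modules.pop("circ_b", None)

print(type(error).__name__, error)
```

Recent CPython versions mention a likely circular import in the error message, which helps identify this situation.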

## 40.9 Module Initialization Sequence

A simplified import sequence for a Python source module looks like this:

```text
1. Receive module name, such as "pkg.mod".
2. Check sys.modules.
3. Search sys.meta_path for a finder.
4. Finder returns a ModuleSpec.
5. Import machinery creates a module object.
6. Module is inserted into sys.modules.
7. Loader executes module code.
8. Import machinery returns the module object.
9. Import statement binds names in caller namespace.
```

For a source file, execution means:

```text
read source
decode source
compile source to code object
execute code object in module namespace
```

For an extension module, execution means calling native initialization code.

For a built-in module, execution uses built-in initialization logic compiled into CPython.

## 40.10 `ModuleSpec`

Modern Python import uses `ModuleSpec` objects to describe how a module should be loaded.

A module spec contains information such as:

```text
module name
loader
origin
package search locations
cached bytecode path
whether the module is a package
```

You can inspect a module’s spec:

```python
import json

print(json.__spec__)
print(json.__spec__.name)
print(json.__spec__.origin)
print(json.__spec__.loader)
print(json.__spec__.submodule_search_locations)
```

For a normal module, `submodule_search_locations` is usually `None`.

For a package, it contains paths where submodules may be found.
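`importlib.util.find_spec` returns a spec without requiring you to import the target yourself (for a dotted name it does import the parent package). A short sketch comparing a package with one of its submodules:

```python
import importlib.util

pkg_spec = importlib.util.find_spec("json")          # a package
mod_spec = importlib.util.find_spec("json.decoder")  # a plain module inside it

print(pkg_spec.name, pkg_spec.parent, pkg_spec.submodule_search_locations is not None)
print(mod_spec.name, mod_spec.parent, mod_spec.submodule_search_locations)
```

The `parent` attribute is the package that relative imports inside the module resolve against.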

## 40.11 Finders and Loaders

The import system separates finding from loading.

A finder answers:

```text
Can this module name be found?
If yes, what spec describes it?
```

A loader answers:

```text
How should this module be created and executed?
```

This separation allows Python to import from many places:

```text
source files
bytecode files
built-in modules
extension modules
zip archives
namespace packages
custom import hooks
memory-backed module stores
remote systems, if a custom importer implements it
```

The standard import system is extensible because it is protocol-based.

## 40.12 `sys.meta_path`

`sys.meta_path` is the first major hook point in the import system.

It is a list of finder objects. Each finder can decide whether it knows how to handle a module name.

```python
import sys

for finder in sys.meta_path:
    print(finder)
```

A simplified import search does this:

```python
for finder in sys.meta_path:
    spec = finder.find_spec(fullname, path, target)
    if spec is not None:
        return spec
```

Typical entries handle:

```text
built-in modules
frozen modules
path-based modules
```

The path-based finder is responsible for searching directories and other path entries.
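To see which finder claims a given name, the search loop can be run by hand. `first_finder_for` is a hypothetical helper written for this illustration, not part of the standard library:

```python
import sys

def first_finder_for(fullname):
    """Return the first sys.meta_path finder that produces a spec, or None."""
    for finder in sys.meta_path:
        find_spec = getattr(finder, "find_spec", None)
        if find_spec is None:
            continue  # tolerate unusual finders that lack find_spec
        if find_spec(fullname, None) is not None:
            return finder
    return None

builtin_finder = first_finder_for("sys")
path_finder = first_finder_for("json")
nobody = first_finder_for("no_such_module_for_this_demo")

print(builtin_finder, path_finder, nobody)
```

On a typical installation, `sys` is claimed by the built-in importer and `json` by the path-based finder, though tools that install their own finders can change the result.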

## 40.13 `sys.path`

`sys.path` is the list of import search locations for top-level modules.

```python
import sys

for entry in sys.path:
    print(entry)
```

When you write:

```python
import mymodule
```

and `mymodule` is not built in or frozen, the path-based import system searches entries in `sys.path`.

Entries are usually:

```text
directory of the running script
current working directory in interactive mode
standard library directories
site-packages directories
paths from PYTHONPATH
virtual environment paths
zip archives
```

This is why changing `sys.path` changes import behavior.

```python
import sys

sys.path.insert(0, "/custom/modules")

import mymodule
```

This can be useful in controlled tools, but it can also create fragile import behavior.
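One way to limit the fragility is to scope the change. The context manager below is a sketch; `prepended_sys_path` is a hypothetical helper name, and `/custom/modules` is the placeholder path from the example above:

```python
import contextlib
import sys

@contextlib.contextmanager
def prepended_sys_path(entry):
    """Temporarily put `entry` at the front of sys.path."""
    sys.path.insert(0, entry)
    try:
        yield
    finally:
        # Remove the entry we added; ignore the case where it is already gone.
        try:
            sys.path.remove(entry)
        except ValueError:
            pass

before = list(sys.path)
with prepended_sys_path("/custom/modules"):
    inside_first = sys.path[0]
after = list(sys.path)

print(inside_first, before == after)
```

Modules imported inside the `with` block stay in `sys.modules` afterwards, so this limits the search-path change but not the set of loaded modules.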

## 40.14 Packages

A package is a module that can contain submodules.

Historically, a directory became a package by containing an `__init__.py` file:

```text
pkg/
    __init__.py
    parser.py
    lexer.py
```

Then:

```python
import pkg.parser
```

loads `pkg` first, then `pkg.parser`.

The file `pkg/__init__.py` executes when the package is imported.

For example:

```python
# pkg/__init__.py
print("loading package")
version = "1.0"
```

```python
import pkg
print(pkg.version)
```

A package is still a module object. The difference is that it has package search locations.

## 40.15 Package Attributes

Packages usually define import-related attributes:

```text
__name__
__package__
__path__
__spec__
__file__
__cached__
```

The important package-specific attribute is `__path__`.

```python
import package

print(package.__path__)
```

`__path__` tells the import system where to search for submodules inside that package.

For:

```python
import package.submodule
```

the import system searches `package.__path__`, not the top-level `sys.path`.
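The relationship between a package and its `__path__` can be checked with a throwaway package. This sketch assumes a hypothetical package name, `pathdemo_pkg`, created in a temporary directory:

```python
import importlib
import sys
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    pkg_dir = Path(tmp, "pathdemo_pkg")
    pkg_dir.mkdir()
    (pkg_dir / "__init__.py").write_text("version = '1.0'\n")
    (pkg_dir / "parser.py").write_text("name = 'parser'\n")
    sys.path.insert(0, tmp)
    try:
        sub = importlib.import_module("pathdemo_pkg.parser")
        pkg = sys.modules["pathdemo_pkg"]        # the parent was imported first
        search_path = list(pkg.__path__)
        path_matches = Path(search_path[0]) == pkg_dir
        is_attribute = pkg.parser is sub          # the submodule is bound on the package
    finally:
        sys.path.remove(tmp)
        for name in ("pathdemo_pkg.parser", "pathdemo_pkg"):
            sys.modules.pop(name, None)

print(len(search_path), path_matches, is_attribute)
```

Importing the submodule imports the parent package first and then binds the submodule as an attribute of the package object.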

## 40.16 Namespace Packages

Python supports namespace packages. These are packages that have no `__init__.py` file.

A namespace package can be spread across multiple directories.

Example:

```text
dir1/
    plugins/
        alpha.py

dir2/
    plugins/
        beta.py
```

If both `dir1` and `dir2` are on `sys.path`, Python can treat `plugins` as a namespace package.

Then both may work:

```python
import plugins.alpha
import plugins.beta
```

Namespace packages are useful for plugin systems and separately distributed package fragments.

They also make import resolution more complex because a package may have multiple search locations.
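The layout above can be reproduced with two temporary directories. This is a sketch; `nsdemo_plugins` is a hypothetical package name used in place of `plugins` to avoid clashing with anything installed:

```python
import importlib
import sys
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as dir1, tempfile.TemporaryDirectory() as dir2:
    # Two portions of the same namespace package, neither with an __init__.py.
    for root, mod in ((dir1, "alpha"), (dir2, "beta")):
        portion = Path(root, "nsdemo_plugins")
        portion.mkdir()
        (portion / f"{mod}.py").write_text(f"name = {mod!r}\n")
    sys.path[:0] = [dir1, dir2]
    try:
        alpha = importlib.import_module("nsdemo_plugins.alpha")
        beta = importlib.import_module("nsdemo_plugins.beta")
        portions = list(sys.modules["nsdemo_plugins"].__path__)
    finally:
        sys.path.remove(dir1)
        sys.path.remove(dir2)
        for name in [n for n in sys.modules if n.split(".")[0] == "nsdemo_plugins"]:
            del sys.modules[name]

print(alpha.name, beta.name, len(portions))
```

The package's `__path__` lists one directory per portion, which is how submodules from both roots are found.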

## 40.17 Absolute Imports

An absolute import starts from the top-level import namespace.

```python
import package.module
from package import module
```

Inside a package, this still searches for a top-level package named `package`.

Absolute imports are preferred for clarity when referring to external or top-level modules.

Example:

```python
from project.config import Settings
```

This tells the reader where the imported name comes from.

## 40.18 Relative Imports

A relative import is resolved against the current package.

```python
from . import parser
from .lexer import tokenize
from ..config import Settings
```

Relative imports depend on `__package__`.

They only work when the module is executed as part of a package. This is why running a package module directly as a script can break relative imports.

For example:

```text
project/
    app/
        __init__.py
        main.py
        config.py
```

Inside `main.py`:

```python
from .config import Settings
```

This works when executed with:

```bash
python -m app.main
```

It fails with `ImportError: attempted relative import with no known parent package` when executed as:

```bash
python app/main.py
```

because direct script execution changes how the module is named and packaged.
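The resolution step itself can be tried with `importlib.util.resolve_name`, which converts a relative name to an absolute one given a package. A sketch using the `app` layout above (`app.sub` is a hypothetical subpackage added for the two-dot case):

```python
import importlib.util

sibling = importlib.util.resolve_name(".config", "app")        # from inside app
parent = importlib.util.resolve_name("..config", "app.sub")    # one level up from app.sub
absolute = importlib.util.resolve_name("app.config", None)     # absolute names pass through

try:
    # No package context, similar to running a package file directly as a script.
    importlib.util.resolve_name(".config", "")
    failure = None
except (ImportError, ValueError) as exc:  # older versions raise ValueError
    failure = exc

print(sibling, parent, absolute, type(failure).__name__)
```

With no package to resolve against, the relative name has no meaning, which is the same condition behind the failure of direct script execution.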

## 40.19 `__main__`

The module executed as the program entry point is named `__main__`.

```python
print(__name__)
```

When run as a script:

```bash
python script.py
```

Output:

```text
__main__
```

When imported:

```python
import script
```

the module name is:

```text
script
```

This is why Python programs commonly use:

```python
def main():
    ...

if __name__ == "__main__":
    main()
```

The guard prevents program entry code from running during import.
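The effect of the guard can be observed with `runpy.run_path`, which executes a file under a chosen `__name__` and returns the resulting globals. A sketch with a hypothetical script, `name_demo.py`:

```python
import runpy
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    script = Path(tmp, "name_demo.py")
    script.write_text(
        "ran_main = False\n"
        "if __name__ == '__main__':\n"
        "    ran_main = True\n"
    )
    as_script = runpy.run_path(str(script), run_name="__main__")
    as_import_like = runpy.run_path(str(script), run_name="name_demo")

print(as_script["__name__"], as_script["ran_main"])
print(as_import_like["__name__"], as_import_like["ran_main"])
```

The guarded branch runs only when the code executes under the name `__main__`.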

## 40.20 Running Modules With `-m`

The command:

```bash
python -m package.module
```

runs a module by import name rather than by file path.

This matters because the module gets correct package context.

For package code, prefer:

```bash
python -m package.module
```

over:

```bash
python package/module.py
```

The `-m` form allows relative imports to work because CPython knows the module’s package.

## 40.21 Bytecode Cache Files

CPython may cache compiled bytecode in `__pycache__`.

Example:

```text
package/
    module.py
    __pycache__/
        module.cpython-312.pyc
```

The bytecode cache avoids recompiling source on every import when the cache is valid.

A `.pyc` file contains:

```text
magic number
cache metadata
marshaled code object
```

The magic number identifies the bytecode format version. It changes when CPython changes bytecode incompatibly.

CPython checks cache validity using timestamp-based or hash-based invalidation, depending on how the file was produced.
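The layout can be inspected by compiling a small file and reading the header back. This sketch assumes the 16-byte header used by timestamp-based `.pyc` files (magic number, flags, source mtime, source size) and requests that mode explicitly:

```python
import importlib.util
import marshal
import py_compile
import struct
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    source = Path(tmp, "module.py")
    source.write_text("value = 42\n")
    pyc_path = py_compile.compile(
        str(source),
        cfile=str(Path(tmp, "module.pyc")),
        doraise=True,
        invalidation_mode=py_compile.PycInvalidationMode.TIMESTAMP,
    )
    data = Path(pyc_path).read_bytes()
    source_size = source.stat().st_size

magic = data[:4]
flags, mtime, size = struct.unpack("<III", data[4:16])  # timestamp-based layout
code = marshal.loads(data[16:])                         # the marshaled code object

print(magic == importlib.util.MAGIC_NUMBER, flags, size == source_size, code.co_names)
```

For hash-based files the flags field is nonzero and the last eight header bytes hold a source hash instead of the mtime and size.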

## 40.22 Source Modules and Code Objects

For a normal `.py` module, the loader compiles source into a code object.

Then it executes the code object in the module dictionary.

Conceptually:

```python
import types

source_text, filename = "value = 42\n", "example.py"  # stand-ins for a real file
module = types.ModuleType("example")
code = compile(source_text, filename, "exec")
exec(code, module.__dict__)
```

This is close to the real model, although the actual import system handles many more details.

The important point is that module execution is ordinary code execution with a module dictionary as the global namespace.

## 40.23 Built-in Modules

Built-in modules are compiled into the CPython executable or linked runtime.

Examples often include:

```python
import sys
import builtins
import time
```

Built-in module availability depends on platform and build configuration.

A built-in module does not require locating a `.py` file. Its loader initializes it from C-level definitions.

You can inspect built-in module names:

```python
import sys

print(sys.builtin_module_names)
```

Built-in modules provide core runtime services needed before the full file-based import system is available.

## 40.24 Frozen Modules

Frozen modules are Python modules embedded into the CPython binary as frozen bytecode or equivalent static data.

They help bootstrap import machinery before the file system import system is fully operational.

This creates a bootstrapping problem:

```text
importlib implements imports
but importlib itself must be imported
so parts of importlib are frozen
```

Frozen modules solve this cycle by making selected Python code available without ordinary source-file import.

## 40.25 Extension Modules

Extension modules are native shared libraries loaded into CPython.

On Unix-like systems, these are commonly `.so` files. On Windows, they are commonly `.pyd` files.

Example imports:

```python
import _sqlite3
import _ssl
import _hashlib
```

An extension module provides an initialization function called by CPython. Modern extension modules use multi-phase initialization when possible.

Extension modules must follow CPython’s C API rules:

```text
create module object
define methods
manage reference ownership
set exceptions on failure
return initialized module
```

Because extension modules run native code inside the Python process, bugs can crash the interpreter.

## 40.26 Single-Phase and Multi-Phase Initialization

Older extension modules often use single-phase initialization. The initialization function creates and returns a module object in one step.

Modern extension modules can use multi-phase initialization. In this model, module creation and module execution are separated.

This better matches Python-level import semantics and supports per-module state more cleanly.

Multi-phase initialization is important for:

```text
subinterpreter compatibility
module reloading behavior
cleaner module state
avoiding process-global mutable state
future isolation improvements
```

A C extension that stores all state in global C variables may work in simple cases, but it can behave badly with subinterpreters or repeated initialization.

## 40.27 Import Locks

Imports need locking.

Without locking, two threads could try to import and initialize the same module at the same time.

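The import system uses per-module locks to ensure that a module is not executed concurrently by multiple threads in unsafe ways. A thread that requests a module another thread is still initializing waits until that initialization finishes.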

This matters especially when module initialization has side effects:

```python
# database.py
connection_pool = create_pool()
```

If two threads imported this module concurrently without locking, they might create duplicate global state or observe partial initialization.

The import lock prevents many such races.
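A sketch of this guarantee, using a hypothetical slow module, `lockdemo_slow`, and a hypothetical in-memory helper, `lockdemo_log`, that records each execution of the module body:

```python
import importlib
import sys
import tempfile
import threading
import types
from pathlib import Path

log = types.ModuleType("lockdemo_log")
log.events = []
sys.modules["lockdemo_log"] = log

with tempfile.TemporaryDirectory() as tmp:
    Path(tmp, "lockdemo_slow.py").write_text(
        "import time, lockdemo_log\n"
        "time.sleep(0.05)  # widen the window in which a race could happen\n"
        "lockdemo_log.events.append('executed')\n"
    )
    sys.path.insert(0, tmp)
    results = []

    def worker():
        results.append(importlib.import_module("lockdemo_slow"))

    threads = [threading.Thread(target=worker) for _ in range(4)]
    try:
        for t in threads:
            t.start()
        for t in threads:
            t.join()
    finally:
        sys.path.remove(tmp)
        sys.modules.pop("lockdemo_slow", None)
        sys.modules.pop("lockdemo_log", None)

distinct_modules = len({id(m) for m in results})
print(len(log.events), len(results), distinct_modules)
```

Four threads request the module, but its body runs once and every thread receives the same module object.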

## 40.28 Reloading Modules

Python can reload a module using `importlib.reload`.

```python
import importlib
import config

importlib.reload(config)
```

Reloading re-executes the module code using the existing module object.

This has subtle consequences.

Suppose `config.py` originally contains:

```python
value = 1
```

After editing it to:

```python
value = 2
```

a reload updates `config.value`.

But existing references elsewhere may still point to old objects.

```python
from config import value

import config
import importlib

importlib.reload(config)

print(value)        # old binding
print(config.value) # new module attribute
```

Reload is useful for development tools, notebooks, and plugin systems, but it is not a full process reset.
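The stale-binding effect can be reproduced end to end. This sketch uses a hypothetical module, `reload_demo`, and rewrites its source between the import and the reload:

```python
import importlib
import sys
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp, "reload_demo.py")
    path.write_text("value = 1\n")
    sys.path.insert(0, tmp)
    try:
        module = importlib.import_module("reload_demo")
        copied = module.value   # plays the role of `from reload_demo import value`
        # The new source has a different length, so a cached .pyc written
        # within the same second cannot be mistaken for up to date.
        path.write_text("value = 200\n")
        reloaded = importlib.reload(module)
    finally:
        sys.path.remove(tmp)
        sys.modules.pop("reload_demo", None)

print(copied, module.value, reloaded is module)
```

The module object is reused and its attribute is updated, while the previously copied value keeps the old binding.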

## 40.29 Import Side Effects

Import side effects are often the source of confusing behavior.

This module has a visible side effect:

```python
# noisy.py
print("imported noisy")
```

This module has a hidden side effect:

```python
# registry.py
handlers = {}

def register(name, fn):
    handlers[name] = fn
```

```python
# plugin.py
from registry import register

def handle(x):
    return x

register("plugin", handle)
```

Importing `plugin` mutates `registry.handlers`.

This pattern is common in plugin systems, ORMs, test frameworks, and web frameworks. It can be useful, but it means import order becomes part of program behavior.

## 40.30 Lazy Imports

A lazy import delays importing a module until it is needed.

Example:

```python
def parse_json(text):
    import json
    return json.loads(text)
```

This can reduce startup time or avoid optional dependencies during unused code paths.

But lazy imports have tradeoffs:

```text
errors appear later
first call may become slower
dependency structure becomes less visible
circular imports may be hidden rather than fixed
```

Lazy imports are useful when used deliberately. They should not become a default workaround for poor module structure.
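Besides function-local imports, the standard library offers `importlib.util.LazyLoader`, which defers executing a module until its first attribute access. The sketch below follows the general shape of the recipe in the `importlib` documentation; `lazydemo_heavy` and `lazydemo_log` are hypothetical names used to observe when the module body runs:

```python
import importlib.util
import sys
import tempfile
import types
from pathlib import Path

log = types.ModuleType("lazydemo_log")
log.events = []
sys.modules["lazydemo_log"] = log

with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp, "lazydemo_heavy.py")
    path.write_text("import lazydemo_log\nlazydemo_log.events.append('ran')\nvalue = 42\n")

    spec = importlib.util.spec_from_file_location("lazydemo_heavy", str(path))
    spec.loader = importlib.util.LazyLoader(spec.loader)
    module = importlib.util.module_from_spec(spec)
    sys.modules["lazydemo_heavy"] = module
    spec.loader.exec_module(module)        # arms the lazy module; the body has not run

    ran_before_access = list(log.events)
    value = module.value                   # first attribute access runs the body
    ran_after_access = list(log.events)

    sys.modules.pop("lazydemo_heavy", None)
    sys.modules.pop("lazydemo_log", None)

print(ran_before_access, value, ran_after_access)
```

The same tradeoffs apply: any error in the module body surfaces at the first attribute access rather than at the import site.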

## 40.31 Optional Imports

Optional imports are common for feature detection.

```python
try:
    import uvloop
except ImportError:
    uvloop = None
```

Be careful with broad exception handling. This is often wrong:

```python
try:
    import plugin
except Exception:
    plugin = None
```

It hides real bugs inside `plugin`.

Prefer catching `ImportError` or `ModuleNotFoundError` narrowly, and when needed, verify which module failed.

```python
try:
    import optional_backend
except ModuleNotFoundError as exc:
    if exc.name != "optional_backend":
        raise
    optional_backend = None
```

This avoids hiding missing transitive dependencies.
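The same check can be wrapped in a reusable function. `optional_import` is a hypothetical helper name, and the sketch is intended for top-level module names:

```python
import importlib

def optional_import(name):
    """Return the named module, or None only if that exact module is missing."""
    try:
        return importlib.import_module(name)
    except ModuleNotFoundError as exc:
        if exc.name != name:
            raise  # something the module depends on is missing: surface it
        return None

json_mod = optional_import("json")
missing = optional_import("surely_not_installed_backend")  # hypothetical name

print(json_mod is not None, missing)
```

For dotted names, `exc.name` may refer to a missing parent package, so a production version would need to decide how to treat that case.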

## 40.32 Import Name Binding

Different import forms bind different names.

| Statement | Bound name |
|---|---|
| `import os` | `os` |
| `import os.path` | `os` |
| `import os.path as p` | `p` |
| `from os import path` | `path` |
| `from math import sqrt as s` | `s` |
| `from module import *` | many names |

The import system loads modules. The import statement then binds names in the current namespace.

These are related but separate operations.

## 40.33 Star Imports

A star import copies exported names into the current namespace.

```python
from module import *
```

If the module defines `__all__`, Python imports those names.

```python
__all__ = ["connect", "close"]
```

Without `__all__`, Python imports names that do not start with an underscore.

Star imports are usually discouraged outside interactive sessions and package facade modules because they obscure where names come from.

A package facade can keep star imports of the package predictable by importing its public names explicitly and listing them in `__all__`:

```python
# package/__init__.py
from .client import Client
from .errors import PackageError

__all__ = ["Client", "PackageError"]
```
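The two selection rules can be compared directly. This sketch builds a hypothetical module, `stardemo`, in memory and runs the star import into fresh namespaces:

```python
import sys
import types

demo = types.ModuleType("stardemo")
exec(
    "__all__ = ['connect']\n"
    "def connect(): return 'connected'\n"
    "def close(): return 'closed'\n"
    "_private = 1\n",
    demo.__dict__,
)
sys.modules["stardemo"] = demo

def public(namespace):
    """Names copied by the star import, ignoring the implicit __builtins__ entry."""
    return sorted(n for n in namespace if n != "__builtins__")

with_all = {}
exec("from stardemo import *", with_all)      # __all__ decides what is copied

del demo.__all__                              # fall back to the underscore rule
without_all = {}
exec("from stardemo import *", without_all)
del sys.modules["stardemo"]

print(public(with_all), public(without_all))
```

With `__all__` present only `connect` is copied; without it, every name that does not start with an underscore is copied.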

## 40.34 Package Facades

A package can re-export names from submodules.

```python
# library/__init__.py
from .client import Client
from .config import Config

__all__ = ["Client", "Config"]
```

Then users can write:

```python
from library import Client
```

instead of:

```python
from library.client import Client
```

This improves API ergonomics but can increase import cost. If `library.__init__` imports many heavy submodules, then `import library` becomes expensive.

Good package facades balance convenience and startup cost.

## 40.35 Import Performance

Import time matters for command-line tools, server cold starts, tests, and short-running scripts.

Import cost comes from:

```text
file-system searches
source decoding
bytecode validation
compilation if cache is missing
module execution
transitive imports
native extension loading
top-level initialization
```

You can inspect import timing with:

```bash
python -X importtime -c "import your_package"
```

This prints a tree of import timings.

Common ways to improve import performance:

```text
avoid heavy top-level work
delay optional imports
reduce large dependency chains
avoid importing test-only modules at runtime
keep package __init__.py small
avoid broad convenience imports in hot paths
```

## 40.36 Import Errors

The common import-related exceptions are:

| Exception | Meaning |
|---|---|
| `ModuleNotFoundError` | A requested module could not be found |
| `ImportError` | Import failed for a broader reason |
| `AttributeError` | Module loaded, but requested attribute does not exist |
| `SyntaxError` | Source module could not be compiled |
| `ImportError` raised by the dynamic loader | Extension module failed to load |

Example:

```python
from package import missing_name
```

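If `package` exists but has no attribute `missing_name`, Python first tries to import `package.missing_name` as a submodule. If that also fails, it raises `ImportError` with a "cannot import name" message.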

Example:

```python
import missing_package
```

This usually raises:

```text
ModuleNotFoundError
```

Import failure diagnosis should distinguish:

```text
the target module is missing
a transitive dependency is missing
the module exists but raised during execution
the requested exported name is missing
a native extension failed to load
```

## 40.37 Circular Imports

Circular imports happen when modules depend on each other during top-level execution.

Example:

```python
# users.py
from posts import Post

class User:
    ...
```

```python
# posts.py
from users import User

class Post:
    ...
```

This can fail because each module needs the other before either has finished initializing.

Common fixes:

1. Move shared types into a third module.

```text
models/
    base.py
    users.py
    posts.py
```

2. Use local imports for runtime-only dependencies.

```python
def create_post():
    from posts import Post
    return Post()
```

3. Use postponed type annotations.

```python
from __future__ import annotations

class User:
    posts: list[Post]
```

4. Depend on interfaces rather than concrete modules.

The best fix is usually structural. Circular imports often reveal that module boundaries are poorly chosen.

## 40.38 Import and Type Checking

Type hints can create import cycles if annotations import runtime objects.

A common pattern is:

```python
from typing import TYPE_CHECKING

if TYPE_CHECKING:
    from posts import Post

class User:
    def add_post(self, post: "Post") -> None:
        ...
```

`TYPE_CHECKING` is false at runtime, so the import is visible to type checkers but skipped during execution.

This reduces runtime import cycles while keeping type information.

Postponed evaluation of annotations, enabled with `from __future__ import annotations`, also keeps annotations from being evaluated when the module is imported, which further reduces runtime imports that exist only for typing.

## 40.39 Import Hooks

Import hooks let programs customize import behavior.

A custom finder can be placed on `sys.meta_path`.

A custom loader can create and execute modules from nonstandard sources.

Use cases include:

```text
zip import
plugin systems
test isolation
sandboxed module loading
import tracing
encrypted module stores
remote module stores
generated modules
```

A minimal finder skeleton looks like:

```python
class Finder:
    def find_spec(self, fullname, path=None, target=None):
        if fullname == "virtual_module":
            ...
        return None
```

A loader must implement the appropriate loader protocol, usually including module creation or execution.

Import hooks are powerful. They affect global program behavior, so they should be narrow and predictable.
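A complete, if tiny, hook needs both halves. The sketch below serves the `virtual_module` name from the skeleton above entirely from memory; the class names and the `greeting` attribute are hypothetical:

```python
import importlib
import importlib.abc
import importlib.util
import sys

class VirtualLoader(importlib.abc.Loader):
    def create_module(self, spec):
        return None  # let the import machinery create a default module object

    def exec_module(self, module):
        # A real loader would compile and execute code here.
        module.greeting = "hello from a virtual module"

class VirtualFinder(importlib.abc.MetaPathFinder):
    def find_spec(self, fullname, path=None, target=None):
        if fullname == "virtual_module":
            return importlib.util.spec_from_loader(fullname, VirtualLoader())
        return None  # not ours: let the next finder try

finder = VirtualFinder()
sys.meta_path.insert(0, finder)
try:
    virtual = importlib.import_module("virtual_module")
finally:
    sys.meta_path.remove(finder)  # keep the hook's lifetime narrow

print(virtual.greeting, virtual.__spec__.loader.__class__.__name__)
```

Returning `None` for every other name keeps the hook narrow, so unrelated imports proceed through the normal finders.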

## 40.40 `importlib`

`importlib` is the standard library interface to the import system.

Common operations:

```python
import importlib

mod = importlib.import_module("json")
mod = importlib.reload(mod)
```

Useful lower-level pieces include:

```text
importlib.util.find_spec
importlib.util.module_from_spec
spec.loader.exec_module
importlib.machinery.PathFinder
importlib.machinery.SourceFileLoader
```

Manual loading can look like this:

```python
import importlib.util

spec = importlib.util.spec_from_file_location("custom_name", "/path/to/file.py")
module = importlib.util.module_from_spec(spec)
spec.loader.exec_module(module)
```

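This creates and executes a module from a specific file. It does not register the module in `sys.modules`; add it there yourself if later imports of `custom_name` should find it.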

For normal application code, prefer ordinary imports. Use manual importlib loading only when building tools, plugin systems, loaders, or runtime module systems.

## 40.41 Module Identity

Module identity is based on the key in `sys.modules`.

If the same file is imported under two different names, CPython can create two distinct module objects.

Example problem:

```text
project/
    package/
        __init__.py
        settings.py
```

If one part of the program imports:

```python
import package.settings
```

and another path manipulation causes:

```python
import settings
```

then the same file may be loaded twice under different names.

That can duplicate module globals:

```text
two registries
two singleton objects
two class identities
two caches
```

This is one reason direct `sys.path` manipulation can be dangerous.
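The duplication is easy to demonstrate by loading one file under two names. `load_as` is a hypothetical helper, and the file contents are illustrative:

```python
import importlib.util
import tempfile
from pathlib import Path

def load_as(name, path):
    """Load the file at `path` under the module name `name`."""
    spec = importlib.util.spec_from_file_location(name, str(path))
    module = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(module)
    return module

with tempfile.TemporaryDirectory() as tmp:
    settings_file = Path(tmp, "settings.py")
    settings_file.write_text("registry = {}\nclass Marker: pass\n")
    first = load_as("package.settings", settings_file)
    second = load_as("settings", settings_file)

first.registry["key"] = "value"
print(first is second, second.registry, first.Marker is second.Marker)
```

The two module objects have separate globals, so a value registered through one is invisible through the other, and `isinstance` checks across the two `Marker` classes fail.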

## 40.42 Module Globals Are Shared State

A module-level variable is shared by all code that imports the module.

```python
# state.py
count = 0

def increment():
    global count
    count += 1
    return count
```

Every importer observes the same module object:

```python
import state

state.increment()
state.increment()
```

This is useful for constants, registries, caches, and singletons. It can also make testing harder because state persists across imports.

A test may need to reset module state explicitly:

```python
import state

def test_increment():
    state.count = 0
    assert state.increment() == 1
```

Or isolate behavior by avoiding mutable module globals.

## 40.43 Import-Time Configuration

Modules often read configuration at import time:

```python
import os

DEBUG = os.environ.get("DEBUG") == "1"
```

This makes configuration fixed at import time. If the environment changes later, `DEBUG` does not update automatically.

A more flexible design reads configuration when needed:

```python
import os

def debug_enabled():
    return os.environ.get("DEBUG") == "1"
```

or centralizes configuration loading:

```python
import os

class Settings:
    def __init__(self):
        self.debug = os.environ.get("DEBUG") == "1"

settings = Settings()
```

Import-time configuration is simple, but it can surprise tests and long-running programs.

## 40.44 Module-Level `__getattr__`

Modules can define `__getattr__` to customize attribute access for missing names.

```python
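# package/__init__.py
import importlib

def __getattr__(name):
    if name == "heavy":
        # import_module avoids the hasattr() check that `from . import heavy`
        # performs on this package, which would re-enter __getattr__.
        return importlib.import_module(".heavy", __name__)
    raise AttributeError(f"module {__name__!r} has no attribute {name!r}")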
```

This can implement lazy exports.

Then:

```python
import package

package.heavy
```

imports the heavy submodule only when requested.

Module-level `__getattr__` is useful for compatibility shims, deprecations, and lazy loading. It should remain simple because it changes normal attribute access behavior.

## 40.45 Module-Level `__dir__`

A module can also define `__dir__`.

```python
def __dir__():
    return ["Client", "Config", "connect"]
```

This controls what appears in:

```python
dir(module)
```

It is mainly useful for modules with dynamic attributes.

## 40.46 The Import System During Startup

During CPython startup, the import system itself must be initialized carefully.

The runtime needs enough import machinery to load the standard library, but much of the import system is written in Python.

The bootstrap sequence uses built-in and frozen modules to bring up `importlib`.

Conceptually:

```text
initialize runtime
initialize builtins and sys
initialize frozen importlib bootstrap code
configure import machinery
initialize sys.path
load site if enabled
start executing user code
```

This is why startup internals are more constrained than ordinary runtime imports.

## 40.47 `site` and Environment Setup

After core import machinery is initialized, CPython commonly imports the `site` module unless disabled with `-S`.

The `site` module configures additional import paths, including site-packages directories.

It may also process:

```text
.pth files
user site-packages
virtual environment path adjustments
sitecustomize
usercustomize
```

This means the import environment at application start depends on interpreter flags, virtual environments, installation layout, and environment variables.

## 40.48 Virtual Environments and Imports

A virtual environment changes where Python looks for installed packages.

It usually changes:

```text
sys.prefix
sys.exec_prefix
site-packages paths
script entry points
```

The interpreter binary may be shared or copied, but the import environment points to the virtual environment’s package directories.

This is why:

```bash
python -m pip install requests
```

inside a virtual environment makes `requests` importable only inside that environment.

The import system itself is the same. The search paths differ.
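A common way to check whether the running interpreter is inside a virtual environment created by `venv` is to compare `sys.prefix` with `sys.base_prefix`. The sketch below only reports the values. It assumes an environment created by the standard `venv` module or a compatible tool.

```python
import sys

# In a venv, sys.prefix points at the environment while
# sys.base_prefix still points at the base installation.
in_virtual_env = sys.prefix != sys.base_prefix

print("prefix:     ", sys.prefix)
print("base prefix:", sys.base_prefix)
print("in venv:    ", in_virtual_env)
```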

## 40.49 Importing From Zip Files

Python can import modules from zip archives if the archive is on `sys.path`.

For example, an archive that contains a top-level `__main__.py` can be run directly:

```bash
python app.zip
```

or:

```python
import sys

sys.path.insert(0, "modules.zip")
import mymodule
```

Zip import uses an importer that knows how to find module files inside the archive.

This demonstrates why import is path-entry based rather than only directory based. Each `sys.path` entry, whether a directory, a zip archive, or something else, is handled by whichever path hook accepts it.
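The following self-contained sketch builds a small archive in a temporary directory and imports from it. The archive and the module name `zipped_greeting` are made up for the example.

```python
import importlib
import sys
import tempfile
import zipfile
from pathlib import Path

# Build a throwaway archive containing one hypothetical module.
archive = Path(tempfile.mkdtemp()) / "modules.zip"
with zipfile.ZipFile(archive, "w") as zf:
    zf.writestr(
        "zipped_greeting.py",
        "def greet():\n    return 'hello from zip'\n",
    )

# A zip file is a valid sys.path entry.
sys.path.insert(0, str(archive))
importlib.invalidate_caches()

import zipped_greeting

message = zipped_greeting.greet()
loader_type = type(zipped_greeting.__spec__.loader).__name__

print(message)
print(loader_type)
print(zipped_greeting.__file__)
```

The module's `__file__` points inside the archive, and its loader is the zip importer rather than the ordinary source file loader.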

## 40.50 Path Hooks and Path Importers

For path-based imports, CPython uses path hooks to turn `sys.path` entries into importer objects.

Conceptually:

```text
sys.path entry
    ↓
sys.path_hooks
    ↓
path importer
    ↓
find module spec
```

The cache `sys.path_importer_cache` stores importer objects for path entries.

```python
import sys

print(sys.path_hooks)
print(sys.path_importer_cache)
```

This avoids rebuilding importer objects repeatedly.
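The caching behavior can be observed with `pkgutil.get_importer`, which runs the path hooks for a single entry and stores the result. The sketch uses a fresh temporary directory so that the entry is known not to be cached beforehand.

```python
import pkgutil
import sys
import tempfile

# A new directory that no import has looked at yet.
path_entry = tempfile.mkdtemp()
was_cached = path_entry in sys.path_importer_cache

# Ask the registered path hooks for an importer for this entry.
importer = pkgutil.get_importer(path_entry)

print("cached before:", was_cached)
print("importer:     ", importer)
print("cached after: ", sys.path_importer_cache[path_entry] is importer)
```

For an ordinary directory the importer is normally a `FileFinder`. Calling `importlib.invalidate_caches()` asks cached finders to refresh their internal state, for example after new files have been written.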

## 40.51 Import Security Concerns

Import searches paths. That makes path order security-sensitive.

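By default, CPython puts the script's directory (or the current directory for `-c`, `-m`, and interactive sessions) at the front of `sys.path`, ahead of the standard library, so a local file can shadow a standard module. Python 3.11 and later can disable this with the `-P` option or the `PYTHONSAFEPATH` environment variable.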

Example:

```text
project/
    json.py
```

Then:

```python
import json
```

may import the local `json.py` instead of the standard library `json`.

This can cause bugs or security problems.

Defensive practices:

```text
avoid naming files after standard library modules
avoid unsafe sys.path insertion
run applications from expected working directories
use virtual environments
inspect module.__file__ when debugging
prefer python -m package.module for package code
```

To inspect what was imported:

```python
import json

print(json.__file__)
```
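The check can be automated. The helper below is a rough sketch, not a complete security control. The function name `resolves_to_stdlib` is hypothetical, and the comparison against `sysconfig.get_paths()["stdlib"]` is an assumption that may not cover every installation layout (for example, extension modules kept in a separate directory on some platforms).

```python
import importlib.util
import os
import sysconfig


def resolves_to_stdlib(name):
    """Return True if `name` would be imported from the standard library."""
    spec = importlib.util.find_spec(name)
    if spec is None or spec.origin is None:
        return False
    if spec.origin in ("built-in", "frozen"):
        return True

    stdlib_dir = os.path.normcase(
        os.path.realpath(sysconfig.get_paths()["stdlib"])
    )
    origin = os.path.normcase(os.path.realpath(spec.origin))
    try:
        return os.path.commonpath([stdlib_dir, origin]) == stdlib_dir
    except ValueError:
        # Raised, for example, when the paths are on different drives.
        return False


print(resolves_to_stdlib("json"))
```

If a local `json.py` shadows the standard module, the helper returns `False` for `"json"`, which makes it usable as a quick sanity check in a startup script or test.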

## 40.52 Import and Testing

Tests often stress import behavior.

Common test issues include:

```text
tests depend on working directory
local files shadow installed packages
package imported twice under different names
module global state leaks between tests
environment variables read at import time
plugins register themselves during import
```

A robust test setup imports the package the same way users do.

Prefer testing installed package behavior:

```bash
python -m pytest
```

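from a clean environment with the package installed, rather than relying on incidental path layout. Be aware that `python -m pytest` also puts the current directory on `sys.path`.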
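The "imported twice under different names" problem from the list above can be reproduced in a few lines. The sketch creates a throwaway package in a temporary directory and puts both the package's parent and the package directory itself on `sys.path`. The names `dup_demo_pkg` and `dup_demo_state` are hypothetical.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Layout: <root>/dup_demo_pkg/__init__.py and dup_demo_state.py
root = Path(tempfile.mkdtemp())
pkg_dir = root / "dup_demo_pkg"
pkg_dir.mkdir()
(pkg_dir / "__init__.py").write_text("")
(pkg_dir / "dup_demo_state.py").write_text("registry = []\n")

# Misconfiguration: both the parent and the package directory are on sys.path.
sys.path[:0] = [str(root), str(pkg_dir)]
importlib.invalidate_caches()

import dup_demo_pkg.dup_demo_state as by_package
import dup_demo_state as by_top_level

same_object = by_package is by_top_level
by_package.registry.append("plugin")

print(same_object)
print(by_package.registry, by_top_level.registry)
```

The same file is executed twice and yields two module objects with separate global state, so a registration made through one name is invisible through the other.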

## 40.53 Import and Application Design

Good Python application structure reduces import complexity.

A common layout:

```text
project/
    pyproject.toml
    src/
        app/
            __init__.py
            main.py
            config.py
            service.py
            storage.py
    tests/
        test_service.py
```

The `src` layout helps catch accidental imports from the repository root.

A clean module dependency graph points inward:

```text
main
    depends on service
service
    depends on storage and config
storage
    depends on database driver
config
    depends on environment parsing
```

Avoid designs where low-level modules import high-level application entry points.

## 40.54 Import and Public API Design

A package’s import surface is part of its public API.

For example:

```python
from library import Client
```

is a public contract if documented.

Changing the internal module location should not break users if the package facade preserves the public import:

```python
# library/__init__.py
from ._client import Client

__all__ = ["Client"]
```

Private modules often use a leading underscore:

```text
library/
    __init__.py
    _client.py
    _protocol.py
    public.py
```

This is a convention, not an access restriction.

## 40.55 Import Debugging Checklist

When import behavior is confusing, inspect these values:

```python
import sys
import module

print(module)
print(module.__name__)
print(getattr(module, "__file__", None))
print(getattr(module, "__spec__", None))
print(getattr(module, "__package__", None))
print(sys.path)
```

For package issues:

```python
import package

print(package.__path__)
print(package.__spec__.submodule_search_locations)
```

For cache issues:

```python
import sys

print(sys.modules.get("module_name"))
```

For timing:

```bash
python -X importtime -c "import module_name"
```

For direct resolution:

```python
import importlib.util

print(importlib.util.find_spec("module_name"))
```
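These checks can be bundled into one helper. The function `describe_import` below is a hypothetical convenience wrapper, not a standard library API. Note that `importlib.util.find_spec` imports parent packages when given a dotted name, and can raise `ValueError` for a module whose `__spec__` is `None`.

```python
import importlib.util
import sys


def describe_import(name):
    """Collect the facts most often needed when debugging an import."""
    spec = importlib.util.find_spec(name)
    loader = getattr(spec, "loader", None)
    locations = getattr(spec, "submodule_search_locations", None)
    return {
        "name": name,
        "found": spec is not None,
        "origin": getattr(spec, "origin", None),
        "loader": type(loader).__name__ if loader is not None else None,
        "is_package": locations is not None,
        "already_imported": name in sys.modules,
    }


for key, value in describe_import("json").items():
    print(f"{key:17} {value}")
```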

## 40.56 Minimal Import Algorithm

A simplified import function can be written as:

```python
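import sys
from importlib.util import find_spec, module_from_spec


def import_module(fullname):
    # Cache lookup: a module is executed at most once per name.
    if fullname in sys.modules:
        return sys.modules[fullname]

    # Spec discovery. For dotted names, find_spec imports the parents first.
    spec = find_spec(fullname)
    if spec is None:
        raise ModuleNotFoundError(f"No module named {fullname!r}", name=fullname)

    # Module creation, then cache insertion *before* execution.
    module = module_from_spec(spec)
    sys.modules[fullname] = module

    try:
        spec.loader.exec_module(module)
    except BaseException:
        # Drop the half-initialized module so a later import can retry.
        sys.modules.pop(fullname, None)
        raise

    return module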
```

The real CPython import system is more complex, but this skeleton captures the central flow:

```text
cache lookup
spec discovery
module creation
cache insertion
module execution
error cleanup
return module
```
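The same steps can be carried out by hand with `importlib.util` to load a module from an explicit file path. The sketch below writes a tiny module to a temporary directory first. The module name `manual_demo` is made up for the example.

```python
import importlib.util
import sys
import tempfile
from pathlib import Path

# A throwaway source file to load.
source = Path(tempfile.mkdtemp()) / "manual_demo.py"
source.write_text("value = 6 * 7\n")

# Spec discovery (here: built directly from a known location).
spec = importlib.util.spec_from_file_location("manual_demo", str(source))

# Module creation and cache insertion.
module = importlib.util.module_from_spec(spec)
sys.modules["manual_demo"] = module

# Module execution with cleanup on failure.
try:
    spec.loader.exec_module(module)
except BaseException:
    sys.modules.pop("manual_demo", None)
    raise

print(module.value)
print(module.__spec__.origin)
```

Inserting the module into `sys.modules` before execution is optional for a standalone file, but it matches what the import system does and lets the module be found by later `import manual_demo` statements.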

## 40.57 Common Failure: Partially Initialized Module

A common error looks like:

```text
AttributeError: partially initialized module 'x' has no attribute 'y'
```

This often means a circular import or module shadowing problem.

Example circular import:

```python
# a.py
import b

class A:
    ...
```

```python
# b.py
import a

class B(a.A):
    ...
```

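If `a` is imported first, it starts executing and immediately imports `b`. When `b` then reads `a.A`, module `a` already exists in `sys.modules`, but its class `A` has not yet been defined. If `b` is imported first, the same code happens to work, which is why such bugs often depend on import order.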

The fix is usually to restructure the modules so that class definitions do not require importing each other during top-level execution.
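The failure and its dependence on import order can be reproduced in isolation. The sketch below writes the two modules to a temporary directory under the hypothetical names `circ_a` and `circ_b`, to avoid clashing with real modules, and imports them in both orders.

```python
import importlib
import sys
import tempfile
from pathlib import Path

root = Path(tempfile.mkdtemp())
(root / "circ_a.py").write_text("import circ_b\n\nclass A:\n    pass\n")
(root / "circ_b.py").write_text(
    "import circ_a\n\nclass B(circ_a.A):\n    pass\n"
)
sys.path.insert(0, str(root))
importlib.invalidate_caches()

# Order 1: circ_a first. circ_b runs while circ_a is only partially set up.
try:
    importlib.import_module("circ_a")
    first_error = None
except AttributeError as exc:
    first_error = exc

# Modules whose execution failed are removed from the cache again.
leftover = [name for name in ("circ_a", "circ_b") if name in sys.modules]

# Order 2: circ_b first. circ_a finishes defining A before B needs it.
circ_b = importlib.import_module("circ_b")

print(first_error)
print(leftover)
print(circ_b.B.__mro__)
```

The second order succeeds only by accident of ordering, so it should not be treated as a fix. Moving the shared base class into a third module that both import removes the cycle.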

## 40.58 Common Failure: Shadowing

If a file is named after a standard library module, imports may resolve to the wrong file.

Example:

```text
random.py
```

Inside it:

```python
import random
```

This may import itself instead of the standard library `random`.

Symptoms include:

```text
partially initialized module
missing expected attributes
recursive import behavior
strange module.__file__
```

Check:

```python
import random
print(random.__file__)
```

Rename the local file and remove stale cache files if needed.

## 40.59 Common Failure: Running a Package File Directly

Given:

```text
app/
    __init__.py
    main.py
    config.py
```

Inside `main.py`:

```python
from .config import Settings
```

Running the file as a script fails with `ImportError: attempted relative import with no known parent package`:

```bash
python app/main.py
```

because `main.py` is executed as `__main__`, not as `app.main`.

Use:

```bash
python -m app.main
```

from the directory containing `app`.

This preserves package context and makes relative imports work.
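Both invocations can be compared side by side. The sketch below creates a throwaway package named `demo_app` (a hypothetical stand-in for `app`) in a temporary directory and runs it in two child interpreters.

```python
import subprocess
import sys
import tempfile
from pathlib import Path

root = Path(tempfile.mkdtemp())
pkg = root / "demo_app"
pkg.mkdir()
(pkg / "__init__.py").write_text("")
(pkg / "config.py").write_text("class Settings:\n    pass\n")
(pkg / "main.py").write_text(
    "from .config import Settings\nprint(Settings.__name__)\n"
)


def run(*args):
    # Run a child interpreter from the directory that contains demo_app.
    return subprocess.run(
        [sys.executable, *args], cwd=root, capture_output=True, text=True
    )


direct = run(str(pkg / "main.py"))      # executed as __main__, no package
as_module = run("-m", "demo_app.main")  # executed with package context

print(direct.returncode, direct.stderr.strip().splitlines()[-1:])
print(as_module.returncode, as_module.stdout.strip())
```

The direct run exits with an error from the relative import, while the `-m` run resolves `.config` against the `demo_app` package and prints the class name.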

## 40.60 Key Points

A module is a runtime object with a namespace dictionary.

Importing a module executes its top-level code once for each name under which it is registered in `sys.modules`; later imports of the same name reuse the cached object.

The import system is built around finders, loaders, module specs, path hooks, and caches.

Packages are modules with submodule search locations.

Circular imports expose partially initialized modules because CPython inserts modules into `sys.modules` before execution.

The import system is programmable through `importlib`, `sys.meta_path`, path hooks, and loaders.

Most import problems come from circular dependencies, path shadowing, direct execution of package files, import-time side effects, or duplicated module identity.
