# 97. Import System Edge Cases

The CPython import system is the machinery that finds, loads, initializes, caches, and returns modules. It looks simple at the Python level:

```python id="97a1"
import json
```

But internally, import is one of CPython’s most complex runtime systems.

It must handle:

```text id="97a2"
source files
bytecode cache files
built-in modules
frozen modules
extension modules
packages
namespace packages
relative imports
import hooks
path hooks
module caching
reloads
partially initialized modules
circular imports
subinterpreters
thread synchronization
```

This chapter focuses on edge cases. These are the cases that explain why the import system has so many layers.

## 97.1 Import Is a Runtime Protocol

Import is not just a filesystem operation.

It is a runtime protocol built around finders, specs, loaders, module objects, and caches.

Conceptually:

```text id="97a3"
import name
    ↓
check sys.modules
    ↓
find module spec
    ↓
create module object
    ↓
insert module into sys.modules
    ↓
execute module code
    ↓
bind name in caller
```

The important point is that module discovery and module execution are separate.

A finder answers:

```text id="97a4"
Can this module be found?
Where is it?
What loader should load it?
Is it a package?
```

A loader answers:

```text id="97a5"
How should the module object be created?
How should its code be executed?
```

This separation allows imports from files, zip archives, frozen code, built-ins, networked systems, generated modules, and custom import mechanisms.
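The flow in this section can be approximated with public `importlib` calls. The following is a minimal sketch, not CPython's actual implementation: the function name `simplified_import` is hypothetical, and it ignores import locks, parent packages, and `from` lists.

```python
import importlib.util
import sys


def simplified_import(name):
    """Approximate the import flow for a top-level module (illustrative only)."""
    # 1. The cache is consulted first.
    if name in sys.modules:
        return sys.modules[name]

    # 2. Finders locate the module and describe it with a spec.
    spec = importlib.util.find_spec(name)
    if spec is None:
        raise ModuleNotFoundError(f"No module named {name!r}", name=name)

    # 3. The loader creates the module object; its body has not run yet.
    module = importlib.util.module_from_spec(spec)

    # 4. The module is cached before execution so circular imports can find it.
    sys.modules[name] = module
    try:
        # 5. The loader executes the module body.
        spec.loader.exec_module(module)
    except BaseException:
        # A failed module is removed so later imports retry from scratch.
        sys.modules.pop(name, None)
        raise
    return sys.modules[name]


colorsys = simplified_import("colorsys")
print(colorsys.__spec__.loader)
```

Steps 2 and 3 to 5 map onto the finder and loader roles: `find_spec` only describes the module, while the loader attached to the spec creates and executes it.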

## 97.2 `sys.modules` Is the First Cache

Before CPython searches the filesystem or calls import hooks, it checks `sys.modules`.

```python id="97a6"
import sys

print(sys.modules["sys"])
```

`sys.modules` maps module names to module objects.

If the requested module already exists there, import usually returns it immediately.

This gives import important properties:

```text id="97a7"
modules execute once
future imports reuse the same module object
module globals preserve state
circular imports can terminate
```

Example:

```python id="97a8"
# config.py
value = 10
```

```python id="97a9"
import config
config.value = 20

import config
print(config.value)
```

The second import sees the same module object.

## 97.3 Partially Initialized Modules

A subtle edge case appears during module execution.

CPython inserts the module into `sys.modules` before executing the module body.

Conceptually:

```text id="97a10"
create module object
insert into sys.modules
execute module code
```

This is necessary for circular imports.

But it means other code may observe a partially initialized module.

Example:

```python id="97a11"
# a.py
import b

x = 1
```

```python id="97a12"
# b.py
import a

print(a.x)
```

When `b` imports `a`, `a` may already exist in `sys.modules`, but `x` may not yet be defined.

This can produce:

```text id="97a13"
AttributeError: partially initialized module ...
```

The module object exists, but its top-level code has not finished.
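The failure can be reproduced in a throwaway directory. This sketch uses hypothetical module names (`partial_a`, `partial_b`) in place of `a` and `b` so it cannot clash with real modules, and it cleans up `sys.path` and `sys.modules` afterwards.

```python
import importlib
import sys
import tempfile
from pathlib import Path

error_message = None
with tempfile.TemporaryDirectory() as tmp:
    # partial_a imports partial_b before defining x.
    Path(tmp, "partial_a.py").write_text("import partial_b\nx = 1\n")
    # partial_b reads partial_a.x while partial_a is still executing.
    Path(tmp, "partial_b.py").write_text("import partial_a\nprint(partial_a.x)\n")
    sys.path.insert(0, tmp)
    try:
        importlib.import_module("partial_a")
    except AttributeError as exc:
        error_message = str(exc)
    finally:
        sys.path.remove(tmp)
        for name in ("partial_a", "partial_b"):
            sys.modules.pop(name, None)

print(error_message)
```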

## 97.4 Circular Imports

Circular imports happen when modules import each other directly or indirectly.

```text id="97a14"
a imports b
b imports c
c imports a
```

Circular imports are allowed, but they are fragile when modules access names too early.

Bad pattern:

```python id="97a15"
# user.py
from service import create_service

class User:
    pass
```

```python id="97a16"
# service.py
from user import User

def create_service():
    return User()
```

This fails whichever module is imported first, because each `from` import asks for a name that the other module has not defined yet.

Safer pattern:

```python id="97a17"
# service.py
def create_service():
    from user import User
    return User()
```

Moving the import inside the function delays it until the function is called, by which time both modules have normally finished initializing.
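A small experiment shows the function-level import resolving the cycle. The module names `deferred_user` and `deferred_service` are hypothetical stand-ins for `user` and `service`, written to a temporary directory for the duration of the demo.

```python
import importlib
import sys
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    Path(tmp, "deferred_user.py").write_text(
        "from deferred_service import create_service\n"
        "class User:\n"
        "    pass\n"
    )
    Path(tmp, "deferred_service.py").write_text(
        "def create_service():\n"
        "    from deferred_user import User  # resolved at call time\n"
        "    return User()\n"
    )
    sys.path.insert(0, tmp)
    try:
        user_module = importlib.import_module("deferred_user")
        # Both modules are fully initialized before the function runs.
        instance = user_module.create_service()
        is_user = isinstance(instance, user_module.User)
    finally:
        sys.path.remove(tmp)
        for name in ("deferred_user", "deferred_service"):
            sys.modules.pop(name, None)

print(is_user)
```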

## 97.5 `import module` vs `from module import name`

These two forms behave differently in circular imports.

```python id="97a18"
import a
```

binds the module object.

```python id="97a19"
from a import x
```

requires `x` to exist at import time.

In circular cases, this difference matters.

If module `a` is only partially initialized, `import a` may succeed, while `from a import x` may fail because `x` has not been assigned yet.

This is why plain module imports are often more robust in tangled module graphs.
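The difference can be observed by running the same cycle twice with only the import form changed. This is a sketch under stated assumptions: `run_cycle` is a hypothetical helper, and `cycle_a` and `cycle_b` are throwaway module names.

```python
import importlib
import sys
import tempfile
from pathlib import Path


def run_cycle(b_source):
    """Import cycle_a, whose partner cycle_b has the given source; report the outcome."""
    with tempfile.TemporaryDirectory() as tmp:
        Path(tmp, "cycle_a.py").write_text("import cycle_b\nx = 1\n")
        Path(tmp, "cycle_b.py").write_text(b_source)
        sys.path.insert(0, tmp)
        try:
            importlib.import_module("cycle_a")
            return "ok"
        except ImportError as exc:
            return f"ImportError: {exc}"
        finally:
            sys.path.remove(tmp)
            for name in ("cycle_a", "cycle_b"):
                sys.modules.pop(name, None)


# Binding the module object works even though cycle_a is half-initialized.
plain_result = run_cycle("import cycle_a\n")
# Asking for a specific name fails because x is not assigned yet.
from_result = run_cycle("from cycle_a import x\n")

print(plain_result)
print(from_result)
```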

## 97.6 Failed Imports and `sys.modules`

If module execution fails, CPython usually removes the failing module from `sys.modules`.

Example:

```python id="97a20"
# broken.py
raise RuntimeError("import failed")
```

```python id="97a21"
try:
    import broken
except RuntimeError:
    pass

import sys
print("broken" in sys.modules)
```

The expected result is usually:

```text id="97a22"
False
```

This prevents future imports from reusing a broken partially initialized module.

However, modules imported successfully as side effects may remain in `sys.modules`.

```python id="97a23"
# broken.py
import helper
raise RuntimeError("import failed")
```

After failure, `broken` may be removed, but `helper` may remain.
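Both behaviours can be checked together. In this sketch, `failing_broken` and `failing_helper` are hypothetical names standing in for `broken` and `helper`.

```python
import importlib
import sys
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    Path(tmp, "failing_helper.py").write_text("ready = True\n")
    Path(tmp, "failing_broken.py").write_text(
        "import failing_helper\n"
        "raise RuntimeError('import failed')\n"
    )
    sys.path.insert(0, tmp)
    try:
        try:
            importlib.import_module("failing_broken")
        except RuntimeError:
            pass
        # The failing module is gone; the module it imported is still cached.
        broken_cached = "failing_broken" in sys.modules
        helper_cached = "failing_helper" in sys.modules
    finally:
        sys.path.remove(tmp)
        for name in ("failing_broken", "failing_helper"):
            sys.modules.pop(name, None)

print(broken_cached, helper_cached)
```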

## 97.7 Module Identity Can Be Surprising

A module is identified by its import name, not just by its file path.

The same file can be loaded twice under different names.

Example layout:

```text id="97a24"
project/
    pkg/
        __init__.py
        mod.py
```

If code runs with an unusual `sys.path`, the same file may be imported as:

```python id="97a25"
import pkg.mod
```

and also as:

```python id="97a26"
import mod
```

Now there may be two module objects backed by one file.

This causes duplicate globals, duplicate classes, and failed identity checks.

Example symptom:

```python id="97a27"
isinstance(obj, MyClass)
```

may return `False` if `obj` was created from `MyClass` in the duplicate module instance.
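The duplicate can be produced deliberately by putting both the project root and the package directory on `sys.path`. The names `dup_pkg` and `dup_mod` below are hypothetical stand-ins for `pkg` and `mod`.

```python
import importlib
import os
import sys
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    pkg_dir = Path(tmp, "dup_pkg")
    pkg_dir.mkdir()
    Path(pkg_dir, "__init__.py").write_text("")
    Path(pkg_dir, "dup_mod.py").write_text("class MyClass:\n    pass\n")

    # The "unusual sys.path": the project root and the package directory.
    extra_entries = [tmp, str(pkg_dir)]
    sys.path[:0] = extra_entries
    try:
        as_submodule = importlib.import_module("dup_pkg.dup_mod")
        as_top_level = importlib.import_module("dup_mod")

        same_file = os.path.samefile(as_submodule.__file__, as_top_level.__file__)
        same_module = as_submodule is as_top_level
        obj = as_submodule.MyClass()
        passes_check = isinstance(obj, as_top_level.MyClass)
    finally:
        for entry in extra_entries:
            sys.path.remove(entry)
        for name in ("dup_pkg", "dup_pkg.dup_mod", "dup_mod"):
            sys.modules.pop(name, None)

print(same_file, same_module, passes_check)
```

One file on disk produces two module objects and two distinct `MyClass` objects, which is why the `isinstance` check fails.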

## 97.8 Running a Module as `__main__`

When Python executes a file directly:

```bash id="97a28"
python pkg/mod.py
```

the module name is usually:

```text id="97a29"
__main__
```

But when imported:

```python id="97a30"
import pkg.mod
```

the name is:

```text id="97a31"
pkg.mod
```

This can create two module instances:

```text id="97a32"
__main__
pkg.mod
```

A common symptom is duplicated class definitions.

Better command:

```bash id="97a33"
python -m pkg.mod
```

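This locates the module through the import machinery and sets `__package__`, so relative imports work. The code still runs under the name `__main__`, so a later `import pkg.mod` elsewhere can still create a second instance.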

## 97.9 Relative Import Edge Cases

Relative imports depend on package context.

Example:

```python id="97a34"
from .utils import parse
```

This requires the current module to know its package.

If a file inside a package is executed directly, CPython runs it as a top-level script with no package context, so relative imports fail with `ImportError: attempted relative import with no known parent package`.

Bad:

```bash id="97a35"
python pkg/tool.py
```

Better:

```bash id="97a36"
python -m pkg.tool
```

The `-m` form makes CPython locate the module through the import system.
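The two launch modes can be compared by running each in a child interpreter. This sketch builds a hypothetical package `relpkg` in a temporary directory and assumes the default `sys.path` behaviour (no `-P` flag or `PYTHONSAFEPATH`), so that `-m` can find the package in the working directory.

```python
import subprocess
import sys
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    pkg_dir = Path(tmp, "relpkg")
    pkg_dir.mkdir()
    Path(pkg_dir, "__init__.py").write_text("")
    Path(pkg_dir, "utils.py").write_text("def parse(text):\n    return text.split()\n")
    Path(pkg_dir, "tool.py").write_text("from .utils import parse\nprint(parse('a b'))\n")

    # Direct execution: tool.py runs as a script with no package context.
    direct = subprocess.run(
        [sys.executable, str(pkg_dir / "tool.py")],
        capture_output=True, text=True, cwd=tmp,
    )
    # Module execution: the import system resolves relpkg.tool from cwd.
    as_module = subprocess.run(
        [sys.executable, "-m", "relpkg.tool"],
        capture_output=True, text=True, cwd=tmp,
    )

print(direct.returncode, direct.stderr)
print(as_module.returncode, as_module.stdout)
```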

## 97.10 Packages and `__path__`

A package is a module with a `__path__`.

```python id="97a37"
import email

print(email.__path__)
```

The `__path__` tells import machinery where to search for submodules.

For:

```python id="97a38"
import pkg.sub
```

CPython first imports `pkg`, then searches `pkg.__path__` for `sub`. `sys.path` is consulted only for the top-level name.

This distinction explains why packages can control submodule discovery.

Custom packages may even modify `__path__` dynamically.
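A quick check with standard library modules illustrates the role of `__path__`. The sketch assumes a conventional CPython installation where `email` is loaded from a directory on disk.

```python
import colorsys
import email
import importlib.util
import os

# A package exposes __path__; a plain module does not.
email_is_package = hasattr(email, "__path__")
colorsys_is_package = hasattr(colorsys, "__path__")

# The submodule's spec is located through the parent's __path__.
spec = importlib.util.find_spec("email.message")
found_in_package_path = any(
    os.path.samefile(os.path.dirname(spec.origin), entry)
    for entry in email.__path__
)

print(email_is_package, colorsys_is_package, found_in_package_path)
```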

## 97.11 Namespace Packages

Namespace packages allow one logical package to span multiple directories.

Example:

```text id="97a39"
/site1/plugins/foo.py
/site2/plugins/bar.py
```

Both directories may contribute to package `plugins`.

A namespace package may have no `__init__.py`.

This creates edge cases:

```text id="97a40"
package contents depend on sys.path order
different installations contribute different submodules
missing __init__.py is intentional
package identity comes from merged search locations
```

Namespace packages are useful for plugin systems, but they make import resolution less obvious.
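The merging behaviour can be seen with two temporary directories that each contribute one submodule. The package name `nsdemo_plugins` is hypothetical and stands in for `plugins`.

```python
import importlib
import sys
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as site1, tempfile.TemporaryDirectory() as site2:
    for site, submodule in ((site1, "foo"), (site2, "bar")):
        portion = Path(site, "nsdemo_plugins")
        portion.mkdir()  # deliberately no __init__.py
        Path(portion, f"{submodule}.py").write_text(f"name = {submodule!r}\n")

    sys.path[:0] = [site1, site2]
    try:
        foo = importlib.import_module("nsdemo_plugins.foo")
        bar = importlib.import_module("nsdemo_plugins.bar")
        package = sys.modules["nsdemo_plugins"]
        # __path__ lists every directory that contributes to the package.
        search_locations = list(package.__path__)
        package_file = getattr(package, "__file__", None)
    finally:
        sys.path.remove(site1)
        sys.path.remove(site2)
        for name in ("nsdemo_plugins", "nsdemo_plugins.foo", "nsdemo_plugins.bar"):
            sys.modules.pop(name, None)

print(search_locations)
print(package_file)
```

The package has two search locations and no file of its own, which is what makes its contents depend on how `sys.path` is arranged.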

## 97.12 Shadowing Standard Library Modules

Imports search paths in order.

A local file can shadow a standard library module.

Example:

```text id="97a41"
project/
    random.py
```

Then:

```python id="97a42"
import random
```

may import the local file instead of the standard library `random`.

This can produce confusing errors.

Example:

```python id="97a43"
# random.py
import random
print(random.randint(1, 10))
```

The local module imports itself and observes a partially initialized module.

## 97.13 `sys.path` Initialization

`sys.path` determines where imports search for top-level modules.

It is affected by:

```text id="97a44"
script location
current working directory
PYTHONPATH
virtual environments
site initialization
.pth files
installation layout
embedded Python configuration
```

This means the same program may import different modules depending on launch mode.

Example:

```bash id="97a45"
python app.py
```

and:

```bash id="97a46"
python -m app
```

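can initialize import context differently: by default, `python app.py` puts the script's directory at the front of `sys.path`, while `python -m app` puts the current working directory there.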

## 97.14 Import Hooks

CPython supports custom import hooks through `sys.meta_path` and `sys.path_hooks`.

`sys.meta_path` contains meta path finders.

```python id="97a47"
import sys

for finder in sys.meta_path:
    print(finder)
```

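A meta path finder can intercept imports before normal filesystem lookup. Path hooks work one level down: the default path-based finder calls them to build a finder for each entry on `sys.path` or a package's `__path__`, which is how zip imports are implemented.

Together, these hooks enable: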

```text id="97a48"
frozen imports
built-in imports
zip imports
custom plugin loaders
remote module systems
test mocking
import instrumentation
```

Import hooks are powerful because they participate in core module resolution.

## 97.15 Meta Path Finder Edge Cases

A broken meta path finder can disrupt every import.

Example:

```python id="97a49"
class BrokenFinder:
    def find_spec(self, fullname, path, target=None):
        raise RuntimeError("broken finder")

import sys
sys.meta_path.insert(0, BrokenFinder())

import json
```

The import fails before normal finders get a chance.

Finders must follow the protocol carefully:

```text id="97a50"
return a spec if handled
return None if not handled
raise only for actual errors
```

Returning `None` means “I do not handle this import.”
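For contrast with `BrokenFinder`, here is a minimal sketch of a well-behaved finder that observes lookups and always declines. The class name `LoggingFinder` and the module name `no_such_module_demo_97` are hypothetical.

```python
import importlib.abc
import importlib.util
import sys


class LoggingFinder(importlib.abc.MetaPathFinder):
    """Record every lookup, then decline so later finders can handle it."""

    def __init__(self):
        self.seen = []

    def find_spec(self, fullname, path, target=None):
        self.seen.append(fullname)
        return None  # "I do not handle this import."


finder = LoggingFinder()
sys.meta_path.insert(0, finder)
try:
    # The name does not exist, so every finder is consulted and declines.
    spec = importlib.util.find_spec("no_such_module_demo_97")
finally:
    sys.meta_path.remove(finder)

print(finder.seen, spec)
```

Removing the finder in a `finally` block matters: a finder left on `sys.meta_path` keeps participating in every later import.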

## 97.16 Module Specs

Modern import machinery uses `ModuleSpec`.

A spec describes how a module should be loaded.

Important fields include:

```text id="97a51"
name
loader
origin
submodule_search_locations
cached
has_location
```

You can inspect it:

```python id="97a52"
import json

print(json.__spec__)
print(json.__spec__.origin)
print(json.__spec__.loader)
```

The spec is the import system’s plan for a module.

## 97.17 Loaders and Execution

A loader may implement module creation and execution.

Conceptually:

```text id="97a53"
create_module(spec)
exec_module(module)
```

`create_module` may return a custom module object, or `None` to request the default module type.

`exec_module` initializes it.

This separation lets loaders control module object creation while keeping execution explicit.
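The two loader methods can be seen in a small importer that serves modules from in-memory source strings. This is a sketch, not a production hook: the class `StringImporter` and the module name `generated_greeting` are hypothetical, and the same object acts as both finder and loader.

```python
import importlib
import importlib.abc
import importlib.util
import sys


class StringImporter(importlib.abc.MetaPathFinder, importlib.abc.Loader):
    """Finder and loader for modules whose source is held in a dict."""

    def __init__(self, sources):
        self.sources = sources

    def find_spec(self, fullname, path, target=None):
        if fullname in self.sources:
            return importlib.util.spec_from_loader(fullname, self)
        return None  # not ours; let other finders try

    def create_module(self, spec):
        return None  # use the default module object

    def exec_module(self, module):
        # Run the stored source in the module's namespace.
        exec(self.sources[module.__name__], module.__dict__)


importer = StringImporter(
    {"generated_greeting": "def greet(name):\n    return 'hello ' + name\n"}
)
sys.meta_path.insert(0, importer)
try:
    generated = importlib.import_module("generated_greeting")
    greeting = generated.greet("world")
finally:
    sys.meta_path.remove(importer)
    sys.modules.pop("generated_greeting", None)

print(greeting)
```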

## 97.18 Bytecode Cache Files

CPython may store compiled bytecode in `__pycache__`.

Example:

```text id="97a54"
__pycache__/mod.cpython-313.pyc
```

The `.pyc` file avoids recompiling source every time.

It contains:

```text id="97a55"
magic number
cache metadata
marshaled code object
```

Edge cases include:

```text id="97a56"
stale bytecode
hash-based pyc files
timestamp mismatch
read-only filesystems
different optimization levels
version-specific cache tags
```

A `.pyc` file is specific to a CPython bytecode format version.
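The layout can be inspected by compiling a small file and reading the result back. This sketch assumes the 16-byte header used by CPython 3.7 and later (magic number, flags, then source mtime and size for timestamp-based files) and forces timestamp invalidation so the flags field is zero.

```python
import importlib.util
import marshal
import os
import py_compile
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    source_path = os.path.join(tmp, "pyc_demo.py")
    with open(source_path, "w") as f:
        f.write("answer = 6 * 7\n")

    # Writes into __pycache__ next to the source and returns the .pyc path.
    pyc_path = py_compile.compile(
        source_path,
        doraise=True,
        invalidation_mode=py_compile.PycInvalidationMode.TIMESTAMP,
    )
    with open(pyc_path, "rb") as f:
        data = f.read()
    source_size = os.path.getsize(source_path)

magic = data[:4]                                        # bytecode format version
flags = int.from_bytes(data[4:8], "little")             # 0 means timestamp-based
recorded_size = int.from_bytes(data[12:16], "little")   # source size at compile time
code = marshal.loads(data[16:])                         # the marshaled code object

namespace = {}
exec(code, namespace)
print(os.path.basename(pyc_path), flags, namespace["answer"])
```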

## 97.19 Source vs Bytecode Loading

CPython may load from source and write bytecode, or load bytecode directly.

If source exists and bytecode cache is valid:

```text id="97a57"
load pyc
execute code object
```

If bytecode is invalid or missing:

```text id="97a58"
read source
compile source
write pyc if allowed
execute code object
```

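If source is missing but a bytecode file exists, the outcome depends on where the file sits: a `.pyc` inside `__pycache__` is ignored without its source, while a `.pyc` placed directly where the source file would be can be imported on its own.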
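The sourceless case can be tried by compiling a file to a `.pyc` beside the source and then deleting the source. The module name `sourceless_demo` is hypothetical.

```python
import importlib
import os
import py_compile
import sys
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    source_path = os.path.join(tmp, "sourceless_demo.py")
    with open(source_path, "w") as f:
        f.write("answer = 42\n")

    # Compile next to the source (not into __pycache__), then remove the source.
    py_compile.compile(
        source_path,
        cfile=os.path.join(tmp, "sourceless_demo.pyc"),
        doraise=True,
    )
    os.remove(source_path)

    sys.path.insert(0, tmp)
    try:
        module = importlib.import_module("sourceless_demo")
        loader_name = type(module.__spec__.loader).__name__
        answer = module.answer
    finally:
        sys.path.remove(tmp)
        sys.modules.pop("sourceless_demo", None)

print(loader_name, answer)
```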

## 97.20 Extension Module Imports

Extension modules are native shared libraries.

Examples:

```text id="97a59"
_module.cpython-313-x86_64-linux-gnu.so
_module.pyd
```

Importing an extension module loads native code into the process.

Edge cases include:

```text id="97a60"
ABI mismatch
missing shared library dependency
wrong platform tag
initialization failure
subinterpreter incompatibility
global C state
crashes during import
```

Unlike Python source modules, extension modules can crash the interpreter during import.

## 97.21 Built-in and Frozen Modules

Some modules are built into the interpreter.

Some are frozen, meaning their code is embedded into the CPython binary.

These modules do not require normal filesystem lookup.

They matter during startup because the import system itself needs modules before the full filesystem-based import machinery is ready.

Frozen modules help bootstrap importlib and early runtime initialization.
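Specs make the distinction visible. In this short sketch, `sys` serves as a built-in module and `_frozen_importlib` (the frozen copy of the import bootstrap code) as a frozen one.

```python
import importlib.util
import sys

sys_spec = importlib.util.find_spec("sys")
bootstrap_spec = importlib.util.find_spec("_frozen_importlib")

# Neither module has a filesystem location; origin names the mechanism instead.
print(sys_spec.origin, sys_spec.has_location)
print(bootstrap_spec.origin, bootstrap_spec.has_location)
print("sys" in sys.builtin_module_names)
```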

## 97.22 Reloading Modules

`importlib.reload()` re-executes module code in an existing module object.

```python id="97a61"
import importlib
import config

importlib.reload(config)
```

Reloading does not create a fully clean module by default.

Old names may remain if the new code no longer defines them.

Example:

```python id="97a62"
# first version
x = 1
y = 2
```

After editing to:

```python id="97a63"
x = 10
```

reloading leaves `y` in the module dictionary, because the module's existing namespace is reused rather than cleared.

Reload is useful for development, but it is not a full restart.
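The leftover name is easy to observe with a temporary module. In this sketch the hypothetical module `reload_demo` is rewritten on disk between the import and the reload.

```python
import importlib
import sys
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp, "reload_demo.py")
    path.write_text("x = 1\ny = 2\n")
    sys.path.insert(0, tmp)
    try:
        module = importlib.import_module("reload_demo")
        # Edit the source so y is no longer defined. The changed file size
        # makes the cached bytecode stale even if the timestamp is unchanged.
        path.write_text("x = 10\n")
        reloaded = importlib.reload(module)
        x_after = module.x
        y_after = module.y  # still present from the first execution
    finally:
        sys.path.remove(tmp)
        sys.modules.pop("reload_demo", None)

print(reloaded is module, x_after, y_after)
```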

## 97.23 Import Locks

CPython uses import locks, held per module, to prevent unsafe concurrent imports.

Without locking, two threads could import and initialize the same module at once.

Conceptually:

```text id="97a64"
Thread A starts importing module M
Thread B starts importing module M
both execute top-level code
```

The import lock prevents duplicate initialization.

However, import locks can interact badly with circular imports and threads if module top-level code waits for other threads that are also importing.
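The single-initialization guarantee can be checked with a module that is slow to import and records each execution of its body. The module name `locked_demo` and the log-file technique are assumptions made for this sketch.

```python
import importlib
import sys
import tempfile
import threading
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    Path(tmp, "locked_demo.py").write_text(
        "import time\n"
        "from pathlib import Path\n"
        "with open(Path(__file__).with_suffix('.log'), 'a') as log:\n"
        "    log.write('ran\\n')\n"
        "time.sleep(0.2)  # keep the import in progress while other threads arrive\n"
    )
    sys.path.insert(0, tmp)
    results = []

    def worker():
        results.append(importlib.import_module("locked_demo"))

    threads = [threading.Thread(target=worker) for _ in range(4)]
    try:
        for thread in threads:
            thread.start()
        for thread in threads:
            thread.join()
        # One line per execution of the module body.
        run_count = len(Path(tmp, "locked_demo.log").read_text().splitlines())
    finally:
        sys.path.remove(tmp)
        sys.modules.pop("locked_demo", None)

print(run_count, len(results))
```

All four threads receive the same module object, and the body runs once: the threads that arrive late wait for the first import to finish instead of starting their own.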

## 97.24 Import-Time Side Effects

Import executes top-level code.

Example:

```python id="97a65"
# app.py
print("starting")
connect_to_database()
register_handlers()
```

Importing this module performs those effects immediately.

This creates problems:

```text id="97a66"
slow imports
network access during import
test fragility
circular import failures
hidden global state
bad startup behavior
```

A safer pattern keeps top-level code limited to definitions:

```python id="97a67"
def main():
    connect_to_database()
    register_handlers()

if __name__ == "__main__":
    main()
```

## 97.25 Lazy Imports

Lazy imports delay module loading until a name is actually used.

They can improve startup time, but introduce edge cases:

```text id="97a68"
errors appear later
import timing changes
side effects move
debugging becomes harder
circular imports change shape
```

Lazy loading changes when module code executes, which can affect programs that rely on import-time registration.
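The standard library offers one way to do this with `importlib.util.LazyLoader`. The sketch below follows the pattern shown in the `importlib` documentation; the helper name `lazy_import` and the module `lazy_demo` are hypothetical, and the module writes a log file only so the moment of execution can be observed.

```python
import importlib.util
import sys
import tempfile
from pathlib import Path


def lazy_import(name):
    """Return a module whose body runs on first attribute access."""
    spec = importlib.util.find_spec(name)
    loader = importlib.util.LazyLoader(spec.loader)
    spec.loader = loader
    module = importlib.util.module_from_spec(spec)
    sys.modules[name] = module
    loader.exec_module(module)  # sets up deferral; the body has not run yet
    return module


with tempfile.TemporaryDirectory() as tmp:
    Path(tmp, "lazy_demo.py").write_text(
        "from pathlib import Path\n"
        "Path(__file__).with_suffix('.log').write_text('ran')\n"
        "value = 42\n"
    )
    log_path = Path(tmp, "lazy_demo.log")
    sys.path.insert(0, tmp)
    try:
        module = lazy_import("lazy_demo")
        executed_before_use = log_path.exists()
        value = module.value  # first attribute access triggers execution
        executed_after_use = log_path.exists()
    finally:
        sys.path.remove(tmp)
        sys.modules.pop("lazy_demo", None)

print(executed_before_use, value, executed_after_use)
```

The side effect moves from the import statement to the first attribute access, which is exactly the timing shift that import-time registration code has to account for.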

## 97.26 Import and Subinterpreters

Subinterpreters complicate imports.

Each interpreter should have separate module state:

```text id="97a69"
interpreter A imports module M
interpreter B imports module M
```

Each interpreter has its own `sys.modules`, so these imports normally create separate module objects.

Extension modules must be careful because process-global C state can accidentally leak across interpreters.

Subinterpreter-safe modules should use per-module state instead of static global state.

## 97.27 Practical Rules

Use these rules to avoid most import edge cases:

```text id="97a70"
avoid circular imports
prefer absolute imports inside packages
run package modules with python -m
avoid naming files after standard library modules
keep top-level module code cheap
avoid global mutable initialization during import
prefer local imports only to break cycles or reduce startup cost
design extension modules with per-module state
treat reload as partial re-execution, not a clean reset
```

## 97.28 Mental Model

Use this model:

```text id="97a71"
Import first checks sys.modules.

If absent:
    find a ModuleSpec
    create or obtain a module object
    insert it into sys.modules
    execute module code
    return the module

A module can exist before it is fully initialized.

The same file can become different modules if imported under different names.

Packages search through __path__.

Import hooks can replace normal resolution.

Extension modules load native code and can break process safety.

Subinterpreters require module state isolation.
```

## 97.29 Chapter Summary

The CPython import system is a runtime protocol, not a simple file loader.

Most edge cases come from a few core facts:

```text id="97a72"
modules are cached in sys.modules
modules are inserted before execution completes
imports execute top-level code
module identity depends on import name
packages search through __path__
custom hooks can alter resolution
extension modules carry native runtime risks
```

Understanding these details explains circular import failures, duplicated modules, relative import errors, standard library shadowing, reload surprises, and subinterpreter complications.
