97. Import System Edge Cases

The CPython import system is the machinery that finds, loads, initializes, caches, and returns modules. It looks simple at the Python level:

import json

But internally, import is one of CPython’s most complex runtime systems.

It must handle:

source files
bytecode cache files
built-in modules
frozen modules
extension modules
packages
namespace packages
relative imports
import hooks
path hooks
module caching
reloads
partially initialized modules
circular imports
subinterpreters
thread synchronization

This chapter focuses on edge cases. These are the cases that explain why the import system has so many layers.

97.1 Import Is a Runtime Protocol

Import is not just a filesystem operation.

It is a runtime protocol built around finders, specs, loaders, module objects, and caches.

Conceptually:

import name
    ↓
check sys.modules
    ↓
find module spec
    ↓
create module object
    ↓
insert module into sys.modules
    ↓
execute module code
    ↓
bind name in caller

The important point is that module discovery and module execution are separate.

A finder answers:

Can this module be found?
Where is it?
What loader should load it?
Is it a package?

A loader answers:

How should the module object be created?
How should its code be executed?

This separation allows imports from files, zip archives, frozen code, built-ins, networked systems, generated modules, and custom import mechanisms.
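
The split between discovery and execution can be observed directly with importlib.util. The following is a minimal sketch, assuming the standard library module colorsys is available. It obtains a spec without running module code, then creates and executes the module as separate steps. The module is deliberately not placed in sys.modules, so this is not a full import.

```python
import importlib.util

# Discovery: the finders produce a spec; no module code runs here.
spec = importlib.util.find_spec("colorsys")
print(spec.name, spec.origin)

# Creation: an empty module object built from the spec.
module = importlib.util.module_from_spec(spec)
defined_before = hasattr(module, "rgb_to_hsv")

# Execution: the loader runs the module body inside that object.
spec.loader.exec_module(module)
defined_after = hasattr(module, "rgb_to_hsv")

print(defined_before, defined_after)
```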

97.2 `sys.modules` Is the First Cache

Before CPython searches the filesystem or calls import hooks, it checks sys.modules.

import sys

print(sys.modules["sys"])

sys.modules maps module names to module objects.

If the requested module already exists there, import usually returns it immediately.

This gives import important properties:

modules execute once
future imports reuse the same module object
module globals preserve state
circular imports can terminate

Example:

# config.py
value = 10

# main.py
import config
config.value = 20

import config  # served from sys.modules; config.py does not run again
print(config.value)  # 20

The second import sees the same module object.
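
The caching behavior can be checked with a throwaway module. This is a sketch under stated assumptions: the module name cache_demo_config is hypothetical, and the file is written to a temporary directory that is added to sys.path only for the duration of the demonstration.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Hypothetical module name, chosen to avoid clashing with real modules.
NAME = "cache_demo_config"

with tempfile.TemporaryDirectory() as tmp:
    # The module body announces each execution by printing.
    Path(tmp, f"{NAME}.py").write_text("print('executing body')\nvalue = 10\n")
    sys.path.insert(0, tmp)
    importlib.invalidate_caches()
    try:
        first = importlib.import_module(NAME)   # runs the body once
        first.value = 20
        second = importlib.import_module(NAME)  # cache hit, body does not run
        same_object = second is first
        in_cache = sys.modules[NAME] is first
        seen_value = second.value
    finally:
        sys.path.remove(tmp)
        sys.modules.pop(NAME, None)

print(same_object, in_cache, seen_value)
```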

97.3 Partially Initialized Modules

A subtle edge case appears during module execution.

CPython inserts the module into sys.modules before executing the module body.

Conceptually:

create module object
insert into sys.modules
execute module code

This is necessary for circular imports.

But it means other code may observe a partially initialized module.

Example:

# a.py
import b

x = 1

# b.py
import a

print(a.x)

If the program starts with import a, then a is already in sys.modules when b runs import a, but x = 1 has not executed yet.

This can produce:

AttributeError: partially initialized module ...

The module object exists, but its top-level code has not finished.
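
The failure can be reproduced in isolation. The sketch below assumes two hypothetical modules, partial_a and partial_b, that stand in for a.py and b.py and are written to a temporary directory.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Hypothetical module names standing in for a.py and b.py.
SOURCES = {
    "partial_a": "import partial_b\nx = 1\n",
    "partial_b": "import partial_a\nprint(partial_a.x)\n",
}

error = None
with tempfile.TemporaryDirectory() as tmp:
    for name, text in SOURCES.items():
        Path(tmp, f"{name}.py").write_text(text)
    sys.path.insert(0, tmp)
    importlib.invalidate_caches()
    try:
        importlib.import_module("partial_a")
    except AttributeError as exc:
        # partial_b saw partial_a before "x = 1" had run.
        error = exc
    finally:
        sys.path.remove(tmp)
        for name in SOURCES:
            sys.modules.pop(name, None)

print(type(error).__name__, error)
```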

97.4 Circular Imports

Circular imports happen when modules import each other directly or indirectly.

a imports b
b imports c
c imports a

Circular imports are allowed, but they are fragile when modules access names too early.

Bad pattern:

# user.py
from service import create_service

class User:
    pass

# service.py
from user import User

def create_service():
    return User()

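This fails whichever module is imported first. Each from ... import statement runs while the other module is still partially initialized, so the requested name does not exist yet and CPython raises ImportError.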

Safer pattern:

# service.py
def create_service():
    from user import User
    return User()

Moving the import inside the function delays name resolution until the function is called, by which time both modules have normally finished initializing.
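
A runnable version of the function-local import is sketched below. The module names cycle_user and cycle_service are hypothetical stand-ins for user.py and service.py, created in a temporary directory.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Hypothetical stand-ins for user.py and service.py.
SOURCES = {
    "cycle_user": (
        "from cycle_service import create_service\n"
        "\n"
        "class User:\n"
        "    pass\n"
    ),
    "cycle_service": (
        "def create_service():\n"
        "    from cycle_user import User  # resolved when the function is called\n"
        "    return User()\n"
    ),
}

with tempfile.TemporaryDirectory() as tmp:
    for name, text in SOURCES.items():
        Path(tmp, f"{name}.py").write_text(text)
    sys.path.insert(0, tmp)
    importlib.invalidate_caches()
    try:
        user = importlib.import_module("cycle_user")  # the cycle now completes
        created = user.create_service()
        created_type = type(created).__name__
        is_user = isinstance(created, user.User)
    finally:
        sys.path.remove(tmp)
        for name in SOURCES:
            sys.modules.pop(name, None)

print(created_type, is_user)
```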

97.5 `import module` vs `from module import name`

These two forms behave differently in circular imports.

import a

binds the module object.

from a import x

requires x to exist at import time.

In circular cases, this difference matters.

If module a is only partially initialized, import a may succeed, while from a import x may fail because x has not been assigned yet.

This is why plain module imports are often more robust in tangled module graphs.
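
The difference between the two forms can be demonstrated with the same cycle written both ways. In this sketch the module names form_a and form_b and the helper import_cycle are hypothetical; only the import statement inside form_b changes between the two runs.

```python
import importlib
import sys
import tempfile
from pathlib import Path

def import_cycle(b_source):
    """Import hypothetical module form_a, whose partner form_b uses b_source."""
    sources = {"form_a": "import form_b\nx = 1\n", "form_b": b_source}
    with tempfile.TemporaryDirectory() as tmp:
        for name, text in sources.items():
            Path(tmp, f"{name}.py").write_text(text)
        sys.path.insert(0, tmp)
        importlib.invalidate_caches()
        try:
            importlib.import_module("form_a")
            return "ok"
        except ImportError as exc:
            return f"ImportError: {exc}"
        finally:
            sys.path.remove(tmp)
            for name in sources:
                sys.modules.pop(name, None)

plain = import_cycle("import form_a\n")             # only binds the module object
from_form = import_cycle("from form_a import x\n")  # needs x immediately

print(plain)
print(from_form)
```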

97.6 Failed Imports and `sys.modules`

If module execution fails, CPython usually removes the failing module from sys.modules.

Example:

# broken.py
raise RuntimeError("import failed")

# main.py
import sys

try:
    import broken
except RuntimeError:
    pass

print("broken" in sys.modules)

The expected result is usually:

False

This prevents future imports from reusing a broken partially initialized module.

However, modules imported successfully as side effects may remain in sys.modules.

# broken.py
import helper
raise RuntimeError("import failed")

After failure, broken may be removed, but helper may remain.
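
Both effects can be observed together. The sketch below uses the hypothetical module names fail_demo_broken and fail_demo_helper in a temporary directory and inspects sys.modules after the failed import.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Hypothetical stand-ins for broken.py and helper.py.
SOURCES = {
    "fail_demo_helper": "ready = True\n",
    "fail_demo_broken": (
        "import fail_demo_helper\n"
        "raise RuntimeError('import failed')\n"
    ),
}

with tempfile.TemporaryDirectory() as tmp:
    for name, text in SOURCES.items():
        Path(tmp, f"{name}.py").write_text(text)
    sys.path.insert(0, tmp)
    importlib.invalidate_caches()
    try:
        try:
            importlib.import_module("fail_demo_broken")
        except RuntimeError:
            pass
        broken_cached = "fail_demo_broken" in sys.modules
        helper_cached = "fail_demo_helper" in sys.modules
    finally:
        sys.path.remove(tmp)
        for name in SOURCES:
            sys.modules.pop(name, None)

print(broken_cached, helper_cached)
```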

97.7 Module Identity Can Be Surprising

A module is identified by its import name, not just by its file path.

The same file can be loaded twice under different names.

Example layout:

project/
    pkg/
        __init__.py
        mod.py

If code runs with an unusual sys.path, the same file may be imported as:

import pkg.mod

and also as:

import mod

Now there may be two module objects backed by one file.

This causes duplicate globals, duplicate classes, and failed identity checks.

Example symptom:

isinstance(obj, MyClass)

may return False if obj was created from MyClass in the duplicate module instance.
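
The duplicate can be produced on purpose. This sketch assumes a hypothetical package dup_pkg containing dup_mod.py, and places both the project root and the package directory on sys.path to imitate the unusual configuration described above.

```python
import importlib
import os
import sys
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    pkg_dir = Path(tmp, "dup_pkg")
    pkg_dir.mkdir()
    Path(pkg_dir, "__init__.py").write_text("")
    Path(pkg_dir, "dup_mod.py").write_text("class MyClass:\n    pass\n")
    # The unusual sys.path: project root and package directory together.
    extra_paths = [tmp, str(pkg_dir)]
    sys.path[:0] = extra_paths
    importlib.invalidate_caches()
    try:
        via_package = importlib.import_module("dup_pkg.dup_mod")
        via_toplevel = importlib.import_module("dup_mod")
        same_module = via_package is via_toplevel
        same_file = os.path.samefile(via_package.__file__, via_toplevel.__file__)
        obj = via_toplevel.MyClass()
        recognised = isinstance(obj, via_package.MyClass)
    finally:
        for entry in extra_paths:
            sys.path.remove(entry)
        for name in ("dup_pkg.dup_mod", "dup_pkg", "dup_mod"):
            sys.modules.pop(name, None)

print(same_module, same_file, recognised)
```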

97.8 Running a Module as `__main__`

When Python executes a file directly:

python pkg/mod.py

the module name is usually:

__main__

But when imported:

import pkg.mod

the name is:

pkg.mod

This can create two module instances:

__main__
pkg.mod

A common symptom is duplicated class definitions.

Better command:

python -m pkg.mod

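This locates the module through the import machinery and sets its package context, so relative imports work. The code still runs under the name __main__, so a later import pkg.mod elsewhere in the program would still create a second module object.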
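
The two module instances can be imitated inside one process with runpy, which is the standard library module behind the -m switch. This is a rough sketch, not a reproduction of interpreter startup; the package name main_demo_pkg is hypothetical.

```python
import importlib
import runpy
import sys
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    pkg_dir = Path(tmp, "main_demo_pkg")
    pkg_dir.mkdir()
    Path(pkg_dir, "__init__.py").write_text("")
    Path(pkg_dir, "mod.py").write_text("class Thing:\n    pass\n")
    sys.path.insert(0, tmp)
    importlib.invalidate_caches()
    try:
        # Run the module's code under the name __main__.
        main_globals = runpy.run_module("main_demo_pkg.mod", run_name="__main__")
        # A later import by name executes the same file a second time.
        imported = importlib.import_module("main_demo_pkg.mod")
        main_name = main_globals["__name__"]
        imported_name = imported.__name__
        same_class = main_globals["Thing"] is imported.Thing
    finally:
        sys.path.remove(tmp)
        for name in ("main_demo_pkg.mod", "main_demo_pkg"):
            sys.modules.pop(name, None)

print(main_name, imported_name, same_class)
```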

97.9 Relative Import Edge Cases

Relative imports depend on package context.

Example:

from .utils import parse

This requires the current module to know its package.

If a file inside a package is executed directly, it runs as __main__ with no parent package, so the relative import fails with ImportError: attempted relative import with no known parent package.

Bad:

python pkg/tool.py

Better:

python -m pkg.tool

The -m form makes CPython locate the module through the import system.
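
The two launch styles can be approximated in-process with runpy: run_path executes a file by path, while run_module locates it by name as -m does. The package rel_demo_pkg and its files are hypothetical, and the sketch only approximates what the command line does.

```python
import importlib
import runpy
import sys
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    pkg_dir = Path(tmp, "rel_demo_pkg")
    pkg_dir.mkdir()
    Path(pkg_dir, "__init__.py").write_text("")
    Path(pkg_dir, "utils.py").write_text("def parse(text):\n    return int(text)\n")
    Path(pkg_dir, "tool.py").write_text(
        "from .utils import parse\nresult = parse('42')\n"
    )
    sys.path.insert(0, tmp)
    importlib.invalidate_caches()
    try:
        # Similar to "python rel_demo_pkg/tool.py": no package context.
        try:
            runpy.run_path(str(pkg_dir / "tool.py"), run_name="__main__")
            direct_outcome = "ok"
        except ImportError as exc:
            direct_outcome = f"ImportError: {exc}"
        # Similar to "python -m rel_demo_pkg.tool": the package is known.
        module_globals = runpy.run_module("rel_demo_pkg.tool", run_name="__main__")
        module_result = module_globals["result"]
        module_package = module_globals["__package__"]
    finally:
        sys.path.remove(tmp)
        for name in ("rel_demo_pkg.utils", "rel_demo_pkg"):
            sys.modules.pop(name, None)

print(direct_outcome)
print(module_result, module_package)
```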

97.10 Packages and `__path__`

A package is a module with a __path__.

import email

print(email.__path__)

The __path__ tells import machinery where to search for submodules.

For:

import pkg.sub

CPython first imports pkg, then searches pkg.__path__ for sub. sys.path is not consulted for the submodule.

This distinction explains why packages can control submodule discovery.

Custom packages may even modify __path__ dynamically.
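
A small experiment shows __path__ controlling submodule discovery. In this sketch, the package path_demo_pkg and the directory elsewhere are hypothetical; a submodule stored outside the package directory becomes importable once its directory is appended to __path__.

```python
import importlib
import sys
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    pkg_dir = Path(tmp, "path_demo_pkg")
    pkg_dir.mkdir()
    Path(pkg_dir, "__init__.py").write_text("")
    extra_dir = Path(tmp, "elsewhere")
    extra_dir.mkdir()
    Path(extra_dir, "plugin.py").write_text("label = 'found via __path__'\n")
    sys.path.insert(0, tmp)
    importlib.invalidate_caches()
    try:
        pkg = importlib.import_module("path_demo_pkg")
        try:
            importlib.import_module("path_demo_pkg.plugin")
            found_before = True
        except ModuleNotFoundError:
            found_before = False
        # Extend the package's own search locations.
        pkg.__path__.append(str(extra_dir))
        plugin = importlib.import_module("path_demo_pkg.plugin")
        found_after = plugin.__name__
    finally:
        sys.path.remove(tmp)
        for name in ("path_demo_pkg.plugin", "path_demo_pkg"):
            sys.modules.pop(name, None)

print(found_before, found_after)
```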

97.11 Namespace Packages

Namespace packages allow one logical package to span multiple directories.

Example:

/site1/plugins/foo.py
/site2/plugins/bar.py

Both directories may contribute to package plugins.

A namespace package may have no __init__.py.

This creates edge cases:

package contents depend on sys.path order
different installations contribute different submodules
missing __init__.py is intentional
package identity comes from merged search locations

Namespace packages are useful for plugin systems, but they make import resolution less obvious.
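
The layout above can be recreated in a temporary directory. The package name ns_demo_plugins is hypothetical, and neither directory contains an __init__.py; the sketch checks that one package object ends up with two search locations.

```python
import importlib
import sys
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    roots = []
    for site, module in (("site1", "foo"), ("site2", "bar")):
        pkg_dir = Path(tmp, site, "ns_demo_plugins")
        pkg_dir.mkdir(parents=True)  # deliberately no __init__.py
        Path(pkg_dir, f"{module}.py").write_text(f"name = {module!r}\n")
        roots.append(str(Path(tmp, site)))
    sys.path[:0] = roots
    importlib.invalidate_caches()
    try:
        foo = importlib.import_module("ns_demo_plugins.foo")
        bar = importlib.import_module("ns_demo_plugins.bar")
        package = sys.modules["ns_demo_plugins"]
        portions = list(package.__path__)  # one entry per contributing directory
        foo_name, bar_name = foo.name, bar.name
    finally:
        for entry in roots:
            sys.path.remove(entry)
        for name in ("ns_demo_plugins.foo", "ns_demo_plugins.bar", "ns_demo_plugins"):
            sys.modules.pop(name, None)

print(foo_name, bar_name, len(portions))
```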

97.12 Shadowing Standard Library Modules

Imports search paths in order.

A local file can shadow a standard library module.

Example:

project/
    random.py

Then:

import random

may import the local file instead of the standard library random.

This can produce confusing errors.

Example:

# random.py
import random
print(random.randint(1, 10))

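When this file is run as a script, import random finds the local random.py first. That file then starts executing a second time under the name random, its own import random returns the partially initialized module from sys.modules, and random.randint raises AttributeError.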

97.13 `sys.path` Initialization

sys.path determines where imports search for top-level modules.

It is affected by:

script location
current working directory
PYTHONPATH
virtual environments
site initialization
.pth files
installation layout
embedded Python configuration

This means the same program may import different modules depending on launch mode.

Example:

python app.py

and:

python -m app

can initialize import context differently.
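
The difference can be observed by asking child interpreters for the first entry of sys.path. This is a sketch under several assumptions: the package name launch_demo_dir and the helper first_path_entry are hypothetical, the child is started through sys.executable, and PYTHONSAFEPATH is removed from the child's environment because it suppresses the entry being inspected.

```python
import os
import subprocess
import sys
import tempfile
from pathlib import Path

def first_path_entry(args, cwd):
    """Run a child interpreter and return the sys.path[0] it reports."""
    env = {k: v for k, v in os.environ.items() if k != "PYTHONSAFEPATH"}
    done = subprocess.run(
        [sys.executable, *args],
        cwd=cwd, env=env, capture_output=True, text=True, check=True,
    )
    return done.stdout.strip()

with tempfile.TemporaryDirectory() as tmp:
    pkg_dir = Path(tmp, "launch_demo_dir")
    pkg_dir.mkdir()
    Path(pkg_dir, "__init__.py").write_text("")
    Path(pkg_dir, "probe_app.py").write_text("import sys\nprint(sys.path[0])\n")
    # Script mode: the script's own directory comes first.
    as_script = first_path_entry([str(pkg_dir / "probe_app.py")], cwd=tmp)
    # Module mode: the current working directory comes first.
    as_module = first_path_entry(["-m", "launch_demo_dir.probe_app"], cwd=tmp)

print(as_script)
print(as_module)
```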

97.14 Import Hooks

CPython supports custom import hooks through sys.meta_path and sys.path_hooks.

sys.meta_path contains meta path finders.

import sys

for finder in sys.meta_path:
    print(finder)

A meta path finder can intercept imports before normal filesystem lookup.

Together, meta path finders and path hooks enable:

frozen imports
built-in imports
zip imports
custom plugin loaders
remote module systems
test mocking
import instrumentation

Import hooks are powerful because they participate in core module resolution.

97.15 Meta Path Finder Edge Cases

A broken meta path finder can disrupt every import.

Example:

class BrokenFinder:
    def find_spec(self, fullname, path, target=None):
        raise RuntimeError("broken finder")

import sys
sys.meta_path.insert(0, BrokenFinder())

import json

Unless json is already cached in sys.modules, the import fails with RuntimeError before the normal finders get a chance.

Finders must follow the protocol carefully:

return a spec if handled
return None if not handled
raise only for actual errors

Returning None means “I do not handle this import.”
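
For contrast, a well-behaved finder might look like the sketch below. RecordingFinder is a hypothetical class that only observes import requests and always declines them by returning None. The demonstration evicts the standard library module colorsys from sys.modules so that a real lookup takes place.

```python
import importlib
import importlib.abc
import sys

class RecordingFinder(importlib.abc.MetaPathFinder):
    """Observe import requests without handling any of them."""

    def __init__(self):
        self.seen = []

    def find_spec(self, fullname, path, target=None):
        self.seen.append(fullname)
        return None  # "not mine": let the next finder try

finder = RecordingFinder()
sys.meta_path.insert(0, finder)
try:
    sys.modules.pop("colorsys", None)  # force a real lookup
    module = importlib.import_module("colorsys")
finally:
    sys.meta_path.remove(finder)

print(finder.seen)
print(module.__spec__.loader)
```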

97.16 Module Specs

Modern import machinery uses ModuleSpec.

A spec describes how a module should be loaded.

Important fields include:

name
loader
origin
submodule_search_locations
cached
has_location

You can inspect it:

import json

print(json.__spec__)
print(json.__spec__.origin)
print(json.__spec__.loader)

The spec is the import system’s plan for a module.
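
A spec can also be built without importing anything, which makes its role as a plan visible. The sketch below uses importlib.util.spec_from_file_location on a temporary file; the module name spec_demo is hypothetical.

```python
import importlib.util
import sys
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    source = Path(tmp, "spec_demo.py")
    source.write_text("value = 1\n")
    # Build the plan only; nothing is executed or cached.
    spec = importlib.util.spec_from_file_location("spec_demo", str(source))
    fields = {
        "name": spec.name,
        "loader": type(spec.loader).__name__,
        "origin_matches": spec.origin == str(source),
        "has_location": spec.has_location,
        "cached": spec.cached,
        "is_package": spec.submodule_search_locations is not None,
    }
    imported = "spec_demo" in sys.modules

for key, value in fields.items():
    print(f"{key}: {value}")
```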

97.17 Loaders and Execution

A loader may implement module creation and execution.

Conceptually:

create_module(spec)
exec_module(module)

create_module may return a custom module object, or None to request the default module type.

exec_module initializes it.

This separation lets loaders control module object creation while keeping execution explicit.
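
A minimal loader illustrates the two methods. ConstantsLoader is a hypothetical example class, not part of the standard library; it fills a module with fixed values instead of executing source code.

```python
import importlib.abc
import importlib.util

class ConstantsLoader(importlib.abc.Loader):
    """Populate a module from a dictionary instead of source code."""

    def __init__(self, values):
        self.values = dict(values)

    def create_module(self, spec):
        return None  # ask the import system for a default module object

    def exec_module(self, module):
        module.__dict__.update(self.values)

spec = importlib.util.spec_from_loader(
    "demo_constants", ConstantsLoader({"ANSWER": 42})
)
module = importlib.util.module_from_spec(spec)  # calls create_module
spec.loader.exec_module(module)                 # initializes the module

print(module.__name__, module.ANSWER)
```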

97.18 Bytecode Cache Files

CPython may store compiled bytecode in __pycache__.

Example:

__pycache__/mod.cpython-313.pyc

The .pyc file avoids recompiling source every time.

It contains:

magic number
cache metadata
marshaled code object

Edge cases include:

stale bytecode
hash-based pyc files
timestamp mismatch
read-only filesystems
different optimization levels
version-specific cache tags

A .pyc file is specific to a CPython bytecode format version.
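
The layout can be inspected by compiling a file and reading the result back. This sketch assumes the 16-byte header used since Python 3.7 (magic number, flags, and two metadata fields); the file name mod.py matches the example above, and the explicit cfile path is chosen only to keep the demonstration inside one directory.

```python
import importlib.util
import marshal
import py_compile
import sys
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    source = Path(tmp, "mod.py")
    source.write_text("answer = 6 * 7\n")
    # Where the import system would normally cache this file.
    default_cache = importlib.util.cache_from_source(str(source))
    pyc_path = py_compile.compile(
        str(source), cfile=str(Path(tmp, "mod.pyc")), doraise=True
    )
    data = Path(pyc_path).read_bytes()

# Header: 4 bytes magic, 12 bytes cache metadata; then the marshaled code.
magic, metadata, payload = data[:4], data[4:16], data[16:]
code = marshal.loads(payload)
namespace = {}
exec(code, namespace)

print(Path(default_cache).name)
print(magic == importlib.util.MAGIC_NUMBER, namespace["answer"])
```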

97.19 Source vs Bytecode Loading

CPython may load from source and write bytecode, or load bytecode directly.

If source exists and bytecode cache is valid:

load pyc
execute code object

If bytecode is invalid or missing:

read source
compile source
write pyc if allowed
execute code object

If the source is missing, a .pyc inside __pycache__ is ignored. A legacy-style mod.pyc placed where mod.py would have been can still be imported, which is how sourceless distributions work.

97.20 Extension Module Imports

Extension modules are native shared libraries.

Examples:

_module.cpython-313-x86_64-linux-gnu.so
_module.pyd

Importing an extension module loads native code into the process.

Edge cases include:

ABI mismatch
missing shared library dependency
wrong platform tag
initialization failure
subinterpreter incompatibility
global C state
crashes during import

Unlike Python source modules, extension modules can crash the interpreter during import.
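
The file names an interpreter will consider are listed in importlib.machinery.EXTENSION_SUFFIXES. The helper name_is_loadable below is a hypothetical sketch that checks the file name only; it says nothing about ABI compatibility or missing shared library dependencies.

```python
import importlib.machinery

# Suffixes this interpreter accepts for extension modules.
suffixes = importlib.machinery.EXTENSION_SUFFIXES

def name_is_loadable(filename):
    """Return True if filename ends with a suffix this interpreter accepts."""
    return any(filename.endswith(suffix) for suffix in suffixes)

print(suffixes)
print(name_is_loadable("_module.cpython-27-other-platform.xyz"))
```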

97.21 Built-in and Frozen Modules

Some modules are built into the interpreter.

Some are frozen, meaning their code is embedded into the CPython binary.

These modules do not require normal filesystem lookup.

They matter during startup because the import system itself needs modules before the full filesystem-based import machinery is ready.

Frozen modules help bootstrap importlib and early runtime initialization.
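
Both kinds of module can be recognized from their specs. This is a small sketch, assuming a standard CPython build in which sys is built in and the import bootstrap is available as _frozen_importlib.

```python
import importlib.util
import sys

builtin_spec = importlib.util.find_spec("sys")
frozen_spec = importlib.util.find_spec("_frozen_importlib")

# Neither module is backed by a file found through sys.path.
print(builtin_spec.origin, builtin_spec.loader)
print(frozen_spec.origin, frozen_spec.loader)
print("sys" in sys.builtin_module_names)
```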

97.22 Reloading Modules

importlib.reload() re-executes module code in an existing module object.

import importlib
import config

importlib.reload(config)

Reloading does not create a fully clean module by default.

Old names may remain if the new code no longer defines them.

Example:

# first version
x = 1
y = 2

After editing to:

x = 10

reloading may leave y in the module dictionary.

Reload is useful for development, but it is not a full restart.
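
The leftover name can be observed directly. In this sketch the module name reload_demo_config is hypothetical, and the source file is rewritten between the import and the reload.

```python
import importlib
import sys
import tempfile
from pathlib import Path

NAME = "reload_demo_config"  # hypothetical module name

with tempfile.TemporaryDirectory() as tmp:
    source = Path(tmp, f"{NAME}.py")
    source.write_text("x = 1\ny = 2\n")
    sys.path.insert(0, tmp)
    importlib.invalidate_caches()
    try:
        module = importlib.import_module(NAME)
        source.write_text("x = 10\n")  # y is no longer defined in the source
        reloaded = importlib.reload(module)
        same_object = reloaded is module
        x_after = module.x
        y_after = getattr(module, "y", "missing")
    finally:
        sys.path.remove(tmp)
        sys.modules.pop(NAME, None)

print(same_object, x_after, y_after)
```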

97.23 Import Locks

CPython uses import locks to prevent unsafe concurrent imports.

Without locking, two threads could import and initialize the same module at once.

Conceptually:

Thread A starts importing module M
Thread B starts importing module M
both execute top-level code

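The import lock prevents duplicate initialization. Since Python 3.3 the lock is per module, so the second thread waits only for module M and then receives the already initialized module object.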

However, import locks can interact badly with circular imports and threads if module top-level code waits for other threads that are also importing.
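
The single-execution guarantee can be checked with a few threads. This sketch uses two hypothetical modules: lock_demo_slow, whose body sleeps briefly so that the threads overlap, and an in-memory module lock_demo_counter that records how many times the body runs.

```python
import importlib
import sys
import tempfile
import threading
import types
from pathlib import Path

# Hypothetical in-memory module used only to count executions of the body.
counter = types.ModuleType("lock_demo_counter")
counter.runs = []
sys.modules["lock_demo_counter"] = counter

SLOW_BODY = (
    "import time\n"
    "import lock_demo_counter\n"
    "lock_demo_counter.runs.append('executed')\n"
    "time.sleep(0.2)  # keep the import in progress while others arrive\n"
)

results = []

def worker():
    results.append(importlib.import_module("lock_demo_slow"))

with tempfile.TemporaryDirectory() as tmp:
    Path(tmp, "lock_demo_slow.py").write_text(SLOW_BODY)
    sys.path.insert(0, tmp)
    importlib.invalidate_caches()
    try:
        threads = [threading.Thread(target=worker) for _ in range(4)]
        for thread in threads:
            thread.start()
        for thread in threads:
            thread.join()
        executions = len(counter.runs)
        distinct_modules = len({id(module) for module in results})
    finally:
        sys.path.remove(tmp)
        sys.modules.pop("lock_demo_slow", None)
        sys.modules.pop("lock_demo_counter", None)

print(executions, distinct_modules)
```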

97.24 Import-Time Side Effects

Import executes top-level code.

Example:

# app.py
print("starting")
connect_to_database()
register_handlers()

Importing this module performs those effects immediately.

This creates problems:

slow imports
network access during import
test fragility
circular import failures
hidden global state
bad startup behavior

A safer pattern keeps top-level code limited to definitions:

def main():
    connect_to_database()
    register_handlers()

if __name__ == "__main__":
    main()

97.25 Lazy Imports

Lazy imports delay module loading until a name is actually used.

They can improve startup time, but introduce edge cases:

errors appear later
import timing changes
side effects move
debugging becomes harder
circular imports change shape

Lazy loading changes when module code executes, which can affect programs that rely on import-time registration.
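
One way to experiment with this is importlib.util.LazyLoader. The lazy_import function below follows the recipe in the importlib documentation; the modules lazy_demo_target and lazy_demo_log are hypothetical and exist only to show when the module body actually runs.

```python
import importlib
import importlib.util
import sys
import tempfile
import types
from pathlib import Path

def lazy_import(name):
    """Defer execution of module `name` until its first attribute access."""
    spec = importlib.util.find_spec(name)
    loader = importlib.util.LazyLoader(spec.loader)
    spec.loader = loader
    module = importlib.util.module_from_spec(spec)
    sys.modules[name] = module
    loader.exec_module(module)  # arms the lazy module; the body has not run
    return module

# Hypothetical in-memory module that records when the body executes.
log = types.ModuleType("lazy_demo_log")
log.events = []
sys.modules["lazy_demo_log"] = log

with tempfile.TemporaryDirectory() as tmp:
    Path(tmp, "lazy_demo_target.py").write_text(
        "import lazy_demo_log\n"
        "lazy_demo_log.events.append('executed')\n"
        "value = 42\n"
    )
    sys.path.insert(0, tmp)
    importlib.invalidate_caches()
    try:
        module = lazy_import("lazy_demo_target")
        events_before = list(log.events)
        value = module.value  # first attribute access runs the body
        events_after = list(log.events)
    finally:
        sys.path.remove(tmp)
        sys.modules.pop("lazy_demo_target", None)
        sys.modules.pop("lazy_demo_log", None)

print(events_before, value, events_after)
```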

97.26 Import and Subinterpreters

Subinterpreters complicate imports.

Each interpreter should have separate module state:

interpreter A imports module M
interpreter B imports module M

Each interpreter has its own sys.modules, so these imports create separate module objects.

Extension modules must be careful because process-global C state can accidentally leak across interpreters.

Subinterpreter-safe modules should use per-module state instead of static global state.

97.27 Practical Rules

Use these rules to avoid most import edge cases:

avoid circular imports
prefer absolute imports inside packages
run package modules with python -m
avoid naming files after standard library modules
keep top-level module code cheap
avoid global mutable initialization during import
use function-local imports only to break cycles or reduce startup cost
design extension modules with per-module state
treat reload as partial re-execution, not a clean reset

97.28 Mental Model

Use this model:

Import first checks sys.modules.

If absent:
    find a ModuleSpec
    create or obtain a module object
    insert it into sys.modules
    execute module code
    return the module

A module can exist before it is fully initialized.

The same file can become different modules if imported under different names.

Packages search through __path__.

Import hooks can replace normal resolution.

Extension modules load native code and can break process safety.

Subinterpreters require module state isolation.
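
The first part of this model can be written out as a short function. This is an approximation for top-level modules only, not CPython's actual implementation: it ignores parent packages, import locks, and the from form. The names simplified_import and model_demo_mod are hypothetical.

```python
import importlib
import importlib.util
import sys
import tempfile
from pathlib import Path

def simplified_import(name):
    """Approximate the import of a top-level module."""
    try:
        return sys.modules[name]                    # 1. check the cache
    except KeyError:
        pass
    spec = importlib.util.find_spec(name)           # 2. ask the finders
    if spec is None:
        raise ModuleNotFoundError(f"No module named {name!r}", name=name)
    module = importlib.util.module_from_spec(spec)  # 3. create the module
    sys.modules[name] = module                      # 4. cache before executing
    try:
        spec.loader.exec_module(module)             # 5. run the module body
    except BaseException:
        sys.modules.pop(name, None)                 # failed imports are evicted
        raise
    return sys.modules[name]

with tempfile.TemporaryDirectory() as tmp:
    Path(tmp, "model_demo_mod.py").write_text("value = 1\n")
    sys.path.insert(0, tmp)
    importlib.invalidate_caches()
    try:
        first = simplified_import("model_demo_mod")
        second = simplified_import("model_demo_mod")
        same_object = first is second
        value = first.value
    finally:
        sys.path.remove(tmp)
        sys.modules.pop("model_demo_mod", None)

print(same_object, value)
```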

97.29 Chapter Summary

The CPython import system is a runtime protocol, not a simple file loader.

Most edge cases come from a few core facts:

modules are cached in sys.modules
modules are inserted before execution completes
imports execute top-level code
module identity depends on import name
packages search through __path__
custom hooks can alter resolution
extension modules carry native runtime risks

Understanding these details explains circular import failures, duplicated modules, relative import errors, standard library shadowing, reload surprises, and subinterpreter complications.
