Python thread objects, OS thread mapping, the GIL acquisition protocol, and thread-local state management.
A thread is an operating system execution context inside a process. CPython exposes threads through the threading module and implements lower-level support through _thread, platform thread APIs, interpreter thread state, locks, condition variables, and GIL coordination.
Threads let a Python program perform multiple activities concurrently inside one process.
In traditional CPython, threads do not usually execute Python bytecode in parallel because of the GIL. They are still useful for I/O-bound programs, background work, timers, blocking library calls, and coordinating native code that releases the GIL.
47.1 Process vs Thread
A process owns an address space.
A thread runs inside a process and shares that address space with other threads.
A process contains shared resources and the threads that run inside it:

process: memory space, open files, sockets, environment, the Python runtime
threads: thread 1, thread 2, thread 3 all execute inside that one process

Threads share Python objects by default. If two threads can reach the same list, dictionary, file object, socket, or class instance, they can both operate on it.
This shared-memory model is convenient, but it requires synchronization.
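As a minimal sketch of that synchronization, the classic shared counter works reliably when a lock makes each read-modify-write atomic (the names `increment` and `counter` are illustrative, not from any library):

```python
import threading

counter = 0
lock = threading.Lock()

def increment(n):
    # counter += 1 is a read-modify-write on shared state;
    # the lock makes it atomic with respect to other threads.
    global counter
    for _ in range(n):
        with lock:
            counter += 1

threads = [threading.Thread(target=increment, args=(100_000,)) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(counter)  # 400000 with the lock held around each increment
```

Without the lock, the same program can lose updates, because two threads may read the same old value before either writes back.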
47.2 Creating Threads
The usual API is threading.Thread.
```python
import threading

def worker():
    print("running in worker")

t = threading.Thread(target=worker)
t.start()
t.join()
```

start() asks the operating system to begin a new thread.
join() waits for the thread to finish.
The target function runs in the new thread.
The sequence is: the main thread creates the Thread object and calls start(); the OS thread begins and calls the target function; the main thread calls join() and waits for the target to finish.

47.3 The Main Thread
The main thread is the thread that starts the Python program.
It has special responsibilities:
starts program execution
usually owns top-level application lifecycle
receives KeyboardInterrupt in normal programs
runs signal handlers
often starts worker threads
often coordinates shutdown

Signals are especially important. Python signal handlers run in the main thread at safe interpreter checkpoints.
A worker thread cannot normally receive and execute Python signal handlers directly.
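The restriction is enforced directly: attempting to install a handler from a worker thread raises ValueError. A small sketch (the helper name `try_install` is illustrative):

```python
import signal
import threading

errors = []

def try_install():
    try:
        # Only the main thread of the main interpreter may install
        # Python signal handlers.
        signal.signal(signal.SIGINT, signal.SIG_IGN)
    except ValueError as exc:
        errors.append(type(exc).__name__)

t = threading.Thread(target=try_install)
t.start()
t.join()
print(errors)
```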
47.4 _thread and threading
CPython has a low-level _thread module and a higher-level threading module.
| Layer | Role |
|---|---|
| _thread | Low-level primitive threads and locks |
| threading | Higher-level Thread, Lock, Condition, Event, Timer, local storage |
Most application code should use threading.
The _thread module is closer to CPython’s primitive thread support and exists mainly as an implementation layer.
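For illustration, here is what the primitive layer looks like. _thread offers only a bare "start a thread" call and simple locks; there is no Thread object and no join(), so a lock stands in for join() in this sketch:

```python
import _thread

done = []
finished = _thread.allocate_lock()
finished.acquire()           # held until the worker releases it

def worker():
    done.append("ran in low-level thread")
    finished.release()

_thread.start_new_thread(worker, ())
finished.acquire()           # blocks until the worker finishes
print(done)
```

The threading module builds its Thread, join, and naming machinery on top of exactly these primitives.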
47.5 Thread Lifecycle
A thread moves through a simple lifecycle.
created: Thread object exists, OS thread not running
started: start() called, OS thread created
running: target function executing
finished: target returned or raised
joined: another thread waited for completion

Example:
```python
import threading
import time

def worker():
    time.sleep(1)

t = threading.Thread(target=worker)
print(t.is_alive())   # False: not started yet
t.start()
print(t.is_alive())   # True: running
t.join()
print(t.is_alive())   # False: finished
```

A Thread object can be started only once.
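Calling start() a second time on the same object raises RuntimeError, as this sketch shows:

```python
import threading

t = threading.Thread(target=lambda: None)
t.start()
t.join()

try:
    t.start()            # second start on the same Thread object
    raised = False
except RuntimeError:
    raised = True
print(raised)
```

To run the same work again, create a new Thread object.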
47.6 Thread Identity
Each running thread has an identity.
```python
import threading

def worker():
    print(threading.current_thread())
    print(threading.get_ident())
    print(threading.get_native_id())

t = threading.Thread(target=worker, name="worker-1")
t.start()
t.join()
```

get_ident() returns a Python-level thread identifier.
get_native_id() returns the native operating system thread ID when available.
Thread names are useful for logs and debugging.

```python
threading.current_thread().name
```

47.7 The GIL and Threads
In traditional CPython, a thread must hold the GIL to execute Python bytecode.
This means two Python threads in the same interpreter normally do not execute Python bytecode at the exact same time.
The handoff looks like this: thread A holds the GIL and executes Python bytecode while thread B waits, unable to execute bytecode yet; when thread A releases or yields the GIL, thread B acquires it and executes.

This does not make threads useless.
Threads can still overlap when one thread waits for I/O or when native code releases the GIL.
47.8 I/O-Bound Threading
Threads are useful when tasks spend time waiting.
Example:
```python
import threading
import time

def fetch(i):
    time.sleep(1)
    print("done", i)

threads = [threading.Thread(target=fetch, args=(i,)) for i in range(10)]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

This finishes much closer to one second than ten seconds because sleeping releases execution to other threads.
The same principle applies to many blocking I/O operations:
network reads
network writes
file operations
database waits
subprocess waits
blocking queues

47.9 CPU-Bound Threading
Threads are usually a poor fit for pure Python CPU-bound work.
```python
def compute():
    total = 0
    for i in range(50_000_000):
        total += i
    return total
```

Running this function in several Python threads usually does not scale across cores in traditional CPython.
Better options:
multiprocessing
native extension code that releases the GIL
NumPy or other native libraries
external workers
free-threaded CPython builds where appropriate

The right model depends on the workload and data movement cost.
47.10 Shared Mutable State
Threads share memory. That means shared objects need care.
```python
items = []

def worker():
    items.append(1)
```

A single append may appear safe in common CPython builds, but larger operations are not automatically safe.

```python
if key not in cache:
    cache[key] = compute()
```

Two threads can both observe the missing key and both compute the value.
Use locks when protecting shared invariants.
47.11 Locks
A lock protects a critical section.
```python
import threading

lock = threading.Lock()
counter = 0

def increment():
    global counter
    with lock:
        counter += 1
```

Only one thread can hold the lock at a time.
The with statement is preferred because it releases the lock even if an exception occurs.
```python
with lock:
    update_shared_state()
```

This is equivalent to:

```python
lock.acquire()
try:
    update_shared_state()
finally:
    lock.release()
```

47.12 Critical Sections
A critical section is code that must not run concurrently with itself or with related code.
Example:
```python
with lock:
    if key not in cache:
        cache[key] = compute_value(key)
    return cache[key]
```

The protected invariant is: the cache contains at most one computed value per key.

Without the lock, two threads might compute and store the same key concurrently.
Design critical sections to be small, but not so small that they fail to protect the invariant.
47.13 Reentrant Locks
A normal lock cannot be acquired twice by the same thread.
```python
lock = threading.Lock()
with lock:
    with lock:     # blocks forever: the lock is already held
        pass
```

This deadlocks.
A reentrant lock, RLock, can be acquired multiple times by the owning thread.
```python
import threading

lock = threading.RLock()
with lock:
    with lock:     # fine: the owning thread may re-acquire
        pass
```

RLock is useful when public methods call other public methods that use the same lock.
```python
class Store:
    def __init__(self):
        self._lock = threading.RLock()
        self._items = {}

    def get_or_create(self, key):
        with self._lock:
            if key not in self._items:
                self._items[key] = self.create(key)
            return self._items[key]

    def create(self, key):
        with self._lock:    # re-entered while get_or_create holds it
            return object()
```

Use RLock when reentrancy is intentional. A normal Lock is simpler and often better.
47.14 Condition Variables
A condition variable lets threads wait until some state becomes true.
```python
import threading

condition = threading.Condition()
items = []

def consumer():
    with condition:
        while not items:
            condition.wait()
        item = items.pop()
        return item

def producer(item):
    with condition:
        items.append(item)
        condition.notify()
```

The condition combines:
a lock
a wait operation
a notify operation

Always wait in a loop.

```python
while not condition_is_true():
    condition.wait()
```

Threads can wake up even when the condition they need is not satisfied.
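The producer/consumer pattern above can be exercised end to end in a runnable sketch (the `received` list is added here just to observe the result):

```python
import threading

condition = threading.Condition()
items = []
received = []

def consumer():
    with condition:
        while not items:             # re-check the predicate after every wakeup
            condition.wait()
        received.append(items.pop())

t = threading.Thread(target=consumer)
t.start()

with condition:
    items.append("job")
    condition.notify()

t.join()
print(received)
```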
47.15 Events
An event is a simple flag shared between threads.
```python
import threading

ready = threading.Event()

def worker():
    ready.wait()
    print("started")

t = threading.Thread(target=worker)
t.start()
ready.set()
t.join()
```

An event is useful for one-way coordination:
start signal
shutdown signal
configuration loaded
background service ready
test synchronization

Check without blocking:

```python
if ready.is_set():
    ...
```

47.16 Semaphores
A semaphore limits access to a finite resource.
```python
import threading

sem = threading.Semaphore(3)

def worker():
    with sem:
        use_limited_resource()
```

At most three threads can be inside the protected section.
Useful cases:
limit concurrent network calls
limit open files
limit database connections
limit access to a device

A bounded semaphore can detect too many releases.

```python
sem = threading.BoundedSemaphore(3)
```

47.17 Barriers
A barrier lets a fixed number of threads wait until all have reached the same point.
```python
import threading

barrier = threading.Barrier(3)

def worker(i):
    prepare(i)
    barrier.wait()
    run_phase_two(i)
```

A barrier is useful for phased algorithms and tests.
If one thread fails to reach the barrier, other threads may block or receive a broken barrier error.
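The broken-barrier case is easy to provoke with a timeout. In this sketch only one party arrives, so the wait raises BrokenBarrierError and the barrier reports itself broken:

```python
import threading

barrier = threading.Barrier(2)

try:
    # Only one of the two required parties arrives, so the wait
    # times out and the barrier is marked broken for everyone.
    barrier.wait(timeout=0.1)
    raised = False
except threading.BrokenBarrierError:
    raised = True
print(raised, barrier.broken)
```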
47.18 Queues
queue.Queue is one of the safest ways to coordinate threads.
```python
import queue
import threading

q = queue.Queue()

def producer():
    for i in range(10):
        q.put(i)
    q.put(None)

def consumer():
    while True:
        item = q.get()
        try:
            if item is None:
                return
            process(item)
        finally:
            q.task_done()
```

A queue handles locking internally.
It gives a clean producer-consumer model:
producer threads put work into queue
consumer threads take work from queue
queue coordinates waiting and wakeup

47.19 Worker Pool Pattern
A simple worker pool:
```python
import queue
import threading

def worker(q):
    while True:
        item = q.get()
        try:
            if item is None:
                return
            process(item)
        finally:
            q.task_done()

q = queue.Queue()
threads = [threading.Thread(target=worker, args=(q,)) for _ in range(4)]
for t in threads:
    t.start()
for item in range(100):
    q.put(item)
q.join()
for _ in threads:
    q.put(None)
for t in threads:
    t.join()
```

This pattern gives bounded, controlled concurrency.
47.20 ThreadPoolExecutor
The higher-level API is concurrent.futures.ThreadPoolExecutor.
```python
from concurrent.futures import ThreadPoolExecutor

def fetch(url):
    return read_url(url)

with ThreadPoolExecutor(max_workers=10) as pool:
    results = list(pool.map(fetch, urls))
```

This avoids manual thread creation and queue management.
Use it for:
parallel I/O
blocking calls
small worker pools
simple fan-out/fan-in concurrency

Avoid using it blindly for pure Python CPU-bound work.
47.21 Futures
A future represents a result that may not be ready yet.
```python
from concurrent.futures import ThreadPoolExecutor

with ThreadPoolExecutor(max_workers=4) as pool:
    future = pool.submit(compute, 10)
    result = future.result()
```

A future can:
return a result
raise the worker exception
be cancelled if not started
report completion status

Worker exceptions are re-raised when result() is called.
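Both behaviors can be seen in a short sketch: future.exception() returns the stored exception without raising, while future.result() re-raises it in the calling thread:

```python
from concurrent.futures import ThreadPoolExecutor

def fail():
    raise ValueError("boom")

with ThreadPoolExecutor(max_workers=1) as pool:
    future = pool.submit(fail)
    captured = future.exception()    # stored exception, does not raise
    try:
        future.result()              # re-raises the worker exception here
        reraised = False
    except ValueError:
        reraised = True
print(type(captured).__name__, reraised)
```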
47.22 Daemon Threads
A daemon thread does not prevent the process from exiting.
```python
t = threading.Thread(target=background, daemon=True)
t.start()
```

When only daemon threads remain, CPython can exit.
Daemon threads may be stopped abruptly during interpreter shutdown. They may not finish cleanup, flush logs, release resources, or close files cleanly.
Use daemon threads for best-effort background work only.
For important work, use non-daemon threads and explicit shutdown.
47.23 Thread Shutdown
A good threaded program has a shutdown protocol.
Common pieces:
shutdown event
work queue sentinel
timeout on blocking operations
join with clear ownership
exception reporting
resource cleanup

Example:

```python
stop = threading.Event()

def worker():
    while not stop.is_set():
        do_one_unit_of_work()

t = threading.Thread(target=worker)
t.start()
stop.set()
t.join()
```

Python does not provide a safe general way to kill a thread from the outside.
Design threads to stop cooperatively.
47.24 Exceptions in Threads
An exception in a thread does not automatically stop the main thread.
```python
import threading

def worker():
    raise RuntimeError("failed")

t = threading.Thread(target=worker)
t.start()
t.join()
print("main continues")
```

The exception is printed by the threading machinery, but the main thread continues unless you propagate the failure yourself.
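One way to observe worker failures centrally is threading.excepthook (available since Python 3.8), which receives the exception details for any unhandled thread exception. A sketch that records failures instead of printing them:

```python
import threading

failures = []

def record(args):
    # args carries exc_type, exc_value, exc_traceback, and thread.
    failures.append(args.exc_type.__name__)

threading.excepthook = record

def worker():
    raise RuntimeError("failed")

t = threading.Thread(target=worker)
t.start()
t.join()
print(failures)
```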
With ThreadPoolExecutor, exceptions are captured in futures and re-raised on result().
```python
future = pool.submit(worker)
future.result()
```

47.25 Thread-Local Storage
Thread-local storage gives each thread its own value.
```python
import threading

local = threading.local()

def worker(name):
    local.name = name
    print(local.name)
```

Each thread sees a separate local.__dict__.
Useful cases:
request context
database session handle
trace ID
temporary per-thread cache

Use thread-local storage carefully. It can hide dependencies and complicate async code, where contextvars is often a better fit.
47.26 contextvars vs Thread Locals
threading.local() attaches state to OS threads.
contextvars attaches state to logical execution context.
For async code, use contextvars.
For thread-specific data in threaded code, threading.local() can be appropriate.
| Mechanism | Scope |
|---|---|
| threading.local() | OS thread |
| contextvars.ContextVar | logical context, async-task friendly |
Thread locals do not automatically model async task boundaries.
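A minimal contextvars sketch shows the scoping: a value set inside one Context is visible only to code run in that Context (the names `request_id` and `handler` are illustrative):

```python
import contextvars

request_id = contextvars.ContextVar("request_id", default=None)

def handler():
    return request_id.get()

ctx = contextvars.copy_context()
ctx.run(request_id.set, "req-1")     # set only inside this context
inside = ctx.run(handler)
outside = request_id.get()
print(inside, outside)
```

asyncio runs each task in its own copied context, which is what makes ContextVar task-friendly where thread locals are not.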
47.27 Thread State in CPython
At the C level, each Python-executing thread has a thread state.
The thread state records:
current interpreter
current frame
exception state
recursion depth
tracing state
profiling state
context state
async exception state

Many C API functions assume there is a current thread state.
The GIL and thread state are linked: a thread that runs Python code needs an appropriate thread state and must hold the GIL in traditional CPython.
47.28 Native Threads Calling Python
Native code can create threads outside Python. If those threads need to call Python APIs, they must attach to the interpreter and acquire the GIL.
Typical C API shape:
```c
PyGILState_STATE state = PyGILState_Ensure();
/* use Python C API */
PyGILState_Release(state);
```

This is common in native libraries that call Python callbacks from worker threads.
Rules:
acquire the GIL before touching Python objects
ensure a valid thread state exists
release the GIL when done
avoid calling back during interpreter shutdown

47.29 Thread Safety of Python Objects
Built-in containers protect their internal memory safety under the GIL in traditional CPython.
That does not make sequences of operations logically safe.
Unsafe:
```python
if key not in d:
    d[key] = []
d[key].append(value)
```

Another thread can interleave between the check and the write.
Safer:

```python
with lock:
    if key not in d:
        d[key] = []
    d[key].append(value)
```

Think in terms of invariants, not individual bytecodes.
47.30 Deadlocks
A deadlock occurs when threads wait forever for each other.
Example:
```python
# Thread 1
with lock_a:
    with lock_b:
        ...

# Thread 2
with lock_b:
    with lock_a:
        ...
```

Thread 1 holds lock_a and waits for lock_b.
Thread 2 holds lock_b and waits for lock_a.
Fix by using a consistent lock order: all code must acquire lock_a before lock_b.
Or reduce the number of locks.
47.31 Lock Granularity
Coarse locks protect large regions.
Fine-grained locks protect smaller regions.
| Style | Benefit | Cost |
|---|---|---|
| Coarse lock | Simple correctness | More contention |
| Fine-grained locks | More concurrency | More deadlock risk and complexity |
Start with simple locking. Optimize only when measurement shows contention matters.
Thread bugs are expensive. Simplicity has value.
47.32 Race Conditions
A race condition occurs when behavior depends on timing.
Example:
```python
ready = False
data = None

def producer():
    global ready, data
    data = load()
    ready = True

def consumer():
    if ready:
        use(data)
```

The consumer may run before the producer sets ready.
Use proper synchronization:

```python
ready = threading.Event()
data = None

def producer():
    global data
    data = load()
    ready.set()

def consumer():
    ready.wait()
    use(data)
```

Synchronization should express the dependency directly.
47.33 Memory Visibility
In threaded programs, one thread must know when another thread’s writes are visible and meaningful.
Python synchronization primitives provide this ordering at the application level.
Use:
Lock
Event
Condition
Queue
Future
join

Avoid using sleep as synchronization.
Bad:

```python
time.sleep(0.1)
use(shared_data)
```

Good:

```python
ready.wait()
use(shared_data)
```

Sleep makes timing assumptions. Synchronization encodes state.
47.34 Threads and Imports
Imports are synchronized by import locks.
If several threads import the same module, only one executes its module body; the others wait and then receive the cached module.
This matters because imports execute code and mutate sys.modules.
Avoid starting threads at import time.
Bad:
```python
# module top level
threading.Thread(target=worker).start()
```

Better:

```python
def start_worker():
    threading.Thread(target=worker).start()
```

Let application startup control thread creation.
47.35 Threads and Interpreter Shutdown
Interpreter shutdown is a difficult phase.
At shutdown:
modules may be partially cleared
daemon threads may still be running
locks may be held
standard streams may be closing
imports may fail
globals may become None

Avoid relying on daemon threads for cleanup.
Avoid complex __del__ methods that interact with threads.
Use explicit shutdown functions and join worker threads before process exit.
47.36 Threads and Finalizers
Finalizers can run in whichever thread causes the last reference to disappear.
```python
class Resource:
    def __del__(self):
        cleanup()
```

If a worker thread drops the last reference, cleanup may run in that worker thread.
This can be surprising when cleanup touches thread-affine resources.
Prefer context managers:

```python
with resource:
    use(resource)
```

or explicit close methods:

```python
resource.close()
```

47.37 Threads and Signals
Python signal handlers run in the main thread.
This means a worker thread cannot rely on receiving KeyboardInterrupt directly.
A common shutdown pattern:
```python
stop = threading.Event()

try:
    run_main_loop()
except KeyboardInterrupt:
    stop.set()
    join_workers()
```

The main thread receives the interrupt and signals workers to stop.
47.38 Threads and Asyncio
Threads and asyncio can interact, but they are different concurrency models.
asyncio uses cooperative tasks inside an event loop.
Threads use OS scheduling.
To run blocking code from async code:
```python
import asyncio

result = await asyncio.to_thread(blocking_function, arg)
```

To call an event loop from another thread, use thread-safe APIs such as:

```python
loop.call_soon_threadsafe(callback)
```

Do not directly mutate event-loop-owned state from arbitrary threads.
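A runnable sketch of asyncio.to_thread (available since Python 3.9): the blocking function runs on a worker thread from the loop's default executor, not on the thread running the event loop:

```python
import asyncio
import threading

def blocking_work():
    # Report whether we are on the main thread, where asyncio.run
    # is driving the event loop in this sketch.
    return threading.current_thread() is threading.main_thread()

async def main():
    return await asyncio.to_thread(blocking_work)

on_main = asyncio.run(main())
print(on_main)
```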
47.39 Threads and Multiprocessing
Threads share memory inside one process.
Processes have separate memory spaces.
| Model | Memory | CPU parallelism in traditional CPython |
|---|---|---|
| Threads | Shared | Limited for Python bytecode |
| Processes | Separate | Good |
| Async tasks | Shared event loop | Single-threaded unless offloaded |
Use threads for blocking I/O.
Use processes for pure Python CPU-bound parallelism.
Use async for high-concurrency I/O when libraries support it.
47.40 Threads and C Extensions
C extensions must handle threads carefully.
When using Python objects, extension code must hold the GIL.
When doing long native work, extension code may release the GIL.
```c
Py_BEGIN_ALLOW_THREADS
/* native work without Python objects */
Py_END_ALLOW_THREADS
```

Extension authors must also protect native shared state with native locks. The GIL should not be treated as a universal lock, especially for free-threaded compatibility.
47.41 Thread Debugging
Useful tools:
threading.enumerate()
threading.current_thread()
faulthandler.dump_traceback()
logging with thread names
timeouts on joins and waits
concurrent.futures futures
profilers

Example:

```python
import threading

for t in threading.enumerate():
    print(t.name, t.ident, t.is_alive())
```

Dump all thread stacks:

```python
import faulthandler

faulthandler.dump_traceback()
```

Thread dumps are one of the fastest ways to diagnose deadlocks.
47.42 Logging From Threads
The standard logging module is designed to be usable from multiple threads.
Include thread names in log format:
```python
import logging

logging.basicConfig(
    format="%(asctime)s %(threadName)s %(levelname)s %(message)s",
    level=logging.INFO,
)
```

Then worker logs can show which thread produced each message.
Avoid writing ad hoc logs to shared files without synchronization.
47.43 Timeouts
Blocking forever is dangerous.
Prefer timeouts when waiting across thread boundaries:
```python
if event.wait(timeout=5):
    proceed()
else:
    handle_timeout()
```

For joins:

```python
t.join(timeout=5)
if t.is_alive():
    report_stuck_thread()
```

Timeouts do not fix concurrency bugs, but they make failures observable.
47.44 Design Rule: Own Your Threads
A component that starts a thread should usually provide a way to stop it.
```python
class Worker:
    def __init__(self):
        self._stop = threading.Event()
        self._thread = threading.Thread(target=self._run)

    def start(self):
        self._thread.start()

    def stop(self):
        self._stop.set()
        self._thread.join()

    def _run(self):
        while not self._stop.is_set():
            do_work()
```

Thread ownership should be explicit.
Hidden background threads are hard to test and hard to shut down.
47.45 Good Threading Patterns
Good patterns:
producer-consumer with Queue
bounded worker pool
main thread coordinates shutdown
explicit locks around shared invariants
events for readiness and shutdown
futures for result propagation
small critical sections
timeouts for external waits

Poor patterns:
starting threads at import time
using sleep for synchronization
shared mutable globals without locks
daemon threads for important work
holding locks while calling unknown code
ignoring worker exceptions
mixing async and threads without clear ownership

47.46 Minimal Threaded Server Shape
A simple threaded server often looks like:
```python
import queue
import threading

class Server:
    def __init__(self, workers=4):
        self._stop = threading.Event()
        self._jobs = queue.Queue()
        self._threads = [
            threading.Thread(target=self._worker, name=f"worker-{i}")
            for i in range(workers)
        ]

    def start(self):
        for thread in self._threads:
            thread.start()

    def submit(self, job):
        self._jobs.put(job)

    def stop(self):
        self._stop.set()
        for _ in self._threads:
            self._jobs.put(None)
        for thread in self._threads:
            thread.join()

    def _worker(self):
        while not self._stop.is_set():
            job = self._jobs.get()
            try:
                if job is None:
                    return
                job()
            finally:
                self._jobs.task_done()
```

This design makes work delivery, shutdown, and ownership visible.
47.47 CPython Thread Internals
At a high level, CPython thread support involves:
platform thread abstraction
GIL acquisition and release
thread state allocation
current thread state tracking
lock primitives
condition primitives
thread-local storage support
interpreter shutdown coordination

The runtime must coordinate OS-level threads with interpreter-level execution state.
A Python thread is not only an OS thread. It is an OS thread that has been connected to CPython’s interpreter state.
47.48 Free-Threaded Builds and Thread Assumptions
Free-threaded CPython changes the assumptions around threads.
Code that relied on the GIL for implicit safety may need explicit synchronization.
Examples of fragile assumptions:
dictionary compound operations are safe enough
borrowed references remain stable without ownership
C global state is protected by the GIL
container mutation does not need locking

Well-designed threaded Python code already uses application-level locks and queues. Such code is easier to adapt.
47.49 When to Use Threads
Use threads when:
tasks block on I/O
the API you must call is blocking
you need background coordination inside one process
native code releases the GIL
a small worker pool simplifies the design

Avoid threads when:
the workload is pure Python CPU-heavy
shared mutable state dominates the design
shutdown cannot be made explicit
async libraries already solve the problem cleanly
process isolation is required

Threading is a practical tool, not a universal concurrency model.
47.50 Key Points
Threads are OS execution contexts inside one process.
CPython exposes threads through _thread, threading, and concurrent.futures.
Traditional CPython uses the GIL, so pure Python bytecode usually does not run in parallel across threads.
Threads are still useful for I/O-bound work and native code that releases the GIL.
Shared mutable state requires explicit synchronization.
Use locks to protect invariants, queues to move work, events to signal state, and futures to propagate results.
Avoid starting threads at import time.
Design cooperative shutdown.
Daemon threads are unsuitable for important cleanup.
Thread bugs are usually timing bugs. Use explicit synchronization, thread dumps, logging, and timeouts.