Python thread objects, OS thread mapping, the GIL acquisition protocol, and thread-local state management.
A thread is an operating system execution context inside a process. CPython exposes threads through the threading module and implements lower-level support through _thread, platform thread APIs, interpreter thread state, locks, condition variables, and GIL coordination.
Threads let a Python program perform multiple activities concurrently inside one process.
In traditional CPython, threads do not usually execute Python bytecode in parallel because of the GIL. They are still useful for I/O-bound programs, background work, timers, blocking library calls, and coordinating native code that releases the GIL.
47.1 Process vs Thread
A process owns an address space.
A thread runs inside a process and shares that address space with other threads.
A process contains shared resources and the threads that run inside it:

process: memory space, open files, sockets, environment, the Python runtime
threads: thread 1, thread 2, thread 3 all execute inside that one process

Threads share Python objects by default. If two threads can reach the same list, dictionary, file object, socket, or class instance, they can both operate on it.
This shared-memory model is convenient, but it requires synchronization.
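As a minimal sketch of that synchronization, the classic shared counter works reliably when a lock makes each read-modify-write atomic (the names `increment` and `counter` are illustrative, not from any library):

```python
import threading

counter = 0
lock = threading.Lock()

def increment(n):
    # counter += 1 is a read-modify-write on shared state;
    # the lock makes it atomic with respect to other threads.
    global counter
    for _ in range(n):
        with lock:
            counter += 1

threads = [threading.Thread(target=increment, args=(100_000,)) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(counter)  # 400000 with the lock held around each increment
```

Without the lock, the same program can lose updates, because two threads may read the same old value before either writes back.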
47.2 Creating Threads
The usual API is threading.Thread.
```python
import threading

def worker():
    print("running in worker")

t = threading.Thread(target=worker)
t.start()
t.join()
```

start() asks the operating system to begin a new thread.
join() waits for the thread to finish.
The target function runs in the new thread.
The sequence is: the main thread creates the Thread object and calls start(); the OS thread begins and calls the target function; the main thread calls join() and waits for the target to finish.

47.3 The Main Thread
The main thread is the thread that starts the Python program.
It has special responsibilities:
starts program execution
usually owns top-level application lifecycle
receives KeyboardInterrupt in normal programs
runs signal handlers
often starts worker threads
often coordinates shutdown

Signals are especially important. Python signal handlers run in the main thread at safe interpreter checkpoints.
A worker thread cannot normally receive and execute Python signal handlers directly.
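The restriction is enforced directly: attempting to install a handler from a worker thread raises ValueError. A small sketch (the helper name `try_install` is illustrative):

```python
import signal
import threading

errors = []

def try_install():
    try:
        # Only the main thread of the main interpreter may install
        # Python signal handlers.
        signal.signal(signal.SIGINT, signal.SIG_IGN)
    except ValueError as exc:
        errors.append(type(exc).__name__)

t = threading.Thread(target=try_install)
t.start()
t.join()
print(errors)
```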
47.4 _thread and threading
CPython has a low-level _thread module and a higher-level threading module.
| Layer | Role |
|---|---|
| _thread | Low-level primitive threads and locks |
| threading | Higher-level Thread, Lock, Condition, Event, Timer, local storage |
Most application code should use threading.
The _thread module is closer to CPython’s primitive thread support and exists mainly as an implementation layer.
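For illustration, here is what the primitive layer looks like. _thread offers only a bare "start a thread" call and simple locks; there is no Thread object and no join(), so a lock stands in for join() in this sketch:

```python
import _thread

done = []
finished = _thread.allocate_lock()
finished.acquire()           # held until the worker releases it

def worker():
    done.append("ran in low-level thread")
    finished.release()

_thread.start_new_thread(worker, ())
finished.acquire()           # blocks until the worker finishes
print(done)
```

The threading module builds its Thread, join, and naming machinery on top of exactly these primitives.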
47.5 Thread Lifecycle
A thread moves through a simple lifecycle.
created: Thread object exists, OS thread not running
started: start() called, OS thread created
running: target function executing
finished: target returned or raised
joined: another thread waited for completion

Example:
```python
import threading
import time

def worker():
    time.sleep(1)

t = threading.Thread(target=worker)
print(t.is_alive())   # False: not started yet
t.start()
print(t.is_alive())   # True: running
t.join()
print(t.is_alive())   # False: finished
```

A Thread object can be started only once.
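Calling start() a second time on the same object raises RuntimeError, as this sketch shows:

```python
import threading

t = threading.Thread(target=lambda: None)
t.start()
t.join()

try:
    t.start()            # second start on the same Thread object
    raised = False
except RuntimeError:
    raised = True
print(raised)
```

To run the same work again, create a new Thread object.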
47.6 Thread Identity
Each running thread has an identity.
```python
import threading

def worker():
    print(threading.current_thread())
    print(threading.get_ident())
    print(threading.get_native_id())

t = threading.Thread(target=worker, name="worker-1")
t.start()
t.join()
```

get_ident() returns a Python-level thread identifier.
get_native_id() returns the native operating system thread ID when available.
Thread names are useful for logs and debugging.

```python
threading.current_thread().name
```

47.7 The GIL and Threads
In traditional CPython, a thread must hold the GIL to execute Python bytecode.
This means two Python threads in the same interpreter normally do not execute Python bytecode at the exact same time.
The handoff looks like this: thread A holds the GIL and executes Python bytecode while thread B waits, unable to execute bytecode yet; when thread A releases or yields the GIL, thread B acquires it and executes.

This does not make threads useless.
Threads can still overlap when one thread waits for I/O or when native code releases the GIL.
47.8 I/O-Bound Threading
Threads are useful when tasks spend time waiting.
Example:
```python
import threading
import time

def fetch(i):
    time.sleep(1)
    print("done", i)

threads = [threading.Thread(target=fetch, args=(i,)) for i in range(10)]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

This finishes much closer to one second than ten seconds because sleeping releases execution to other threads.
The same principle applies to many blocking I/O operations:
network reads
network writes
file operations
database waits
subprocess waits
blocking queues

47.9 CPU-Bound Threading
Threads are usually a poor fit for pure Python CPU-bound work.
```python
def compute():
    total = 0
    for i in range(50_000_000):
        total += i
    return total
```

Running this function in several Python threads usually does not scale across cores in traditional CPython.
Better options:
multiprocessing
native extension code that releases the GIL
NumPy or other native libraries
external workers
free-threaded CPython builds where appropriate

The right model depends on the workload and data movement cost.
47.10 Shared Mutable State
Threads share memory. That means shared objects need care.
```python
items = []

def worker():
    items.append(1)
```

A single append may appear safe in common CPython builds, but larger operations are not automatically safe.

```python
if key not in cache:
    cache[key] = compute()
```

Two threads can both observe the missing key and both compute the value.
Use locks when protecting shared invariants.
47.11 Locks
A lock protects a critical section.
```python
import threading

lock = threading.Lock()
counter = 0

def increment():
    global counter
    with lock:
        counter += 1
```

Only one thread can hold the lock at a time.
The with statement is preferred because it releases the lock even if an exception occurs.
```python
with lock:
    update_shared_state()
```

This is equivalent to:

```python
lock.acquire()
try:
    update_shared_state()
finally:
    lock.release()
```

47.12 Critical Sections
A critical section is code that must not run concurrently with itself or with related code.
Example:
```python
with lock:
    if key not in cache:
        cache[key] = compute_value(key)
    return cache[key]
```

The protected invariant is: the cache contains at most one computed value per key.

Without the lock, two threads might compute and store the same key concurrently.
Design critical sections to be small, but not so small that they fail to protect the invariant.
47.13 Reentrant Locks
A normal lock cannot be acquired twice by the same thread.
```python
lock = threading.Lock()
with lock:
    with lock:     # blocks forever: the lock is already held
        pass
```

This deadlocks.
A reentrant lock, RLock, can be acquired multiple times by the owning thread.
```python
import threading

lock = threading.RLock()
with lock:
    with lock:     # fine: the owning thread may re-acquire
        pass
```

RLock is useful when public methods call other public methods that use the same lock.
```python
class Store:
    def __init__(self):
        self._lock = threading.RLock()
        self._items = {}

    def get_or_create(self, key):
        with self._lock:
            if key not in self._items:
                self._items[key] = self.create(key)
            return self._items[key]

    def create(self, key):
        with self._lock:    # re-entered while get_or_create holds it
            return object()
```

Use RLock when reentrancy is intentional. A normal Lock is simpler and often better.
47.14 Condition Variables
A condition variable lets threads wait until some state becomes true.
```python
import threading

condition = threading.Condition()
items = []

def consumer():
    with condition:
        while not items:
            condition.wait()
        item = items.pop()
        return item

def producer(item):
    with condition:
        items.append(item)
        condition.notify()
```

The condition combines:
a lock
a wait operation
a notify operation

Always wait in a loop.

```python
while not condition_is_true():
    condition.wait()
```

Threads can wake up even when the condition they need is not satisfied.
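The producer/consumer pattern above can be exercised end to end in a runnable sketch (the `received` list is added here just to observe the result):

```python
import threading

condition = threading.Condition()
items = []
received = []

def consumer():
    with condition:
        while not items:             # re-check the predicate after every wakeup
            condition.wait()
        received.append(items.pop())

t = threading.Thread(target=consumer)
t.start()

with condition:
    items.append("job")
    condition.notify()

t.join()
print(received)
```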
47.15 Events
An event is a simple flag shared between threads.
```python
import threading

ready = threading.Event()

def worker():
    ready.wait()
    print("started")

t = threading.Thread(target=worker)
t.start()
ready.set()
t.join()
```

An event is useful for one-way coordination:
start signal
shutdown signal
configuration loaded
background service ready
test synchronization

Check without blocking:

```python
if ready.is_set():
    ...
```

47.16 Semaphores
A semaphore limits access to a finite resource.
```python
import threading

sem = threading.Semaphore(3)

def worker():
    with sem:
        use_limited_resource()
```

At most three threads can be inside the protected section.
Useful cases:
limit concurrent network calls
limit open files
limit database connections
limit access to a device

A bounded semaphore can detect too many releases.

```python
sem = threading.BoundedSemaphore(3)
```

47.17 Barriers
A barrier lets a fixed number of threads wait until all have reached the same point.
```python
import threading

barrier = threading.Barrier(3)

def worker(i):
    prepare(i)
    barrier.wait()
    run_phase_two(i)
```

A barrier is useful for phased algorithms and tests.
If one thread fails to reach the barrier, other threads may block or receive a broken barrier error.
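The broken-barrier case is easy to provoke with a timeout. In this sketch only one party arrives, so the wait raises BrokenBarrierError and the barrier reports itself broken:

```python
import threading

barrier = threading.Barrier(2)

try:
    # Only one of the two required parties arrives, so the wait
    # times out and the barrier is marked broken for everyone.
    barrier.wait(timeout=0.1)
    raised = False
except threading.BrokenBarrierError:
    raised = True
print(raised, barrier.broken)
```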
47.18 Queues
queue.Queue is one of the safest ways to coordinate threads.
```python
import queue
import threading

q = queue.Queue()

def producer():
    for i in range(10):
        q.put(i)
    q.put(None)

def consumer():
    while True:
        item = q.get()
        try:
            if item is None:
                return
            process(item)
        finally:
            q.task_done()
```

A queue handles locking internally.
It gives a clean producer-consumer model:
producer threads put work into queue
consumer threads take work from queue
queue coordinates waiting and wakeup

47.19 Worker Pool Pattern
A simple worker pool:
```python
import queue
import threading

def worker(q):
    while True:
        item = q.get()
        try:
            if item is None:
                return
            process(item)
        finally:
            q.task_done()

q = queue.Queue()
threads = [threading.Thread(target=worker, args=(q,)) for _ in range(4)]
for t in threads:
    t.start()
for item in range(100):
    q.put(item)
q.join()
for _ in threads:
    q.put(None)
for t in threads:
    t.join()
```

This pattern gives bounded, controlled concurrency.
47.20 ThreadPoolExecutor
The higher-level API is concurrent.futures.ThreadPoolExecutor.
```python
from concurrent.futures import ThreadPoolExecutor

def fetch(url):
    return read_url(url)

with ThreadPoolExecutor(max_workers=10) as pool:
    results = list(pool.map(fetch, urls))
```

This avoids manual thread creation and queue management.
Use it for:
parallel I/O
blocking calls
small worker pools
simple fan-out/fan-in concurrency

Avoid using it blindly for pure Python CPU-bound work.
47.21 Futures
A future represents a result that may not be ready yet.
```python
from concurrent.futures import ThreadPoolExecutor

with ThreadPoolExecutor(max_workers=4) as pool:
    future = pool.submit(compute, 10)
    result = future.result()
```

A future can:
return a result
raise the worker exception
be cancelled if not started
report completion status

Worker exceptions are re-raised when result() is called.
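Both behaviors can be seen in a short sketch: future.exception() returns the stored exception without raising, while future.result() re-raises it in the calling thread:

```python
from concurrent.futures import ThreadPoolExecutor

def fail():
    raise ValueError("boom")

with ThreadPoolExecutor(max_workers=1) as pool:
    future = pool.submit(fail)
    captured = future.exception()    # stored exception, does not raise
    try:
        future.result()              # re-raises the worker exception here
        reraised = False
    except ValueError:
        reraised = True
print(type(captured).__name__, reraised)
```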
47.22 Daemon Threads
A daemon thread does not prevent the process from exiting.
```python
t = threading.Thread(target=background, daemon=True)
t.start()
```

When only daemon threads remain, CPython can exit.
Daemon threads may be stopped abruptly during interpreter shutdown. They may not finish cleanup, flush logs, release resources, or close files cleanly.
Use daemon threads for best-effort background work only.
For important work, use non-daemon threads and explicit shutdown.
47.23 Thread Shutdown
A good threaded program has a shutdown protocol.
Common pieces:
shutdown event
work queue sentinel
timeout on blocking operations
join with clear ownership
exception reporting
resource cleanup

Example:

```python
stop = threading.Event()

def worker():
    while not stop.is_set():
        do_one_unit_of_work()

t = threading.Thread(target=worker)
t.start()
stop.set()
t.join()
```

Python does not provide a safe general way to kill a thread from the outside.
Design threads to stop cooperatively.
47.24 Exceptions in Threads
An exception in a thread does not automatically stop the main thread.
```python
import threading

def worker():
    raise RuntimeError("failed")

t = threading.Thread(target=worker)
t.start()
t.join()
print("main continues")
```

The exception is printed by the threading machinery, but the main thread continues unless you propagate the failure yourself.
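One way to observe worker failures centrally is threading.excepthook (available since Python 3.8), which receives the exception details for any unhandled thread exception. A sketch that records failures instead of printing them:

```python
import threading

failures = []

def record(args):
    # args carries exc_type, exc_value, exc_traceback, and thread.
    failures.append(args.exc_type.__name__)

threading.excepthook = record

def worker():
    raise RuntimeError("failed")

t = threading.Thread(target=worker)
t.start()
t.join()
print(failures)
```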
With ThreadPoolExecutor, exceptions are captured in futures and re-raised on result().
```python
future = pool.submit(worker)
future.result()
```

47.25 Thread-Local Storage
Thread-local storage gives each thread its own value.
```python
import threading

local = threading.local()

def worker(name):
    local.name = name
    print(local.name)
```

Each thread sees a separate local.__dict__.
Useful cases:
request context
database session handle
trace ID
temporary per-thread cache

Use thread-local storage carefully. It can hide dependencies and complicate async code, where contextvars is often a better fit.
47.26 contextvars vs Thread Locals
threading.local() attaches state to OS threads.
contextvars attaches state to logical execution context.
For async code, use contextvars.
For thread-specific data in threaded code, threading.local() can be appropriate.
| Mechanism | Scope |
|---|---|
| threading.local() | OS thread |
| contextvars.ContextVar | logical context, async-task friendly |
Thread locals do not automatically model async task boundaries.
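A minimal contextvars sketch shows the scoping: a value set inside one Context is visible only to code run in that Context (the names `request_id` and `handler` are illustrative):

```python
import contextvars

request_id = contextvars.ContextVar("request_id", default=None)

def handler():
    return request_id.get()

ctx = contextvars.copy_context()
ctx.run(request_id.set, "req-1")     # set only inside this context
inside = ctx.run(handler)
outside = request_id.get()
print(inside, outside)
```

asyncio runs each task in its own copied context, which is what makes ContextVar task-friendly where thread locals are not.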
47.27 Thread State in CPython
At the C level, each Python-executing thread has a thread state.
The thread state records:
current interpreter
current frame
exception state
recursion depth
tracing state
profiling state
context state
async exception state

Many C API functions assume there is a current thread state.
The GIL and thread state are linked: a thread that runs Python code needs an appropriate thread state and must hold the GIL in traditional CPython.
47.28 Native Threads Calling Python
Native code can create threads outside Python. If those threads need to call Python APIs, they must attach to the interpreter and acquire the GIL.
Typical C API shape:
```c
PyGILState_STATE state = PyGILState_Ensure();
/* use Python C API */
PyGILState_Release(state);
```

This is common in native libraries that call Python callbacks from worker threads.
Rules:
acquire the GIL before touching Python objects
ensure a valid thread state exists
release the GIL when done
avoid calling back during interpreter shutdown

47.29 Thread Safety of Python Objects
Built-in containers protect their internal memory safety under the GIL in traditional CPython.
That does not make sequences of operations logically safe.
Unsafe:
```python
if key not in d:
    d[key] = []
d[key].append(value)
```

Another thread can interleave between the check and the write.
Safer:

```python
with lock:
    if key not in d:
        d[key] = []
    d[key].append(value)
```

Think in terms of invariants, not individual bytecodes.
47.30 Deadlocks
A deadlock occurs when threads wait forever for each other.
Example:
```python
# Thread 1
with lock_a:
    with lock_b:
        ...

# Thread 2
with lock_b:
    with lock_a:
        ...
```

Thread 1 holds lock_a and waits for lock_b.
Thread 2 holds lock_b and waits for lock_a.
Fix by using a consistent lock order: all code must acquire lock_a before lock_b.
Or reduce the number of locks.
47.31 Lock Granularity
Coarse locks protect large regions.
Fine-grained locks protect smaller regions.
| Style | Benefit | Cost |
|---|---|---|
| Coarse lock | Simple correctness | More contention |
| Fine-grained locks | More concurrency | More deadlock risk and complexity |
Start with simple locking. Optimize only when measurement shows contention matters.
Thread bugs are expensive. Simplicity has value.
47.32 Race Conditions
A race condition occurs when behavior depends on timing.
Example:
```python
ready = False
data = None

def producer():
    global ready, data
    data = load()
    ready = True

def consumer():
    if ready:
        use(data)
```

The consumer may run before the producer sets ready.
Use proper synchronization:

```python
ready = threading.Event()
data = None

def producer():
    global data
    data = load()
    ready.set()

def consumer():
    ready.wait()
    use(data)
```

Synchronization should express the dependency directly.
47.33 Memory Visibility
In threaded programs, one thread must know when another thread’s writes are visible and meaningful.
Python synchronization primitives provide this ordering at the application level.
Use:
Lock
Event
Condition
Queue
Future
join

Avoid using sleep as synchronization.
Bad:

```python
time.sleep(0.1)
use(shared_data)
```

Good:

```python
ready.wait()
use(shared_data)
```

Sleep makes timing assumptions. Synchronization encodes state.
47.34 Threads and Imports
Imports are synchronized by import locks.
If several threads import the same module, only one executes its module body; the others wait and then receive the cached module.
This matters because imports execute code and mutate sys.modules.
Avoid starting threads at import time.
Bad:
```python
# module top level
threading.Thread(target=worker).start()
```

Better:

```python
def start_worker():
    threading.Thread(target=worker).start()
```

Let application startup control thread creation.
47.35 Threads and Interpreter Shutdown
Interpreter shutdown is a difficult phase.
At shutdown:
modules may be partially cleared
daemon threads may still be running
locks may be held
standard streams may be closing
imports may fail
globals may become None

Avoid relying on daemon threads for cleanup.
Avoid complex __del__ methods that interact with threads.
Use explicit shutdown functions and join worker threads before process exit.
47.36 Threads and Finalizers
Finalizers can run in whichever thread causes the last reference to disappear.
```python
class Resource:
    def __del__(self):
        cleanup()
```

If a worker thread drops the last reference, cleanup may run in that worker thread.
This can be surprising when cleanup touches thread-affine resources.
Prefer context managers:

```python
with resource:
    use(resource)
```

or explicit close methods:

```python
resource.close()
```

47.37 Threads and Signals
Python signal handlers run in the main thread.
This means a worker thread cannot rely on receiving KeyboardInterrupt directly.
A common shutdown pattern:
```python
stop = threading.Event()

try:
    run_main_loop()
except KeyboardInterrupt:
    stop.set()
    join_workers()
```

The main thread receives the interrupt and signals workers to stop.
47.38 Threads and Asyncio
Threads and asyncio can interact, but they are different concurrency models.
asyncio uses cooperative tasks inside an event loop.
Threads use OS scheduling.
To run blocking code from async code:
```python
import asyncio

result = await asyncio.to_thread(blocking_function, arg)
```

To call an event loop from another thread, use thread-safe APIs such as:

```python
loop.call_soon_threadsafe(callback)
```

Do not directly mutate event-loop-owned state from arbitrary threads.
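A runnable sketch of asyncio.to_thread (available since Python 3.9): the blocking function runs on a worker thread from the loop's default executor, not on the thread running the event loop:

```python
import asyncio
import threading

def blocking_work():
    # Report whether we are on the main thread, where asyncio.run
    # is driving the event loop in this sketch.
    return threading.current_thread() is threading.main_thread()

async def main():
    return await asyncio.to_thread(blocking_work)

on_main = asyncio.run(main())
print(on_main)
```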
47.39 Threads and Multiprocessing
Threads share memory inside one process.
Processes have separate memory spaces.
| Model | Memory | CPU parallelism in traditional CPython |
|---|---|---|
| Threads | Shared | Limited for Python bytecode |
| Processes | Separate | Good |
| Async tasks | Shared event loop | Single-threaded unless offloaded |
Use threads for blocking I/O.
Use processes for pure Python CPU-bound parallelism.
Use async for high-concurrency I/O when libraries support it.
47.40 Threads and C Extensions
C extensions must handle threads carefully.
When using Python objects, extension code must hold the GIL.
When doing long native work, extension code may release the GIL.
```c
Py_BEGIN_ALLOW_THREADS
/* native work without Python objects */
Py_END_ALLOW_THREADS
```

Extension authors must also protect native shared state with native locks. The GIL should not be treated as a universal lock, especially for free-threaded compatibility.
47.41 Thread Debugging
Useful tools:
threading.enumerate()
threading.current_thread()
faulthandler.dump_traceback()
logging with thread names
timeouts on joins and waits
concurrent.futures futures
profilers

Example:

```python
import threading

for t in threading.enumerate():
    print(t.name, t.ident, t.is_alive())
```

Dump all thread stacks:

```python
import faulthandler

faulthandler.dump_traceback()
```

Thread dumps are one of the fastest ways to diagnose deadlocks.
47.42 Logging From Threads
The standard logging module is designed to be usable from multiple threads.
Include thread names in log format:
```python
import logging

logging.basicConfig(
    format="%(asctime)s %(threadName)s %(levelname)s %(message)s",
    level=logging.INFO,
)
```

Then worker logs can show which thread produced each message.
Avoid writing ad hoc logs to shared files without synchronization.
47.43 Timeouts
Blocking forever is dangerous.
Prefer timeouts when waiting across thread boundaries:
```python
if event.wait(timeout=5):
    proceed()
else:
    handle_timeout()
```

For joins:

```python
t.join(timeout=5)
if t.is_alive():
    report_stuck_thread()
```

Timeouts do not fix concurrency bugs, but they make failures observable.
47.44 Design Rule: Own Your Threads
A component that starts a thread should usually provide a way to stop it.
```python
class Worker:
    def __init__(self):
        self._stop = threading.Event()
        self._thread = threading.Thread(target=self._run)

    def start(self):
        self._thread.start()

    def stop(self):
        self._stop.set()
        self._thread.join()

    def _run(self):
        while not self._stop.is_set():
            do_work()
```

Thread ownership should be explicit.
Hidden background threads are hard to test and hard to shut down.
47.45 Good Threading Patterns
Good patterns:
producer-consumer with Queue
bounded worker pool
main thread coordinates shutdown
explicit locks around shared invariants
events for readiness and shutdown
futures for result propagation
small critical sections
timeouts for external waits

Poor patterns:
starting threads at import time
using sleep for synchronization
shared mutable globals without locks
daemon threads for important work
holding locks while calling unknown code
ignoring worker exceptions
mixing async and threads without clear ownership

47.46 Minimal Threaded Server Shape
A simple threaded server often looks like:
```python
import queue
import threading

class Server:
    def __init__(self, workers=4):
        self._stop = threading.Event()
        self._jobs = queue.Queue()
        self._threads = [
            threading.Thread(target=self._worker, name=f"worker-{i}")
            for i in range(workers)
        ]

    def start(self):
        for thread in self._threads:
            thread.start()

    def submit(self, job):
        self._jobs.put(job)

    def stop(self):
        self._stop.set()
        for _ in self._threads:
            self._jobs.put(None)
        for thread in self._threads:
            thread.join()

    def _worker(self):
        while not self._stop.is_set():
            job = self._jobs.get()
            try:
                if job is None:
                    return
                job()
            finally:
                self._jobs.task_done()
```

This design makes work delivery, shutdown, and ownership visible.
47.47 CPython Thread Internals
At a high level, CPython thread support involves:
platform thread abstraction
GIL acquisition and release
thread state allocation
current thread state tracking
lock primitives
condition primitives
thread-local storage support
interpreter shutdown coordination

The runtime must coordinate OS-level threads with interpreter-level execution state.
A Python thread is not only an OS thread. It is an OS thread that has been connected to CPython’s interpreter state.
47.48 Free-Threaded Builds and Thread Assumptions
Free-threaded CPython changes the assumptions around threads.
Code that relied on the GIL for implicit safety may need explicit synchronization.
Examples of fragile assumptions:
dictionary compound operations are safe enough
borrowed references remain stable without ownership
C global state is protected by the GIL
container mutation does not need locking

Well-designed threaded Python code already uses application-level locks and queues. Such code is easier to adapt.
47.49 When to Use Threads
Use threads when:
tasks block on I/O
the API you must call is blocking
you need background coordination inside one process
native code releases the GIL
a small worker pool simplifies the design

Avoid threads when:
the workload is pure Python CPU-heavy
shared mutable state dominates the design
shutdown cannot be made explicit
async libraries already solve the problem cleanly
process isolation is required

Threading is a practical tool, not a universal concurrency model.
47.50 Key Points
Threads are OS execution contexts inside one process.
CPython exposes threads through _thread, threading, and concurrent.futures.
Traditional CPython uses the GIL, so pure Python bytecode usually does not run in parallel across threads.
Threads are still useful for I/O-bound work and native code that releases the GIL.
Shared mutable state requires explicit synchronization.
Use locks to protect invariants, queues to move work, events to signal state, and futures to propagate results.
Avoid starting threads at import time.
Design cooperative shutdown.
Daemon threads are unsuitable for important cleanup.
Thread bugs are usually timing bugs. Use explicit synchronization, thread dumps, logging, and timeouts.