The gc module exposes CPython’s cyclic garbage collector. It gives Python code a way to inspect, configure, disable, enable, and manually run the collector that handles unreachable reference cycles.
CPython’s main memory management mechanism is reference counting. Most objects are destroyed as soon as their reference count reaches zero. The cyclic garbage collector exists for the cases reference counting cannot solve: groups of objects that keep each other alive even though no live outside reference can reach them.
55.1 Why CPython Needs gc
Reference counting handles simple object lifetime well.
x = []
y = x
del x
del yAfter the last reference disappears, the list can be destroyed immediately.
Cycles are different:
a = []
b = []
a.append(b)
b.append(a)
del a
del bThe two lists still reference each other. Their reference counts remain nonzero, but the program can no longer reach them.
Conceptually:
outside references: none
list A → list B
list B → list AReference counting alone cannot prove this group is garbage. The cyclic collector finds such groups.
55.2 What the Collector Tracks
The cyclic collector tracks container objects.
A tracked object is an object that may contain references to other Python objects.
Common tracked objects include:
| Object kind | Usually tracked? | Reason |
|---|---|---|
list | Yes | Contains references |
dict | Sometimes | Can contain references |
set | Yes | Contains references |
tuple | Sometimes | Depends on contents |
| Function | Yes | References globals, defaults, closure |
| Class | Yes | References dict, bases, descriptors |
| Frame | Yes | References locals, stack, globals |
| Integer | No | Does not contain object references |
| Many strings | No | Does not contain object references |
The collector does not need to track every object. An object that cannot point to another object cannot participate in a reference cycle.
55.3 The gc Module Interface
The main control functions are:
| Function | Purpose |
|---|---|
gc.enable() | Enable automatic collection |
gc.disable() | Disable automatic collection |
gc.isenabled() | Return whether automatic collection is enabled |
gc.collect() | Run a collection manually |
gc.get_count() | Return allocation counters |
gc.get_threshold() | Return collection thresholds |
gc.set_threshold() | Change thresholds |
gc.get_objects() | Return tracked objects |
gc.is_tracked() | Test if an object is tracked |
gc.get_referrers() | Find objects referring to an object |
gc.get_referents() | Find objects referenced by an object |
Example:
import gc
print(gc.isenabled())
print(gc.get_count())
print(gc.get_threshold())
collected = gc.collect()
print(collected)The module is mostly diagnostic and control-oriented. Normal Python programs rarely need to call it directly.
55.4 Reference Counting and Cyclic GC Together
The two systems cooperate.
reference counting
handles ordinary lifetime immediately
cyclic GC
periodically detects unreachable cyclesFor most short-lived objects, reference counting does all the work. The cyclic collector runs less frequently and focuses on container objects.
Example:
def f():
xs = [1, 2, 3]
return sum(xs)The list xs usually disappears as soon as the function returns because its reference count drops to zero.
A cycle may survive longer:
def f():
xs = []
xs.append(xs)When f() returns, xs has no external reference, but it references itself. The cyclic collector must clean it later.
55.5 Generations
Traditional CPython cyclic GC uses generations.
The practical idea is simple: most objects die young. Recently allocated objects are checked more often than older objects.
Conceptually:
generation 0: young objects
generation 1: older objects
generation 2: oldest objectsA collection of generation 0 checks young objects. Survivors may be promoted. Older generations are collected less frequently.
This reduces overhead because the collector avoids scanning all tracked objects every time.
The public API exposes generation-oriented functions:
import gc
gc.collect(0) # collect youngest generation
gc.collect(1)
gc.collect(2) # collect full set in traditional modelExact generation behavior is version-dependent. Treat the public API as stable, but avoid depending on internal generation details.
55.6 Thresholds
Automatic collection is driven by allocation counters and thresholds.
import gc
print(gc.get_count())
print(gc.get_threshold())gc.get_count() returns counters for tracked allocations.
gc.get_threshold() returns threshold values used to decide when collection should run.
Example:
import gc
old = gc.get_threshold()
gc.set_threshold(1000, 10, 10)
try:
# workload here
pass
finally:
gc.set_threshold(*old)Changing thresholds affects when automatic cyclic collection happens. It does not disable reference counting.
55.7 Manual Collection
gc.collect() forces collection.
import gc
n = gc.collect()
print(n)The return value is the number of unreachable objects collected.
Manual collection is useful for:
tests that assert cleanup
memory diagnostics
long-running batch phases
debugging cycles
embedding scenariosIt is usually a poor substitute for fixing object lifetime problems.
A common pattern in tests:
import gc
import weakref
class Node:
pass
obj = Node()
ref = weakref.ref(obj)
del obj
gc.collect()
assert ref() is None55.8 Tracked vs Untracked Objects
You can ask whether an object is tracked:
import gc
print(gc.is_tracked([]))
print(gc.is_tracked(123))
print(gc.is_tracked("abc"))A list is usually tracked. An integer usually is not.
Some containers may be untracked as an optimization when they only contain atomic objects.
Example:
import gc
t = (1, 2, 3)
print(gc.is_tracked(t))The exact result may vary depending on interpreter version and collector state.
Do not write program logic that depends on gc.is_tracked() for ordinary application behavior. Use it for diagnostics.
55.9 Referents
gc.get_referents(obj) returns objects directly referenced by obj.
Example:
import gc
xs = [1, "a", {}]
print(gc.get_referents(xs))For a list, referents are its elements.
For a function, referents may include its globals, defaults, annotations, code object, closure, and other internal fields.
This function exposes the object graph from the perspective of CPython’s traversal protocol.
Conceptually:
object
↓
objects it directly points toThe collector uses similar traversal logic internally.
55.10 Referrers
gc.get_referrers(obj) returns objects that directly refer to obj.
Example:
import gc
x = []
holder = {"x": x}
refs = gc.get_referrers(x)
print(any(r is holder for r in refs))This is useful but dangerous.
The returned referrers may include:
current stack frames
locals dictionaries
temporary objects created by inspection
internal interpreter structures
test harness objects
debugger objectsBecause calling get_referrers() itself creates references, results can be noisy.
It is primarily a debugging tool.
55.11 Object Graph Traversal
The cyclic collector sees memory as a graph.
object A → object B
object B → object C
object C → object AA cycle becomes collectable only when no live root can reach it.
Roots include active frames, module globals, thread state, interpreter state, and objects reachable from them.
Simplified model:
live roots
↓
reachable objects survive
unreachable tracked group
↓
candidate garbageThe collector must distinguish internal references within an unreachable group from external references coming from live objects.
55.12 Finalizers
Objects may define finalizers.
class FileLike:
def __del__(self):
print("closing")Finalizers complicate cycle collection because destruction order matters.
Example:
class Node:
def __del__(self):
print("finalize")
a = Node()
b = Node()
a.other = b
b.other = a
del a
del bModern CPython can handle many cycles with finalizers, but finalization remains a subtle area. The safe rule is simple: avoid depending on __del__ for important resource cleanup.
Use context managers instead:
with open("data.txt") as f:
data = f.read()Context managers make cleanup explicit and deterministic.
55.13 Weak References
Weak references help avoid ownership cycles.
Example:
import weakref
class Parent:
pass
class Child:
pass
parent = Parent()
child = Child()
child.parent = weakref.ref(parent)The weak reference does not increase the parent’s reference count. This helps with graphs such as:
parent → child
child → parent weaklyWeak references are common in caches, observer lists, parent pointers, memoization tables, and object registries.
55.14 gc.garbage
gc.garbage contains objects the collector found unreachable but could not collect.
import gc
print(gc.garbage)In modern CPython, this list is usually empty unless debugging flags or special extension types are involved.
Historically, cycles with finalizers often ended up here. Modern finalization rules reduced that problem.
Still, extension types with incorrect GC support can create uncollectable objects.
55.15 Debug Flags
The gc module supports debug flags.
Example:
import gc
gc.set_debug(gc.DEBUG_STATS)
gc.collect()
gc.set_debug(0)Common flags include:
| Flag | Meaning |
|---|---|
DEBUG_STATS | Print collection statistics |
DEBUG_COLLECTABLE | Print collectable objects |
DEBUG_UNCOLLECTABLE | Print uncollectable objects |
DEBUG_SAVEALL | Save unreachable objects in gc.garbage instead of freeing them |
DEBUG_SAVEALL is useful for postmortem analysis:
import gc
gc.set_debug(gc.DEBUG_SAVEALL)
gc.collect()
for obj in gc.garbage:
print(type(obj))Clear gc.garbage after inspection to release objects.
55.16 Freezing Objects
gc.freeze() moves currently tracked objects into a permanent generation.
import gc
gc.freeze()This is useful in some process-forking scenarios. If many long-lived objects are frozen before fork(), the collector avoids touching them later, which can help preserve copy-on-write memory sharing.
Related functions:
gc.freeze()
gc.unfreeze()
gc.get_freeze_count()This feature targets specialized runtime and deployment cases, not ordinary application logic.
55.17 Frames and Cycles
Frames are common sources of cycles.
Example:
import sys
def f():
frame = sys._getframe()
return frame
frame = f()The frame references its locals. The local variable frame references the frame itself.
Conceptually:
frame
↓
locals
↓
frameTracebacks also keep frames alive:
try:
1 / 0
except Exception as exc:
saved = excThe exception holds a traceback. The traceback holds frames. Frames hold locals. This can retain large object graphs.
A cleanup pattern:
try:
1 / 0
except Exception as exc:
# inspect exc
pass
finally:
exc = NoneIn modern Python, exception variables in except blocks are cleared automatically after the block, but saved exceptions can still retain tracebacks.
55.18 Extension Types and GC Support
C extension types that contain Python object references must cooperate with the cyclic collector.
A container extension type usually needs:
tp_traverse
tp_clear
Py_TPFLAGS_HAVE_GC
GC-aware allocation
GC-aware deallocationConceptual responsibilities:
| Slot or mechanism | Purpose |
|---|---|
tp_traverse | Visit contained references |
tp_clear | Break internal references during collection |
| GC allocation | Register object with collector |
| GC deallocation | Untrack object before destruction |
If an extension type stores Python object pointers but does not expose them to the collector, cycles involving that type may leak.
Simplified C shape:
static int
Node_traverse(NodeObject *self, visitproc visit, void *arg)
{
Py_VISIT(self->child);
return 0;
}
static int
Node_clear(NodeObject *self)
{
Py_CLEAR(self->child);
return 0;
}This pattern lets the collector discover and break cycles.
55.19 tp_traverse
tp_traverse tells the collector which Python objects are reachable from a container object.
Conceptually:
for each PyObject* field inside this object:
visit(field)The collector uses this to build reachability information.
For a tree node extension type:
typedef struct {
PyObject_HEAD
PyObject *left;
PyObject *right;
PyObject *value;
} NodeObject;Traversal should visit all Python object fields:
static int
Node_traverse(NodeObject *self, visitproc visit, void *arg)
{
Py_VISIT(self->left);
Py_VISIT(self->right);
Py_VISIT(self->value);
return 0;
}Missing one field can produce incorrect collection behavior.
55.20 tp_clear
tp_clear breaks references during cycle collection.
static int
Node_clear(NodeObject *self)
{
Py_CLEAR(self->left);
Py_CLEAR(self->right);
Py_CLEAR(self->value);
return 0;
}Py_CLEAR sets the pointer to NULL before decrementing the reference. This is safer than a plain Py_DECREF when breaking cycles because finalization may re-enter code.
The collector calls tp_clear on unreachable containers to break internal reference cycles and allow reference counts to fall to zero.
55.21 Disabling GC
You can disable automatic cyclic collection:
import gc
gc.disable()
try:
# critical allocation-heavy section
pass
finally:
gc.enable()This does not disable reference counting.
Objects whose reference counts reach zero are still destroyed.
Disabling cyclic GC can help in specialized workloads where:
object graphs are known to be acyclic
latency spikes from GC are unacceptable
manual collection points are controlledIt can also cause memory growth if cycles are created.
55.22 Memory Diagnostics Pattern
A practical leak investigation often follows this pattern:
import gc
import weakref
class Node:
pass
obj = Node()
ref = weakref.ref(obj)
del obj
gc.collect()
if ref() is not None:
print("object is still alive")
print(gc.get_referrers(ref()))For container graphs:
import gc
gc.set_debug(gc.DEBUG_SAVEALL)
gc.collect()
for obj in gc.garbage:
print(type(obj), repr(obj)[:80])
gc.garbage.clear()
gc.set_debug(0)This is intrusive. Use it in tests, debugging sessions, or isolated reproduction scripts.
55.23 Performance Considerations
Cyclic GC has runtime cost because it must scan tracked containers.
Costs come from:
maintaining allocation counters
tracking container objects
traversing object graphs
handling finalizers
promoting survivors
clearing unreachable cyclesFor most programs, default settings are sufficient.
Tune GC only with evidence:
latency measurements
allocation profiles
memory growth data
object graph analysis
production tracesCommon mistakes include:
calling gc.collect() too frequently
disabling GC globally without cycle analysis
using get_referrers() in production paths
keeping frame objects alive accidentally
storing tracebacks indefinitely55.24 Relationship to sys
sys.getrefcount() and gc often appear together.
import gc
import sys
x = []
print(sys.getrefcount(x))
print(gc.is_tracked(x))
print(gc.get_referents(x))They expose different layers:
| Module | Layer |
|---|---|
sys.getrefcount() | Direct reference count |
gc.is_tracked() | Cyclic GC tracking state |
gc.get_referents() | Traversal graph |
gc.collect() | Cycle collection |
Reference counts answer “how many references exist?”
GC traversal answers “what graph can this object reach?”
Both are CPython-specific enough that they should be used carefully in portable code.
55.25 Chapter Summary
The gc module exposes CPython’s cyclic garbage collector. CPython primarily uses reference counting, but reference counting cannot reclaim unreachable cycles. The cyclic collector tracks container objects, traverses object graphs, detects unreachable groups, and breaks cycles so objects can be destroyed.
For normal application code, gc is rarely needed. For CPython internals, extension modules, memory leak debugging, frame and traceback analysis, and long-running systems, it is essential.