# 84. GDB and LLDB

Native debuggers are essential when working on CPython internals. Python-level tools can show frames, variables, exceptions, and source lines, but they cannot fully explain crashes inside the C runtime. When a bug involves reference counts, object layout, frame state, memory corruption, invalid pointers, or C extension behavior, you need a native debugger.

On Linux, CPython developers usually use GDB. On macOS, LLDB is the common choice. Both can set breakpoints and inspect C stack frames, registers, heap objects, shared libraries, and raw memory.

## 84.1 Why Native Debugging Matters

CPython is a C program that executes Python code. A Python exception and a C crash are different events.

A Python exception is managed by the interpreter:

```python
raise ValueError("bad input")
```

The interpreter stores exception state, unwinds Python frames, runs handlers, and reports a traceback.

A C crash happens below Python’s exception machinery:

```c
PyObject *op = NULL;
Py_INCREF(op);      /* invalid */
```

This may produce a segmentation fault before Python can raise anything.
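The difference is visible even from a parent process. The following is a minimal sketch, assuming a POSIX system; the helper name `classify_exit` is hypothetical, and the crash is simulated by the child sending itself `SIGSEGV` rather than by a real invalid pointer.

```python
import signal
import subprocess
import sys


def classify_exit(code: str) -> str:
    """Run `code` in a child interpreter and describe how it ended."""
    proc = subprocess.run(
        [sys.executable, "-c", code],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    if proc.returncode < 0:
        # POSIX: a negative return code means the child died from a signal.
        return f"killed by {signal.Signals(-proc.returncode).name}"
    return f"exited with status {proc.returncode}"


if __name__ == "__main__":
    # An uncaught exception: the interpreter reports it and exits normally.
    print(classify_exit("raise ValueError('bad input')"))
    # A simulated native crash: the process is terminated by the signal.
    print(classify_exit(
        "import os, signal; os.kill(os.getpid(), signal.SIGSEGV)"
    ))
```

In the first case the interpreter stayed in control until exit. In the second it never got the chance, which is why a native debugger is needed to see what happened.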

Native debuggers help diagnose:

```text
segmentation faults
assertion failures
abort calls
deadlocks
invalid reference counts
use-after-free bugs
stack overflows
C extension crashes
interpreter state corruption
```

## 84.2 Build Configuration for Debugging

Use a debug build:

```bash
./configure --with-pydebug
make -j8
```

A debug build enables assertions and extra runtime checks, so many interpreter errors fail earlier and closer to their cause.

For better source-level debugging, avoid aggressive optimization:

```bash
./configure --with-pydebug CFLAGS="-O0 -g3"
make -j8
```

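With `-O0` the compiler does not inline calls or optimize variables away, so the debugger shows clearer stack frames and more reliable local variables. `-g3` additionally records macro definitions in the debug information.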

When reconfiguring a tree that has already been built, clean it first so that no stale object files remain:

```bash
make clean
./configure --with-pydebug CFLAGS="-O0 -g3"
make -j8
```
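Before starting a debugging session, it can help to confirm that the interpreter you are about to run really is the debug build. A minimal sketch, where the helper name `describe_build` is hypothetical; it relies on `sys.gettotalrefcount`, which exists only in builds with reference debugging compiled in.

```python
import sys
import sysconfig


def describe_build() -> dict:
    """Collect a few facts that indicate how this interpreter was built."""
    return {
        # sys.gettotalrefcount exists only when reference debugging is
        # compiled in, which --with-pydebug enables.
        "debug_build": hasattr(sys, "gettotalrefcount"),
        # Compiler flags recorded at build time (may be None on some platforms).
        "cflags": sysconfig.get_config_var("CFLAGS"),
        "executable": sys.executable,
    }


if __name__ == "__main__":
    for key, value in describe_build().items():
        print(f"{key}: {value}")
```

Run it with `./python` from the build directory. If `debug_build` is false, the debugger is probably attached to a different interpreter than the one you just built.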

## 84.3 Starting CPython Under GDB

Run the local interpreter under GDB:

```bash
gdb --args ./python script.py
```

Inside GDB:

```gdb
run
```

If the process crashes, get a backtrace:

```gdb
bt
```

For a backtrace that also prints the local variables of every C frame:

```gdb
bt full
```

The output is long, but it often shows the bad pointer directly.

The same form works with `-m` module execution:

```bash
gdb --args ./python -m test test_gc
```

Then:

```gdb
run
```
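For unattended runs, such as a script that reproduces a crash many times, GDB can be driven non-interactively with its `-batch` and `-ex` options. The sketch below only builds the command line; the helper name `gdb_batch_command` and its defaults are illustrative assumptions.

```python
import shlex


def gdb_batch_command(python="./python", args=("script.py",),
                      gdb_commands=("run", "bt")):
    """Build an argv list that runs GDB non-interactively."""
    argv = ["gdb", "-batch"]
    for command in gdb_commands:
        # Each -ex option runs one GDB command, in the order given.
        argv += ["-ex", command]
    # Everything after --args is the program to debug and its arguments.
    argv += ["--args", python, *args]
    return argv


if __name__ == "__main__":
    print(shlex.join(gdb_batch_command(args=("-m", "test", "test_gc"))))
```

The resulting list can be passed to `subprocess.run` to capture the backtrace in a log file.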

## 84.4 Starting CPython Under LLDB

On macOS:

```bash
lldb -- ./python script.py
```

Inside LLDB:

```lldb
run
```

After a crash:

```lldb
bt
```

To inspect a frame:

```lldb
frame select 0
frame variable
```

To print a C expression:

```lldb
expr some_variable
```

LLDB syntax differs from GDB, but the debugging model is similar.

## 84.5 C Backtraces vs Python Backtraces

A C backtrace shows native function calls.

Example shape:

```text
#0  PyObject_Free
#1  list_dealloc
#2  Py_DECREF
#3  _PyEval_EvalFrameDefault
#4  PyEval_EvalCode
#5  run_mod
#6  pyrun_file
#7  Py_RunMain
#8  pymain_main
#9  main
```

A Python traceback shows Python frames:

```text
Traceback (most recent call last):
  File "script.py", line 7, in <module>
    f()
  File "script.py", line 4, in f
    g()
  File "script.py", line 2, in g
    crash()
```

Both are useful. The Python traceback tells you what user code was executing. The C backtrace tells you where the interpreter failed.
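When no debugger is attached, the standard `faulthandler` module can still recover the Python half of this picture at crash time. A minimal sketch, assuming a POSIX system; the crash is again simulated with a self-sent `SIGSEGV`, and the names `CHILD` and `crash_report` are hypothetical.

```python
import subprocess
import sys

# Child program: a small call chain that ends in a simulated crash.
CHILD = """
import os, signal

def g():
    os.kill(os.getpid(), signal.SIGSEGV)  # stand-in for a real C crash

def f():
    g()

f()
"""


def crash_report() -> str:
    """Run CHILD with faulthandler enabled and return its stderr output."""
    proc = subprocess.run(
        [sys.executable, "-X", "faulthandler", "-c", CHILD],
        capture_output=True,
        text=True,
    )
    return proc.stderr


if __name__ == "__main__":
    print(crash_report())
```

The report lists the Python frames that were active when the signal arrived. It says nothing about the C frames, so it complements a native backtrace rather than replacing it.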

## 84.6 CPython GDB Helpers

CPython includes GDB helper commands for inspecting Python state.

These are usually loaded from:

```text
Tools/gdb/libpython.py
```

When configured correctly, GDB gains commands such as:

```gdb
py-bt
py-bt-full
py-list
py-print
py-locals
```

These commands translate internal CPython structures into Python-level information.

Example:

```gdb
py-bt
```

Output shape:

```text
Traceback (most recent call first):
  File "script.py", line 12, in inner
  File "script.py", line 16, in outer
  File "script.py", line 20, in <module>
```

This is often much more useful than a raw C backtrace alone.

## 84.7 Loading GDB Helpers Manually

If GDB does not load CPython helpers automatically, load them manually:

```gdb
source Tools/gdb/libpython.py
```

Then verify:

```gdb
help py-bt
```

Some systems restrict auto-loading for security. GDB may print a warning telling you to add a safe path.

Example `.gdbinit` entry:

```gdb
add-auto-load-safe-path /path/to/cpython
```

Use the absolute path to your CPython checkout.
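A small helper can produce that line without typing the path by hand. This is only a sketch; `gdbinit_line` is a hypothetical name, and the example assumes it is run from the top of the checkout.

```python
from pathlib import Path


def gdbinit_line(checkout: str) -> str:
    """Return an add-auto-load-safe-path line with an absolute path."""
    # resolve() makes the path absolute and normalizes "." and "..".
    return f"add-auto-load-safe-path {Path(checkout).resolve()}"


if __name__ == "__main__":
    # "." assumes the current directory is the CPython checkout.
    print(gdbinit_line("."))
```

Append the printed line to `~/.gdbinit` yourself after checking that the path is the one you intend to trust.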

## 84.8 Important GDB Commands

Basic execution:

```gdb
run
continue
next
step
finish
```

Backtrace and frames:

```gdb
bt
bt full
frame 0
up
down
```

Breakpoints:

```gdb
break PyErr_SetString
break _PyEval_EvalFrameDefault
break list_dealloc
break dictresize
```

Printing values:

```gdb
print op
print *op
print Py_TYPE(op)
print Py_REFCNT(op)
```

Examining memory:

```gdb
x/16gx op
x/32xb op
```

Watchpoints:

```gdb
watch op->ob_refcnt
```

Watchpoints are useful when a reference count changes unexpectedly.

## 84.9 Important LLDB Commands

Basic execution:

```lldb
run
continue
next
step
finish
```

Backtrace and frames:

```lldb
bt
frame select 0
up
down
```

Breakpoints:

```lldb
breakpoint set --name PyErr_SetString
breakpoint set --name _PyEval_EvalFrameDefault
breakpoint set --name list_dealloc
```

Printing values:

```lldb
expr op
expr *op
expr Py_TYPE(op)
expr Py_REFCNT(op)
```

Memory:

```lldb
memory read op
memory read --format x --count 16 op
```

LLDB expression parsing can be stricter than GDB. For complex macros, you may need to inspect fields directly.
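The correspondence between the two command sets can be written down as a small lookup table. The sketch below covers only simple command forms used in this chapter; the names `GDB_TO_LLDB` and `to_lldb` are hypothetical, and conditional breakpoints and other complex forms are not handled.

```python
# GDB command prefix -> LLDB template; "{}" receives the GDB argument text.
GDB_TO_LLDB = {
    "break": "breakpoint set --name {}",
    "print": "expr {}",
    "frame": "frame select {}",
    "info locals": "frame variable",
    "thread apply all bt": "thread backtrace all",
}


def to_lldb(gdb_command: str) -> str:
    """Translate a simple GDB command; return it unchanged if unknown."""
    # Try longer prefixes first so "info locals" is matched as a whole.
    for prefix in sorted(GDB_TO_LLDB, key=len, reverse=True):
        if gdb_command == prefix or gdb_command.startswith(prefix + " "):
            argument = gdb_command[len(prefix):].strip()
            return GDB_TO_LLDB[prefix].format(argument)
    # Commands such as run, bt, next, and step are spelled the same way.
    return gdb_command


if __name__ == "__main__":
    for command in ("break list_dealloc", "print *op", "frame 0", "bt"):
        print(f"{command:20} -> {to_lldb(command)}")
```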

## 84.10 Breaking on Python Exceptions

Python exceptions are normal control flow. Many exceptions are raised, caught, and ignored internally.

Breaking on every `PyErr_SetString` can be noisy:

```gdb
break PyErr_SetString
```

This stops whenever CPython sets an exception with a string message.

Other functions that set exception state are also worth knowing:

```gdb
break PyErr_SetObject
break PyErr_Format
break PyErr_NoMemory
```

Breaking on the one that matches the failing code path helps you find where a specific exception originates.

A practical pattern:

```gdb
break PyErr_SetString if exception == PyExc_TypeError
```

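Conditional breakpoints reduce noise. Here `exception` is the name of the function's first parameter in the C source, so the condition only works when debug symbols are available.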

## 84.11 Breaking on Fatal Errors

Fatal runtime failures often call:

```text
Py_FatalError
_Py_FatalErrorFunc
abort
```

Useful breakpoints:

```gdb
break Py_FatalError
break _Py_FatalErrorFunc
break abort
```

Then:

```gdb
run
bt full
```

This stops before the process terminates, preserving context.

## 84.12 Debugging a Segmentation Fault

Suppose this command crashes:

```bash
./python script.py
```

Run:

```bash
gdb --args ./python script.py
```

Inside GDB:

```gdb
run
bt full
py-bt
frame 0
```

Then inspect important variables:

```gdb
info locals
print op
print *op
print Py_TYPE(op)
print Py_REFCNT(op)
```

Common patterns:

```text
op == NULL
    invalid NULL dereference

Py_REFCNT(op) <= 0
    use-after-free or double decref

Py_TYPE(op) invalid
    memory corruption

crash in dealloc
    object lifetime bug

crash far from changed code
    earlier corruption
```

## 84.13 Debugging Reference Count Bugs

Reference count bugs often appear as crashes during deallocation.

Example failure path:

```text
extra Py_DECREF
    → object freed too early
        → pointer reused later
            → crash in unrelated code
```

Useful strategies:

```text
break on the object allocator
break on the deallocator
watch object reference count
run focused tests repeatedly
use debug build and refleak tests
```

Example watchpoint:

```gdb
watch ((PyObject *)op)->ob_refcnt
```

Then continue:

```gdb
continue
```

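GDB stops whenever the reference count changes. If `op` is a local variable, GDB deletes the watchpoint when that frame returns; `watch -l` watches the underlying address instead and survives the return.

This is powerful but slow going, because frequently used objects change their reference count constantly.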
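Before reaching for a watchpoint, it is often cheaper to observe reference counts from Python. A minimal sketch using `sys.getrefcount`; the helper name `refcount_delta` is hypothetical, and the exact absolute counts are a CPython implementation detail, which is why only the difference is reported.

```python
import sys


def refcount_delta(action) -> int:
    """Return how much `action(obj)` changes the reference count of obj."""
    obj = object()
    before = sys.getrefcount(obj)
    keep = action(obj)  # hold the result so any new reference stays alive
    after = sys.getrefcount(obj)
    del keep
    return after - before


if __name__ == "__main__":
    # Storing obj in a list creates a new strong reference.
    print(refcount_delta(lambda o: [o]))
    # Computing id(obj) does not keep a reference.
    print(refcount_delta(lambda o: id(o)))
```

The same pattern works for a function exported by a C extension: call it through `action` and compare the delta with what its documentation promises about new and borrowed references.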

## 84.14 Debugging Garbage Collector Bugs

GC bugs often involve invalid traversal, clearing, or tracking.

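Useful breakpoints are listed below. The first two are internal static functions whose names differ between CPython versions, so check the GC source in your checkout: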

```gdb
break collect
break delete_garbage
break PyObject_GC_Track
break PyObject_GC_UnTrack
```

Common causes:

```text
object tracked before full initialization
tp_traverse misses a contained reference
tp_clear fails to clear a strong reference
object deallocated while still tracked
wrong allocator used for GC object
```

For container types, inspect:

```text
tp_traverse
tp_clear
tp_dealloc
tp_alloc
tp_free
```

A GC crash usually means the object graph metadata has become inconsistent.
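Some of these properties can be checked from Python before opening a debugger. `gc.get_referents` reports the objects visited by a container's `tp_traverse`, and `gc.is_tracked` reports whether the collector currently tracks an object. A minimal sketch with a built-in list standing in for the type under test; `directly_referenced` is a hypothetical helper name.

```python
import gc


def directly_referenced(container, target) -> bool:
    """Return True if the container's tp_traverse visits target."""
    return any(obj is target for obj in gc.get_referents(container))


if __name__ == "__main__":
    payload = {"key": "value"}
    container = [payload]

    # The list's traversal reports the dict it holds...
    print(directly_referenced(container, payload))
    # ...but not objects that are only reachable through that dict.
    print(directly_referenced(container, payload["key"]))
    # Containers are tracked by the collector; plain integers are not.
    print(gc.is_tracked(container), gc.is_tracked(42))
```

For an extension type, a contained object that `directly_referenced` fails to find points to a `tp_traverse` that misses a reference.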

## 84.15 Debugging the Evaluation Loop

The evaluation loop executes bytecode instructions.

Important function:

```text
_PyEval_EvalFrameDefault
```

A breakpoint here is often too broad because every Python frame passes through it.

More targeted strategies:

```text
break on a specific opcode handler
break after a specific exception
break on a function called by the opcode
use py-bt to identify Python frame
inspect frame object and instruction pointer
```

Useful values include:

```text
current frame
code object
instruction pointer
stack pointer
locals array
```

The exact internal names vary across CPython versions, so read the local variables in the active frame.
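Several of these values have stable Python-level counterparts that help you orient yourself before reading the C locals. A minimal sketch using the CPython-specific `sys._getframe`; the function name `snapshot` is hypothetical.

```python
import dis
import sys


def snapshot() -> dict:
    """Return Python-level views of state the evaluation loop tracks."""
    frame = sys._getframe()  # CPython-specific: the currently executing frame
    return {
        "function": frame.f_code.co_name,  # name stored in the code object
        "line": frame.f_lineno,            # current source line
        "offset": frame.f_lasti,           # offset of the last instruction run
        "locals": sorted(frame.f_locals),  # names of the local variables
    }


if __name__ == "__main__":
    print(snapshot())
    # Disassemble the function to relate the offset to actual instructions.
    dis.dis(snapshot)
```

Comparing `offset` with the `dis` listing shows which bytecode instruction the frame had reached, which is the same question the instruction pointer answers in the C frame.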

## 84.16 Debugging Import Failures

Import bugs can involve Python and C code.

Useful breakpoints:

```gdb
break PyImport_ImportModule
break PyImport_ImportModuleLevelObject
```

Python-level helpers:

```python
import importlib
import sys

print(sys.meta_path)
print(sys.path)
print(sys.modules)
```

Import failures may be caused by:

```text
wrong module state
partially initialized module
recursive import
bad extension module initialization
incorrect sys.path
failed loader protocol
```

Use both Python-level inspection and C-level breakpoints.
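On the Python side, `importlib.util.find_spec` shows how the import system would resolve a name without executing the module. A minimal sketch; `explain_import` is a hypothetical helper, and note that for dotted names `find_spec` does import the parent packages.

```python
import importlib.util
import sys


def explain_import(name: str) -> dict:
    """Report how the import system would resolve a top-level module name."""
    spec = importlib.util.find_spec(name)
    return {
        "found": spec is not None,
        # For file-based modules, origin is the path that would be loaded.
        "origin": getattr(spec, "origin", None),
        "loader": type(spec.loader).__name__ if spec and spec.loader else None,
        "already_imported": name in sys.modules,
    }


if __name__ == "__main__":
    print(explain_import("json"))
    print(explain_import("no_such_module_for_this_example"))
```

If `origin` points somewhere unexpected, the problem is in path configuration and a C-level breakpoint is unlikely to be needed.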

## 84.17 Debugging C Extensions

When debugging a C extension, start CPython under the debugger and load the extension normally:

```bash
gdb --args ./python -c "import myext; myext.run()"
```

Set breakpoints in extension functions:

```gdb
break myext_run
```

If symbols are missing, rebuild the extension with debug symbols:

```bash
CFLAGS="-O0 -g3" python setup.py build_ext --inplace
```

Common extension crash causes:

```text
returning borrowed references as new references
missing Py_INCREF before returning
incorrect argument parsing
using Python APIs without the GIL
wrong module state handling
invalid buffer lifetime
```

## 84.18 Debugging Deadlocks

Deadlocks require a different approach.

When the process hangs, attach to it:

```bash
gdb -p <pid>
```

Then:

```gdb
thread apply all bt
```

This prints all native thread stacks.

Look for threads blocked on:

```text
GIL acquisition
mutex locks
condition variables
I/O
thread joins
fork-related locks
import lock
```

On LLDB:

```lldb
process attach --pid <pid>
thread backtrace all
```

Deadlocks often require examining every thread, not just the main thread.
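The Python-level counterpart of dumping every native stack is dumping every Python stack. A minimal sketch using the CPython-specific `sys._current_frames`; the names `dump_thread_stacks` and `wait_forever` are hypothetical, and the blocked worker is only a stand-in for a real deadlocked thread.

```python
import sys
import threading
import traceback


def dump_thread_stacks() -> str:
    """Format the Python stack of every thread in this process."""
    names = {t.ident: t.name for t in threading.enumerate()}
    chunks = []
    # sys._current_frames() maps each thread id to its topmost frame.
    for ident, frame in sys._current_frames().items():
        chunks.append(f"Thread {names.get(ident, ident)}:\n")
        chunks.extend(traceback.format_stack(frame))
    return "".join(chunks)


def wait_forever(gate: threading.Event) -> None:
    gate.wait()  # blocks until the event is set


if __name__ == "__main__":
    gate = threading.Event()
    worker = threading.Thread(target=wait_forever, args=(gate,), name="stuck")
    worker.start()
    print(dump_thread_stacks())
    gate.set()      # release the worker so the example can exit
    worker.join()
```

This only works while some thread can still run Python code. When every thread is blocked, attaching a native debugger as shown above is the remaining option.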

## 84.19 Debugging Test Failures Under GDB

Run a specific CPython test under GDB:

```bash
gdb --args ./python -m test -v test_gc
```

To run only the test methods whose names match a pattern, add regrtest's `-m` option after the test name (`test_list` is just an example pattern):

```bash
gdb --args ./python -m test -v test_gc -m test_list
```

When using `regrtest`, remember that some tests spawn subprocesses. If the crash happens in a child process, configure the debugger to follow forks:

```gdb
set follow-fork-mode child
```

or disable parallelism and subprocess-heavy options where possible.

## 84.20 Core Files

A core file is a saved memory image of a crashed process.

Enable core dumps on Unix-like systems:

```bash
ulimit -c unlimited
```

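After a crash, open the core file together with the binary that produced it. The name and location of the core file depend on the system configuration: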

```bash
gdb ./python core
```

Then:

```gdb
bt full
py-bt
```

Core files are useful when crashes happen outside an interactive debugger, such as CI or long-running tests.
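A test harness written in Python can raise the limit for itself, and therefore for the child processes it launches, with the Unix-only `resource` module. A minimal sketch; `enable_core_dumps` is a hypothetical name, and whether a core file is actually written still depends on the hard limit and on system configuration.

```python
import resource


def enable_core_dumps() -> tuple:
    """Raise the soft core-size limit to the hard limit for this process."""
    soft, hard = resource.getrlimit(resource.RLIMIT_CORE)
    # An unprivileged process may raise its soft limit up to the hard limit.
    resource.setrlimit(resource.RLIMIT_CORE, (hard, hard))
    return resource.getrlimit(resource.RLIMIT_CORE)


if __name__ == "__main__":
    soft, hard = enable_core_dumps()
    print(f"core size limit: soft={soft} hard={hard}")
```

Child processes started afterwards inherit the new limit, which is the same effect `ulimit -c` has on commands started from the shell.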

## 84.21 Practical Debugging Order

For a crash:

```text
1. Rebuild with --with-pydebug and debug symbols.
2. Reproduce with the smallest command.
3. Run under GDB or LLDB.
4. Capture C backtrace.
5. Capture Python backtrace if possible.
6. Inspect top frame locals.
7. Inspect suspicious PyObject pointers.
8. Check refcount and type pointer.
9. Add targeted breakpoints.
10. Use watchpoints for lifetime bugs.
```

For a reference leak:

```text
1. Run the relevant test with -R 3:3.
2. Reduce to one failing test.
3. Inspect success and error paths.
4. Check new, borrowed, and stolen references.
5. Add temporary logging if needed.
6. Use watchpoints only after identifying an object.
```

For a deadlock:

```text
1. Attach to the hung process.
2. Dump all thread backtraces.
3. Identify blocked locks.
4. Check GIL, import lock, and condition variables.
5. Reproduce with fewer threads.
```

## 84.22 Core Principle

GDB and LLDB let you inspect CPython as the C runtime that it is.

Use Python tracebacks to understand the program. Use native backtraces to understand the interpreter. Use CPython debugger helpers to connect both views. For internals work, that combined view is often the difference between guessing and diagnosing.
