This part covers interpreter dispatch, inline caches, the specializing adaptive interpreter, function call fast paths, vectorcall, dictionary and attribute access performance, memory locality, profiling, and benchmarking.
| Chapter | Title |
|---|---|
| 72 | Interpreter Dispatch |
| 73 | Inline Caches |
| 74 | Specializing Adaptive Interpreter |
| 75 | Function Call Fast Paths |
| 76 | Vectorcall |
| 77 | Dictionary Performance |
| 78 | Attribute Access Performance |
| 79 | Memory Locality |
| 80 | Profiling CPython |
| 81 | Benchmarking CPython |
72. Interpreter Dispatch: The computed-goto dispatch table, the switch-statement fallback, and how opcode prediction reduces branch mispredictions.
73. Inline Caches: Inline cache entries stored as CACHE instructions appended after each specializable instruction in the bytecode array, and their layout per opcode family.
74. Specializing Adaptive Interpreter: Adaptive counter logic, specialization guards, and how CPython 3.11+ rewrites opcodes in place to LOAD_ATTR_SLOT and friends.
75. Function Call Fast Paths: CALL_PY_EXACT_ARGS, CALL_BUILTIN_FAST, and the fast-path conditions that bypass the generic call machinery.
76. Vectorcall: The PEP 590 vectorcall protocol, Py_TPFLAGS_HAVE_VECTORCALL, and stack-based argument passing for low-overhead calls.
77. Dictionary Performance: Compact dict memory layout, hash collision probing strategy, split-table sharing, and lookup specialization.
78. Attribute Access Performance: Type version tags, LOAD_ATTR inline cache hits, and the attribute specialization guards for slots and descriptors.
79. Memory Locality: Object allocation locality, arena page placement, freelists for common types, and cache-line-aware design.
80. Profiling CPython: Using perf, Instruments, and py-spy to profile CPython; reading perf maps generated by the JIT.
81. Benchmarking CPython: The pyperformance benchmark suite, microbenchmark pitfalls, timer resolution, and interpreter warm-up effects.
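Chapter 74's specialization can be observed from pure Python. A minimal sketch, assuming CPython 3.11+ (the `add` function is just an illustration): run a function enough times for the adaptive counters to trigger, then disassemble with `adaptive=True` to see the quickened opcodes.

```python
import dis
import sys


def add(a, b):
    return a + b


# Run the function enough times for the adaptive interpreter's counters
# to trigger specialization of its bytecode in place.
for _ in range(1000):
    add(1, 2)

# On CPython 3.11+, adaptive=True shows the specialized (quickened) opcodes,
# e.g. BINARY_OP_ADD_INT in place of the generic BINARY_OP.
if sys.version_info >= (3, 11):
    dis.dis(add, adaptive=True)
```

On older interpreters the guarded branch simply prints nothing; the exact specialized opcode names vary between 3.11, 3.12, and 3.13.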
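Two of Chapter 77's mechanisms are visible from Python as well. A small sketch (the `Point` class is a made-up example): insertion-order iteration falls out of the compact layout's dense entry array, and split-table key sharing makes a per-instance `__dict__` often smaller than a regular dict with the same keys.

```python
import sys

# The compact dict stores a sparse index table plus a dense entry array,
# which is why iteration order is insertion order (a language guarantee
# since 3.7).
d = {}
for key in ("alpha", "beta", "gamma"):
    d[key] = len(key)
assert list(d) == ["alpha", "beta", "gamma"]


# Split-table sharing: instances of the same class can share one key table,
# so a per-instance __dict__ is often smaller than a regular dict holding
# the same keys. Exact sizes vary across CPython versions.
class Point:
    def __init__(self, x, y):
        self.x = x
        self.y = y


p = Point(1, 2)
print(sys.getsizeof(p.__dict__), sys.getsizeof({"x": 1, "y": 2}))
```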
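For Chapter 81's microbenchmark pitfalls, the standard-library timeit module already encodes two mitigations: batching many iterations per run so timer resolution stops dominating, and repeating runs so warm-up and scheduler noise can be discarded by taking the minimum. A minimal sketch with an arbitrary statement:

```python
import timeit

# number batches iterations per timing run (amortizing timer resolution);
# repeat gives several independent runs, of which the minimum is the
# conventional best-case estimate.
runs = timeit.repeat("sum(range(100))", repeat=5, number=10_000)
best = min(runs)
print(f"best of {len(runs)} runs: {best / 10_000 * 1e6:.3f} µs per call")
```

For serious work the pyperformance suite (built on pyperf) adds calibration, warm-up runs, and statistical reporting on top of this pattern.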