Branch Prediction
Branch prediction is a CPU optimization: the processor guesses the outcome of a branch before that outcome is actually known.
A branch is a point where the program can go in more than one direction. The most common branch is an if.
    if (value > 0) {
        positive += 1;
    } else {
        non_positive += 1;
    }
The CPU does not like waiting. When it sees a branch, it tries to guess which path the program will take. If the guess is right, the CPU keeps moving quickly. If the guess is wrong, the CPU throws away some work and restarts from the correct path.
That failed guess is called a branch misprediction.
Why Branch Prediction Exists
Modern CPUs do many things at once internally.
They fetch instructions, decode them, execute them, and prepare later instructions before earlier ones are fully finished. This is called pipelining.
A branch creates uncertainty.
    if (x == 0) {
        doA();
    } else {
        doB();
    }
Before the CPU knows whether x == 0, it still wants to keep working. So it predicts one path.
If it predicts correctly, the program runs smoothly.
If it predicts incorrectly, the CPU must discard the work it did on the wrong path.
Predictable Branches Are Fast
A branch is cheap when the result is predictable.
Example:
    for (numbers) |n| {
        if (n < 1000) {
            small += 1;
        }
    }
If almost every number is below 1000, the condition is almost always true, and the CPU learns that pattern.
The loop then runs fast.
This is also predictable:
    for (numbers) |n| {
        if (n >= 1000) {
            large += 1;
        }
    }
If almost no number is 1000 or greater, the CPU learns that the condition is almost always false.
The exact condition matters less than the pattern.
Random Branches Are Slow
A branch is harder to predict when the result looks random.
    for (numbers) |n| {
        if ((n & 1) == 0) {
            even += 1;
        } else {
            odd += 1;
        }
    }
If the input is random, the CPU may frequently guess wrong.
The branch result changes too often.
That can make the loop slower than expected.
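To make the idea of a learnable pattern concrete, here is a minimal sketch of a 2-bit saturating counter, a textbook predictor model. It is an assumption for illustration only, not how any particular CPU works: real predictors also track branch history and would learn a strictly alternating pattern. The name countMispredictions is made up for this example.

```zig
const std = @import("std");

/// Counts how often a 2-bit saturating counter mispredicts a sequence of
/// branch outcomes (true = taken). Toy model only, not a real CPU predictor.
fn countMispredictions(outcomes: []const bool) usize {
    // States 0 and 1 predict "not taken"; states 2 and 3 predict "taken".
    var counter: u2 = 0;
    var misses: usize = 0;
    for (outcomes) |taken| {
        const predicted_taken = counter >= 2;
        if (predicted_taken != taken) misses += 1;
        // Move the counter toward the actual outcome, saturating at 0 and 3.
        if (taken) {
            if (counter < 3) counter += 1;
        } else {
            if (counter > 0) counter -= 1;
        }
    }
    return misses;
}

test "steady outcomes versus alternating outcomes" {
    const steady = [_]bool{true} ** 8;
    const alternating = [_]bool{ true, false, true, false, true, false, true, false };
    // The steady run only pays for two warm-up misses.
    try std.testing.expectEqual(@as(usize, 2), countMispredictions(&steady));
    // The alternating run misses on every taken outcome.
    try std.testing.expectEqual(@as(usize, 4), countMispredictions(&alternating));
}
```

Under this toy model, a long run of identical outcomes costs almost nothing after warm-up, while an outcome that keeps flipping is mispredicted half the time.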
Sorting Can Improve Branch Prediction
Suppose you process a list of users:
    for (users) |user| {
        if (user.active) {
            processActive(user);
        } else {
            processInactive(user);
        }
    }
If active and inactive users are mixed randomly, the branch may be hard to predict.
If users are grouped by status, the branch becomes easier:
active
active
active
active
inactive
inactive
inactive
The CPU sees long runs of the same branch result.
Sometimes sorting or grouping data improves performance.
This is useful in:
- game engines
- simulations
- batch processing
- renderers
- parsers
- data pipelines
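Grouping does not require a full sort. The sketch below partitions a slice in place so that active users come first. The User fields and the name partitionActive are assumptions for this example, and the partitioning pass has a cost of its own, so it only pays off if the grouped data is processed often enough.

```zig
const std = @import("std");

/// Hypothetical user record for this example.
const User = struct {
    id: u32,
    active: bool,
};

/// Reorders `users` in place so that all active users come first and
/// returns how many are active. Inactive users may change relative order.
fn partitionActive(users: []User) usize {
    var next_active: usize = 0;
    for (0..users.len) |i| {
        if (users[i].active) {
            std.mem.swap(User, &users[next_active], &users[i]);
            next_active += 1;
        }
    }
    return next_active;
}

test "active users are moved to the front" {
    var users = [_]User{
        .{ .id = 1, .active = false },
        .{ .id = 2, .active = true },
        .{ .id = 3, .active = false },
        .{ .id = 4, .active = true },
        .{ .id = 5, .active = true },
    };
    const active_count = partitionActive(&users);
    try std.testing.expectEqual(@as(usize, 3), active_count);
    for (users[0..active_count]) |user| try std.testing.expect(user.active);
    for (users[active_count..]) |user| try std.testing.expect(!user.active);
}
```

After the call, a loop over users[0..active_count] and a second loop over users[active_count..] each see a single, constant branch outcome.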
Branches Inside Hot Loops Matter Most
Do not worry about every if.
Most branches do not matter.
This branch probably does not matter:
    if (config.verbose) {
        std.debug.print("started\n", .{});
    }
It runs once.
This branch may matter:
    for (pixels) |pixel| {
        if (pixel.alpha == 0) {
            transparent += 1;
        }
    }
It runs millions of times.
Branch prediction matters most inside hot loops.
Branchless Code
Sometimes you can replace a branch with arithmetic or selection.
Branch version:
    if (x > 0) {
        count += 1;
    }
Branchless version:
    count += @intFromBool(x > 0);
The condition still exists, but the code may avoid a control-flow branch.
This can help when the branch is unpredictable.
It can hurt when the branch is predictable or when the branchless form does extra work.
Measure before trusting it.
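The same idea extends from counting to selection. In this sketch the condition becomes a 0-or-1 multiplier that decides whether a value contributes to a sum. The function names are made up for illustration, and the optimizer is free to compile both versions to the same machine code, so treat this as a shape to measure, not a guaranteed win.

```zig
const std = @import("std");

/// Sums only the positive values, using a conditional branch.
fn sumPositiveBranching(values: []const i32) i64 {
    var sum: i64 = 0;
    for (values) |x| {
        if (x > 0) sum += x;
    }
    return sum;
}

/// Same result, but the condition is turned into a 0-or-1 multiplier.
fn sumPositiveBranchless(values: []const i32) i64 {
    var sum: i64 = 0;
    for (values) |x| {
        const keep: i64 = @intFromBool(x > 0);
        sum += keep * x;
    }
    return sum;
}

test "both versions agree" {
    const values = [_]i32{ 3, -1, 0, 7, -5 };
    try std.testing.expectEqual(@as(i64, 10), sumPositiveBranching(&values));
    try std.testing.expectEqual(@as(i64, 10), sumPositiveBranchless(&values));
}
```

Note that the branchless version always performs the multiply and add, even for values it discards. That is the extra work that can make it slower on predictable data.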
Example: Counting Values
Branching version:
    fn countPositive(values: []const i32) usize {
        var count: usize = 0;
        for (values) |x| {
            if (x > 0) {
                count += 1;
            }
        }
        return count;
    }
Branchless version:
    fn countPositive(values: []const i32) usize {
        var count: usize = 0;
        for (values) |x| {
            count += @intFromBool(x > 0);
        }
        return count;
    }
The second version may be faster for random data.
But for highly predictable data, the first version may be just as fast or faster.
Avoid Work in Rare Branches
Sometimes a branch is rare but expensive.
    for (items) |item| {
        if (item.is_error) {
            try handleError(item);
        }
        process(item);
    }
If errors are rare, this is usually fine. The CPU learns that the condition is usually false.
But the code layout can still matter in hot paths. Keep common paths simple and rare paths separate when possible.
A common style, when an error item needs no further processing once it has been handled, is:
    for (items) |item| {
        if (item.is_error) {
            try handleError(item);
            continue;
        }
        processNormal(item);
    }
This makes the normal path easier to read and sometimes easier for the compiler to optimize.
Early Exits Can Help Clarity
Early exits often make code clearer:
    fn process(item: Item) !void {
        if (!item.valid) {
            return error.InvalidItem;
        }
        try processValid(item);
    }
The error case is handled first.
The main path continues without deep nesting.
This is not only a performance style. It is also a readability style.
Branch Prediction and Error Handling
Zig error handling is explicit.
    const value = parseNumber(text) catch |err| {
        return err;
    };
In most programs, most calls succeed.
So error paths are often rare branches.
That is usually good for branch prediction. The normal path is common. The error path is uncommon.
This does not mean errors are free. It means Zig’s explicit error model can still produce predictable normal paths when failures are rare.
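As a small illustration of a rare error branch inside a loop, the sketch below parses a list of strings with std.fmt.parseInt and uses try, which is shorthand for catching an error and returning it to the caller. The function name sumNumbers is an assumption for this example.

```zig
const std = @import("std");

/// Parses each text as a base-10 integer and returns the total.
/// On the first failure, `try` returns the error to the caller.
fn sumNumbers(texts: []const []const u8) !i64 {
    var total: i64 = 0;
    for (texts) |text| {
        // Shorthand for: std.fmt.parseInt(i64, text, 10) catch |err| return err;
        const value = try std.fmt.parseInt(i64, text, 10);
        total += value;
    }
    return total;
}

test "the normal path succeeds and the rare path returns an error" {
    const good = [_][]const u8{ "10", "20", "30" };
    try std.testing.expectEqual(@as(i64, 60), try sumNumbers(&good));

    const bad = [_][]const u8{ "10", "x" };
    try std.testing.expectError(error.InvalidCharacter, sumNumbers(&bad));
}
```

When nearly all inputs are valid, the error check inside the loop has the same outcome on almost every iteration.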
Switch Statements
A switch is also a branch.
    switch (token.kind) {
        .identifier => handleIdentifier(token),
        .number => handleNumber(token),
        .string => handleString(token),
        else => handleOther(token),
    }
If one case is very common, prediction may work well.
If cases are random, prediction may be harder.
For small enums, the compiler may generate efficient jump tables or comparison chains. You should usually write the clearest switch first.
Optimize only when profiling shows it matters.
Function Pointers and Indirect Branches
Indirect calls can be harder to predict.
Example:
    const Handler = *const fn (Item) void;
    fn run(items: []const Item, handler: Handler) void {
        for (items) |item| {
            handler(item);
        }
    }
The CPU must predict where the function pointer will go.
If the target function changes often, prediction becomes harder.
This matters in:
- plugin systems
- virtual dispatch patterns
- interpreters
- event systems
A direct call is usually easier to optimize than an indirect call.
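One way to get a direct call in Zig is to pass the function as a comptime parameter instead of a runtime pointer. This is a minimal sketch with made-up names (double, sumMappedIndirect, sumMappedDirect). The optimizer may still turn the pointer version into a direct call when it can see the target, so measure before assuming a difference.

```zig
const std = @import("std");

fn double(x: i32) i32 {
    return x * 2;
}

/// Runtime function pointer: the call target is only known at run time,
/// unless the optimizer can prove what it is.
fn sumMappedIndirect(values: []const i32, f: *const fn (i32) i32) i64 {
    var sum: i64 = 0;
    for (values) |x| {
        sum += f(x);
    }
    return sum;
}

/// Compile-time function parameter: each instantiation knows its target,
/// so the call is direct and can be inlined.
fn sumMappedDirect(values: []const i32, comptime f: fn (i32) i32) i64 {
    var sum: i64 = 0;
    for (values) |x| {
        sum += f(x);
    }
    return sum;
}

test "both dispatch styles compute the same sum" {
    const values = [_]i32{ 1, 2, 3 };
    try std.testing.expectEqual(@as(i64, 12), sumMappedIndirect(&values, double));
    try std.testing.expectEqual(@as(i64, 12), sumMappedDirect(&values, double));
}
```

The trade-off is code size: the comptime version is instantiated once per distinct function passed to it.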
Interpreters and Branch Prediction
Interpreters often use a loop like this:
    while (true) {
        switch (bytecode[ip]) {
            .add => {},
            .sub => {},
            .load => {},
            .store => {},
            .halt => break,
        }
        ip += 1;
    }
This dispatch loop branches constantly.
Branch prediction can strongly affect interpreter performance.
Common optimization strategies include:
- grouping common opcodes
- reducing dispatch overhead
- using direct threading where available
- specializing hot instruction sequences
- compiling bytecode to native code
For ordinary Zig programs, you do not need these techniques immediately. But they show how important branch behavior can be in systems code.
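For readers who want to see a complete dispatch loop, here is a deliberately tiny interpreter. The opcode set and the single-accumulator machine are hypothetical, chosen only to keep the sketch short; it shows the basic switch dispatch, not any of the optimizations listed above.

```zig
const std = @import("std");

/// Hypothetical opcodes for a one-register machine (illustration only).
const Op = enum { inc, dec, double, halt };

/// Runs `code` on an accumulator that starts at zero and returns its final value.
fn run(code: []const Op) i64 {
    var acc: i64 = 0;
    var ip: usize = 0;
    while (ip < code.len) : (ip += 1) {
        // One dispatch branch per executed instruction.
        switch (code[ip]) {
            .inc => acc += 1,
            .dec => acc -= 1,
            .double => acc *= 2,
            .halt => break,
        }
    }
    return acc;
}

test "the program stops at halt" {
    const program = [_]Op{ .inc, .inc, .double, .dec, .halt, .inc };
    // 0 -> 1 -> 2 -> 4 -> 3, then halt; the final inc never runs.
    try std.testing.expectEqual(@as(i64, 3), run(&program));
}
```

Every iteration goes through the same switch, and its target depends on the bytecode being executed. That is why the opcode mix of a program affects how well this branch predicts.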
Branches and Data Layout
Branch prediction is not only about code.
It is also about data.
This layout may cause unpredictable branching:
    const Entity = struct {
        active: bool,
        position: Vec2,
        velocity: Vec2,
    };
If active and inactive entities are mixed randomly:
    for (entities) |entity| {
        if (entity.active) {
            update(entity);
        }
    }
A better layout may keep active entities in a separate list:
    for (active_entities) |entity| {
        update(entity);
    }
Now the branch disappears entirely.
This is often better than trying to make the branch faster.
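One possible way to maintain such a list is to keep active entities packed at the front of a single array and swap an entity out of that range when it becomes inactive. The EntityPool type, its field names, and the id field are assumptions for this sketch, and swapping changes entity order, which may not suit every program.

```zig
const std = @import("std");

const Vec2 = struct { x: f32, y: f32 };

const Entity = struct {
    id: u32,
    position: Vec2,
    velocity: Vec2,
};

/// Hypothetical container: the first `active_len` entries of `items` are active.
const EntityPool = struct {
    items: []Entity,
    active_len: usize,

    /// The slice to iterate in the hot loop; no `active` flag is needed.
    fn active(self: *EntityPool) []Entity {
        return self.items[0..self.active_len];
    }

    /// Deactivates the entity at `index` by swapping it with the last active one.
    fn deactivate(self: *EntityPool, index: usize) void {
        std.debug.assert(index < self.active_len);
        self.active_len -= 1;
        std.mem.swap(Entity, &self.items[index], &self.items[self.active_len]);
    }
};

test "a deactivated entity leaves the hot loop" {
    const zero = Vec2{ .x = 0, .y = 0 };
    const right = Vec2{ .x = 1, .y = 0 };
    var storage = [_]Entity{
        .{ .id = 1, .position = zero, .velocity = right },
        .{ .id = 2, .position = zero, .velocity = right },
        .{ .id = 3, .position = zero, .velocity = right },
        .{ .id = 4, .position = zero, .velocity = right },
    };
    var pool = EntityPool{ .items = &storage, .active_len = storage.len };

    pool.deactivate(1); // entity 2 swaps places with entity 4

    // The update loop has no per-entity branch.
    for (pool.active()) |*entity| {
        entity.position.x += entity.velocity.x;
    }

    try std.testing.expectEqual(@as(usize, 3), pool.active().len);
    try std.testing.expectEqual(@as(u32, 4), pool.active()[1].id);
    try std.testing.expectEqual(@as(f32, 1), storage[0].position.x);
    try std.testing.expectEqual(@as(f32, 0), storage[3].position.x); // entity 2 was skipped
}
```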
Remove Branches by Changing Data
The best branch optimization is sometimes not branchless arithmetic.
It is better data organization.
Instead of:
    for (jobs) |job| {
        if (job.kind == .image) {
            processImage(job);
        } else if (job.kind == .text) {
            processText(job);
        }
    }
You may store separate queues:
    for (image_jobs) |job| {
        processImage(job);
    }
    for (text_jobs) |job| {
        processText(job);
    }
This removes repeated type checks from the hot loop.
It can also improve cache locality, because similar data is processed together.
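If the jobs arrive mixed, they can be split once up front. The sketch below copies jobs into caller-provided buffers, one per kind. The Job fields, the Split type, and the name splitByKind are assumptions for this example. The split still checks each job's kind once, so it helps most when the queues are processed repeatedly or the per-kind work is heavy.

```zig
const std = @import("std");

const JobKind = enum { image, text };

/// Hypothetical job record for this example.
const Job = struct {
    id: u32,
    kind: JobKind,
};

/// The filled portions of the two caller-provided buffers.
const Split = struct {
    image: []Job,
    text: []Job,
};

/// Copies each job into the buffer for its kind.
/// Both buffers must hold at least `jobs.len` entries.
fn splitByKind(jobs: []const Job, image_buf: []Job, text_buf: []Job) Split {
    var n_image: usize = 0;
    var n_text: usize = 0;
    for (jobs) |job| {
        switch (job.kind) {
            .image => {
                image_buf[n_image] = job;
                n_image += 1;
            },
            .text => {
                text_buf[n_text] = job;
                n_text += 1;
            },
        }
    }
    return .{ .image = image_buf[0..n_image], .text = text_buf[0..n_text] };
}

test "jobs are split into one queue per kind" {
    const jobs = [_]Job{
        .{ .id = 1, .kind = .image },
        .{ .id = 2, .kind = .text },
        .{ .id = 3, .kind = .image },
    };
    var image_buf: [jobs.len]Job = undefined;
    var text_buf: [jobs.len]Job = undefined;
    const queues = splitByKind(&jobs, &image_buf, &text_buf);

    try std.testing.expectEqual(@as(usize, 2), queues.image.len);
    try std.testing.expectEqual(@as(usize, 1), queues.text.len);
    try std.testing.expectEqual(@as(u32, 3), queues.image[1].id);
    try std.testing.expectEqual(@as(u32, 2), queues.text[0].id);
}
```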
comptime Parameters and Compile-Time Branch Removal
Some branches can disappear at compile time.
Example:
    fn process(comptime debug: bool, value: i32) void {
        if (debug) {
            std.debug.print("value = {}\n", .{value});
        }
        use(value);
    }
If debug is known at compile time, Zig can remove the unused branch.
This is one of the reasons comptime is powerful.
Runtime flexibility has a cost. Compile-time knowledge can remove that cost.
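Here is a self-contained sketch of the same technique applied to a loop. The function sumValues and its skip_negative flag are made up for this example. Because the flag is a comptime parameter, the outer if is resolved when the function is instantiated, and only the data-dependent test remains at run time.

```zig
const std = @import("std");

/// Sums `values`. When `skip_negative` is true, negative values are ignored.
/// The flag is `comptime`, so each instantiation contains only the code
/// for its own setting.
fn sumValues(comptime skip_negative: bool, values: []const i32) i64 {
    var sum: i64 = 0;
    for (values) |x| {
        if (skip_negative) {
            // This inner test still depends on runtime data.
            if (x < 0) continue;
        }
        sum += x;
    }
    return sum;
}

test "each instantiation has its own behavior" {
    const values = [_]i32{ 5, -2, 3 };
    try std.testing.expectEqual(@as(i64, 6), sumValues(false, &values));
    try std.testing.expectEqual(@as(i64, 8), sumValues(true, &values));
}
```

The call sumValues(false, ...) compiles to a plain summing loop with no flag check inside it.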
Do Not Overuse Branch Tricks
Branch optimization can make code ugly.
For example, replacing every if with clever arithmetic is usually bad.
Clear code is easier to maintain.
Use branch tricks only when:
- profiling shows a hot branch
- the branch is unpredictable
- the replacement is measurably faster
- the code remains understandable
Performance work should be evidence-based.
Practical Rules
Write the obvious branch first.
    if (condition) {
        doThing();
    }
Then measure.
If profiling shows branch misprediction is a real problem, consider:
- grouping data by branch outcome
- moving rare cases out of hot paths
- splitting mixed loops into separate loops
- using branchless arithmetic for simple conditions
- replacing indirect calls with direct calls
- using compile-time parameters to remove branches
Most programs do not need manual branch prediction tuning everywhere.
But performance-critical programs benefit from understanding the idea.
Mental Model
A branch asks the CPU to guess.
Predictable branches are cheap.
Random branches are expensive.
The strongest optimization is often to organize data so the branch becomes predictable or disappears.
In Zig, you have enough control over data layout, control flow, and compile-time parameters to make that possible.