Branch Prediction

Branch prediction is a CPU optimization. The CPU guesses the outcome of a branch before the condition has actually been evaluated.

A branch is a point where the program can go in more than one direction. The most common branch is an if.

if (value > 0) {
    positive += 1;
} else {
    non_positive += 1;
}

The CPU does not like waiting. When it sees a branch, it tries to guess which path the program will take. If the guess is right, the CPU keeps moving quickly. If the guess is wrong, the CPU throws away some work and restarts from the correct path.

That failed guess is called a branch misprediction.

Why Branch Prediction Exists

Modern CPUs do many things at once internally.

They fetch instructions, decode them, execute them, and prepare later instructions before earlier ones are fully finished. This is called pipelining.

A branch creates uncertainty.

if (x == 0) {
    doA();
} else {
    doB();
}

Before the CPU knows whether x == 0, it still wants to keep working. So it predicts one path.

If it predicts correctly, the program runs smoothly.

If it predicts incorrectly, the CPU must discard the work it did on the wrong path.

Predictable Branches Are Fast

A branch is cheap when the result is predictable.

Example:

for (numbers) |n| {
    if (n < 1000) {
        small += 1;
    }
}

If almost every number is below 1000, the condition is almost always true, and the CPU learns to predict that outcome.

That becomes fast.

This is also predictable:

for (numbers) |n| {
    if (n >= 1000) {
        large += 1;
    }
}

If almost no number is 1000 or above, the CPU learns that the condition is almost always false.

The exact condition matters less than the pattern.

Random Branches Are Slow

A branch is harder to predict when the result looks random.

for (numbers) |n| {
    if ((n & 1) == 0) {
        even += 1;
    } else {
        odd += 1;
    }
}

If the input is random, the CPU may frequently guess wrong.

The branch result changes too often.

That can make the loop slower than expected.

Sorting Can Improve Branch Prediction

Suppose you process a list of users:

for (users) |user| {
    if (user.active) {
        processActive(user);
    } else {
        processInactive(user);
    }
}

If active and inactive users are mixed randomly, the branch may be hard to predict.

If users are grouped by status, the branch becomes easier:

active
active
active
active
inactive
inactive
inactive

The CPU sees long runs of the same branch result.

Sometimes sorting or grouping data improves performance.

This is useful in:

game engines
simulations
batch processing
renderers
parsers
data pipelines
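The grouping described above can be sketched as an in-place partition. This is a minimal illustration, not a tuned implementation. The User type and the partitionByActive function are hypothetical names invented for this example.

```zig
const std = @import("std");

// Hypothetical record type, used only for this illustration.
const User = struct {
    id: u32,
    active: bool,
};

/// Moves every active user to the front of the slice and returns
/// how many active users there are. Active users keep their relative
/// order; inactive users may be reordered.
fn partitionByActive(users: []User) usize {
    var next_active: usize = 0;
    var i: usize = 0;
    while (i < users.len) : (i += 1) {
        if (users[i].active) {
            std.mem.swap(User, &users[next_active], &users[i]);
            next_active += 1;
        }
    }
    return next_active;
}
```

After the call, users[0..n] holds the active users and users[n..] holds the inactive ones, where n is the returned count. A later loop over the whole slice sees one long run of each outcome. The partition pass has its own cost, so it only pays off when the grouped data is processed often enough.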

Branches Inside Hot Loops Matter Most

Do not worry about every if.

Most branches do not matter.

This branch probably does not matter:

if (config.verbose) {
    std.debug.print("started\n", .{});
}

It runs once.

This branch may matter:

for (pixels) |pixel| {
    if (pixel.alpha == 0) {
        transparent += 1;
    }
}

It runs millions of times.

Branch prediction matters most inside hot loops.

Branchless Code

Sometimes you can replace a branch with arithmetic or selection.

Branch version:

if (x > 0) {
    count += 1;
}

Branchless version:

count += @intFromBool(x > 0);

The condition still exists, but the code may avoid a control-flow branch.

This can help when the branch is unpredictable.

It can hurt when the branch is predictable or when the branchless form does extra work.

Measure before trusting it.
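The example above uses arithmetic. A selection form can be sketched like this, using the @max builtin to pick between x and 0. The function names are hypothetical, and whether the compiler actually emits a branch for either version depends on the target and optimization mode.

```zig
const std = @import("std");

/// Branching form: adds x only when it is positive.
fn sumPositiveBranching(values: []const i32) i64 {
    var sum: i64 = 0;
    for (values) |x| {
        if (x > 0) {
            sum += x;
        }
    }
    return sum;
}

/// Selection form: @max picks either x or 0, so the loop body
/// contains no if statement.
fn sumPositiveSelect(values: []const i32) i64 {
    var sum: i64 = 0;
    for (values) |x| {
        sum += @max(x, 0);
    }
    return sum;
}
```

Both functions return the same result for the same input. Only the shape of the loop body differs.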

Example: Counting Values

Branching version:

fn countPositive(values: []const i32) usize {
    var count: usize = 0;

    for (values) |x| {
        if (x > 0) {
            count += 1;
        }
    }

    return count;
}

Branchless version:

fn countPositive(values: []const i32) usize {
    var count: usize = 0;

    for (values) |x| {
        count += @intFromBool(x > 0);
    }

    return count;
}

The second version may be faster for random data.

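But for highly predictable data, the first version may be just as fast or faster. An optimizing compiler may also turn the branching version into branchless machine code on its own, so the two versions can end up compiling to the same instructions.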

Avoid Work in Rare Branches

Sometimes a branch is rare but expensive.

for (items) |item| {
    if (item.is_error) {
        try handleError(item);
    }

    process(item);
}

If errors are rare, this is usually fine. The CPU learns that the condition is usually false.

But the code layout can still matter in hot paths. Keep common paths simple and rare paths separate when possible.

A common style is:

for (items) |item| {
    if (item.is_error) {
        try handleError(item);
        continue;
    }

    processNormal(item);
}

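This makes the normal path easier to read and sometimes easier for the compiler to optimize. Note that this version skips normal processing for error items, unlike the first loop, which called process on every item.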
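One way to keep the rare path separate is to move it into its own function. The sketch below is a hedged illustration. The Item and Totals types and the recordError function are hypothetical. The noinline keyword asks the compiler not to copy the rare path into the loop body; whether that helps must be measured.

```zig
const std = @import("std");

// Hypothetical types for this sketch.
const Item = struct {
    value: i32,
    is_error: bool,
};

const Totals = struct {
    sum: i64 = 0,
    errors: usize = 0,
};

// Rare path, kept in its own function. noinline prevents the compiler
// from inlining this body into the hot loop.
noinline fn recordError(totals: *Totals) void {
    totals.errors += 1;
}

fn processAll(items: []const Item) Totals {
    var totals = Totals{};
    for (items) |item| {
        if (item.is_error) {
            recordError(&totals);
            continue;
        }
        // Common path: short and simple.
        totals.sum += item.value;
    }
    return totals;
}
```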

Early Exits Can Help Clarity

Early exits often make code clearer:

fn process(item: Item) !void {
    if (!item.valid) {
        return error.InvalidItem;
    }

    try processValid(item);
}

The error case is handled first.

The main path continues without deep nesting.

This is not only a performance style. It is also a readability style.

Branch Prediction and Error Handling

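Zig error handling is explicit. The catch below only passes the error on to the caller, so it behaves the same as writing try parseNumber(text).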

const value = parseNumber(text) catch |err| {
    return err;
};

Most successful code paths do not fail.

So error paths are often rare branches.

That is usually good for branch prediction. The normal path is common. The error path is uncommon.

This does not mean errors are free. It means Zig’s explicit error model can still produce predictable normal paths when failures are rare.
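A small sketch of a loop where the error path is the rare branch. It uses std.fmt.parseInt from the standard library. The ParseSummary type and the sumNumbers function are hypothetical names for this example.

```zig
const std = @import("std");

// Hypothetical result type for this sketch.
const ParseSummary = struct {
    total: i64,
    failures: usize,
};

/// Parses each string as a base-10 integer and sums the results.
/// Failures are counted instead of being returned to the caller.
fn sumNumbers(texts: []const []const u8) ParseSummary {
    var summary = ParseSummary{ .total = 0, .failures = 0 };
    for (texts) |text| {
        const value = std.fmt.parseInt(i32, text, 10) catch {
            // Rare path: taken only when the input is malformed.
            summary.failures += 1;
            continue;
        };
        // Common path.
        summary.total += value;
    }
    return summary;
}
```

If nearly every string parses, the catch block almost never runs, and the branch that guards it is easy to predict.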

Switch Statements

A switch is also a branch.

switch (token.kind) {
    .identifier => handleIdentifier(token),
    .number => handleNumber(token),
    .string => handleString(token),
    else => handleOther(token),
}

If one case is very common, prediction may work well.

If cases are random, prediction may be harder.

For small enums, the compiler may generate efficient jump tables or comparisons. You should usually write the clearest switch first.

Optimize only when profiling shows it matters.

Function Pointers and Indirect Branches

Indirect calls can be harder to predict.

Example:

const Handler = *const fn (Item) void;

fn run(items: []const Item, handler: Handler) void {
    for (items) |item| {
        handler(item);
    }
}

The CPU must predict where the function pointer will go.

If the target function changes often, prediction becomes harder.

This matters in:

plugin systems
virtual dispatch patterns
interpreters
event systems

A direct call is usually easier to optimize than an indirect call.
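When the handler is known at compile time, Zig can take it as a comptime parameter instead of a runtime pointer. The sketch below shows both forms side by side. The Item type and the double, sumIndirect, and sumDirect functions are hypothetical names for this example.

```zig
const std = @import("std");

// Hypothetical item type and handler, used only for illustration.
const Item = struct {
    value: i32,
};

fn double(item: Item) i64 {
    return @as(i64, item.value) * 2;
}

/// Indirect form: the call target is a pointer chosen at runtime.
fn sumIndirect(items: []const Item, handler: *const fn (Item) i64) i64 {
    var total: i64 = 0;
    for (items) |item| {
        total += handler(item);
    }
    return total;
}

/// Direct form: the handler is a compile-time parameter, so each
/// instantiation of this function calls one known function.
fn sumDirect(items: []const Item, comptime handler: fn (Item) i64) i64 {
    var total: i64 = 0;
    for (items) |item| {
        total += handler(item);
    }
    return total;
}
```

A call looks like sumIndirect(items, &double) for the pointer form and sumDirect(items, double) for the comptime form. The comptime form produces a separate copy of the function for each handler, which trades code size for direct calls.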

Interpreters and Branch Prediction

Interpreters often use a loop like this:

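var ip: usize = 0;
while (true) {
    const op = bytecode[ip];
    ip += 1; // advance past the opcode before dispatching

    switch (op) {
        .add => {}, // handler bodies omitted
        .sub => {},
        .load => {},
        .store => {},
        .halt => break, // leaves the while loop
    }
}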

This dispatch loop branches constantly.

Branch prediction can strongly affect interpreter performance.

Common optimization strategies include:

grouping common opcodes
reducing dispatch overhead
using direct threading where available
specializing hot instruction sequences
compiling bytecode to native code

For ordinary Zig programs, you do not need these techniques immediately. But they show how important branch behavior can be in systems code.
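To make the dispatch loop concrete, here is a minimal runnable sketch of a tiny accumulator machine. The Op enum and its four opcodes are invented for this example and do not correspond to any real bytecode format.

```zig
const std = @import("std");

// Hypothetical opcodes for a toy accumulator machine.
const Op = enum {
    inc,
    dec,
    double,
    halt,
};

/// Runs the program and returns the final accumulator value.
/// Execution stops at halt or at the end of the program.
fn run(program: []const Op) i64 {
    var acc: i64 = 0;
    var ip: usize = 0;
    while (ip < program.len) : (ip += 1) {
        // One dispatch branch per executed instruction.
        switch (program[ip]) {
            .inc => acc += 1,
            .dec => acc -= 1,
            .double => acc *= 2,
            .halt => break,
        }
    }
    return acc;
}
```

Every instruction passes through the same switch, so the CPU has to predict the next opcode handler on every iteration. That is why opcode order and frequency matter for interpreter speed.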

Branches and Data Layout

Branch prediction is not only about code.

It is also about data.

This layout may cause unpredictable branching:

const Entity = struct {
    active: bool,
    position: Vec2,
    velocity: Vec2,
};

If active and inactive entities are mixed randomly, the branch in this loop is hard to predict:

for (entities) |entity| {
    if (entity.active) {
        update(entity);
    }
}

A better layout may keep active entities in a separate list:

for (active_entities) |entity| {
    update(entity);
}

Now the branch disappears entirely.

This is often better than trying to make the branch faster.

Remove Branches by Changing Data

The best branch optimization is sometimes not branchless arithmetic.

It is better data organization.

Instead of:

for (jobs) |job| {
    if (job.kind == .image) {
        processImage(job);
    } else if (job.kind == .text) {
        processText(job);
    }
}

You may store separate queues:

for (image_jobs) |job| {
    processImage(job);
}

for (text_jobs) |job| {
    processText(job);
}

This removes repeated type checks from the hot loop.

It can also improve cache locality, because similar data is processed together.
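The separate queues have to come from somewhere. One possible sketch is a single splitting pass that copies each job into a caller-provided buffer for its kind. The Job, JobKind, and Split types and the splitJobs function are hypothetical, and the sketch assumes each buffer is at least as long as the input.

```zig
const std = @import("std");

// Hypothetical job types for this sketch.
const JobKind = enum { image, text };

const Job = struct {
    kind: JobKind,
    id: u32,
};

const Split = struct {
    images: []Job,
    texts: []Job,
};

/// Copies each job into the buffer for its kind and returns the
/// filled parts of both buffers. Assumes both buffers can hold
/// jobs.len entries.
fn splitJobs(jobs: []const Job, image_buf: []Job, text_buf: []Job) Split {
    std.debug.assert(image_buf.len >= jobs.len);
    std.debug.assert(text_buf.len >= jobs.len);

    var image_count: usize = 0;
    var text_count: usize = 0;
    for (jobs) |job| {
        switch (job.kind) {
            .image => {
                image_buf[image_count] = job;
                image_count += 1;
            },
            .text => {
                text_buf[text_count] = job;
                text_count += 1;
            },
        }
    }
    return .{
        .images = image_buf[0..image_count],
        .texts = text_buf[0..text_count],
    };
}
```

The kind check still runs once per job during the split. The gain comes when the processing loops run many times over the same queues, or when jobs can be placed in the right queue at creation time.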

inline for and Compile-Time Branch Removal

Some branches can disappear at compile time.

Example:

fn process(comptime debug: bool, value: i32) void {
    if (debug) {
        std.debug.print("value = {}\n", .{value});
    }

    use(value);
}

Because debug is a comptime parameter, Zig evaluates the if at compile time and emits no code for the unused branch.

This is one of the reasons comptime is powerful.

Runtime flexibility has a cost. Compile-time knowledge can remove that cost.
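The same idea applies to loops with a compile-time trip count. An inline for is unrolled at compile time, so the loop counter and its end-of-loop branch do not exist at runtime. The dot4 function below is a hypothetical example, and an optimizing compiler may unroll an ordinary for loop on its own.

```zig
const std = @import("std");

/// Dot product of two 4-element vectors. The loop is unrolled at
/// compile time, so i is a compile-time constant in each copy of
/// the body.
fn dot4(a: [4]f32, b: [4]f32) f32 {
    var sum: f32 = 0;
    inline for (0..4) |i| {
        sum += a[i] * b[i];
    }
    return sum;
}
```

Use inline for when the unrolling is needed for correctness or is shown to help. It increases code size, so it is not a default choice.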

Do Not Overuse Branch Tricks

Branch optimization can make code ugly.

For example, replacing every if with clever arithmetic is usually bad.

Clear code is easier to maintain.

Use branch tricks only when:

profiling shows a hot branch
the branch is unpredictable
the replacement is measurably faster
the code remains understandable

Performance work should be evidence-based.

Practical Rules

Write the obvious branch first.

if (condition) {
    doThing();
}

Then measure.

If profiling shows branch misprediction is a real problem, consider:

grouping data by branch outcome
moving rare cases out of hot paths
splitting mixed loops into separate loops
using branchless arithmetic for simple conditions
replacing indirect calls with direct calls
using compile-time parameters to remove branches

Most programs do not need manual branch prediction tuning everywhere.

But performance-critical programs benefit from understanding the idea.

Mental Model

A branch asks the CPU to guess.

Predictable branches are cheap.

Random branches are expensive.

The strongest optimization is often to organize data so the branch becomes predictable or disappears.

In Zig, you have enough control over data layout, control flow, and compile-time parameters to make that possible.
