# Performance Case Studies

Performance ideas become clearer when you see them inside real code.

This section walks through several small case studies. Each one starts with simple code, finds the likely cost, and improves the design.

## Case Study 1: Reusing a Temporary Buffer

A common slow pattern is allocating temporary memory inside a loop.

```zig
fn processAll(allocator: std.mem.Allocator, items: []const Item) !void {
    for (items) |item| {
        const scratch = try allocator.alloc(u8, 4096);
        defer allocator.free(scratch);

        try processOne(item, scratch);
    }
}
```

This allocates and frees a 4096-byte buffer once per item.

A better version allocates once, before the loop:

```zig
fn processAll(allocator: std.mem.Allocator, items: []const Item) !void {
    const scratch = try allocator.alloc(u8, 4096);
    defer allocator.free(scratch);

    for (items) |item| {
        try processOne(item, scratch);
    }
}
```

The improvement is simple: move allocation out of the hot loop.

The important question is lifetime. If `processOne` only needs the memory during each call, and never keeps a pointer into `scratch` after returning, one reused buffer is enough.
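As a sketch of a safe call pattern (this body is hypothetical, using `bufPrint` as a stand-in for real work): the function writes into `scratch` and consumes the result before returning, so nothing outlives the call.

```zig
const std = @import("std");

const Item = struct { id: u64 };

// Hypothetical processOne: writes into scratch and consumes the result
// before returning, so the caller may reuse the buffer immediately.
fn processOne(item: Item, scratch: []u8) !void {
    const line = try std.fmt.bufPrint(scratch, "item={}\n", .{item.id});
    std.debug.print("{s}", .{line});
}
```

If `processOne` instead stored a slice of `scratch` somewhere, reusing one buffer would silently corrupt earlier results.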

## Case Study 2: Avoiding Copies in a Parser

A parser often reads a large input buffer and extracts tokens.

Bad design:

```zig
const Token = struct {
    text: []u8,
};

fn makeToken(allocator: std.mem.Allocator, input: []const u8) !Token {
    return .{
        .text = try allocator.dupe(u8, input),
    };
}
```

This copies every token.

A better design stores a slice into the original input:

```zig
const Token = struct {
    text: []const u8,
};

fn makeToken(input: []const u8, start: usize, end: usize) Token {
    return .{
        .text = input[start..end],
    };
}
```

No allocation. No copy.

The cost is a lifetime rule: the original input must remain alive while the tokens are used.

That is usually acceptable for parsers. Read the file once, keep the buffer alive, and let tokens point into it.
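A minimal sketch of that discipline, with hard-coded token boundaries for illustration. The only rule is ordering: the input buffer is freed after the last token use.

```zig
const std = @import("std");

const Token = struct { text: []const u8 };

fn makeToken(input: []const u8, start: usize, end: usize) Token {
    return .{ .text = input[start..end] };
}

pub fn main() !void {
    var gpa = std.heap.GeneralPurposeAllocator(.{}){};
    defer _ = gpa.deinit();
    const allocator = gpa.allocator();

    // The input buffer must outlive every token that points into it.
    const input = try allocator.dupe(u8, "let x = 42");
    defer allocator.free(input);

    const tokens = [_]Token{
        makeToken(input, 0, 3), // "let"
        makeToken(input, 4, 5), // "x"
    };

    for (tokens) |token| {
        std.debug.print("{s}\n", .{token.text});
    }
    // Only after the last token use may `input` be freed.
}
```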

## Case Study 3: Reserving ArrayList Capacity

Dynamic arrays grow when they need more room.

```zig
fn collect(allocator: std.mem.Allocator, values: []const u32) !std.ArrayList(u32) {
    var list = std.ArrayList(u32).init(allocator);
    errdefer list.deinit();

    for (values) |v| {
        try list.append(v);
    }

    return list;
}
```

This works, but `append` may reallocate several times.

If you know the final size, reserve capacity:

```zig
fn collect(allocator: std.mem.Allocator, values: []const u32) !std.ArrayList(u32) {
    var list = std.ArrayList(u32).init(allocator);
    errdefer list.deinit();

    try list.ensureTotalCapacity(values.len);

    for (values) |v| {
        try list.append(v);
    }

    return list;
}
```

Now the list has enough memory before the loop begins.

This reduces allocation count and avoids repeated copying during growth.
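If the capacity is guaranteed, the per-item growth check can go too: `appendAssumeCapacity` asserts that room already exists instead of checking for it. A sketch of the same function:

```zig
const std = @import("std");

fn collect(allocator: std.mem.Allocator, values: []const u32) !std.ArrayList(u32) {
    var list = std.ArrayList(u32).init(allocator);
    errdefer list.deinit();

    // Reserve everything up front, then append without per-item checks.
    try list.ensureTotalCapacity(values.len);
    for (values) |v| {
        list.appendAssumeCapacity(v); // no error path, no growth check
    }

    return list;
}
```

The tradeoff is safety: appending past the reserved capacity is illegal behavior, so this form belongs only where the size really is known.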

## Case Study 4: Hot and Cold Data

Suppose you have a user record:

```zig
const User = struct {
    id: u64,
    age: u8,
    name: [128]u8,
    email: [256]u8,
};
```

Now you often run this:

```zig
fn countAdults(users: []const User) usize {
    var count: usize = 0;

    for (users) |user| {
        if (user.age >= 18) {
            count += 1;
        }
    }

    return count;
}
```

The loop only needs `age`, but each `User` is large. The CPU may pull unrelated name and email data into cache.

A better layout separates frequently used fields:

```zig
const UserHot = struct {
    id: u64,
    age: u8,
};

const UserCold = struct {
    name: [128]u8,
    email: [256]u8,
};
```

Then keep parallel arrays or indexed records.

```zig
const Users = struct {
    hot: []UserHot,
    cold: []UserCold,
};
```

Now the adult-counting loop touches only compact hot data.

This can improve cache locality in large datasets.
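With the split layout above, the counting loop reads only the hot array. A sketch reusing the `Users` struct:

```zig
const UserHot = struct { id: u64, age: u8 };
const UserCold = struct { name: [128]u8, email: [256]u8 };

const Users = struct {
    hot: []UserHot,
    cold: []UserCold,
};

fn countAdults(users: Users) usize {
    var count: usize = 0;

    // Only the compact hot array is scanned; name and email
    // data never enter the cache.
    for (users.hot) |user| {
        if (user.age >= 18) {
            count += 1;
        }
    }

    return count;
}
```

A record's cold fields live at the same index in `cold`, so full lookups remain a single extra array access.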

## Case Study 5: Removing a Branch from a Hot Loop

Suppose you update active entities:

```zig
fn updateAll(entities: []Entity) void {
    for (entities) |*entity| {
        if (entity.active) {
            update(entity);
        }
    }
}
```

If active and inactive entities are mixed randomly, the branch may be unpredictable.

A better design stores active entities separately:

```zig
fn updateActive(active_entities: []Entity) void {
    for (active_entities) |*entity| {
        update(entity);
    }
}
```

Now the branch disappears.

This also improves locality because the loop only touches entities that need work.

The strongest branch optimization is often data organization.
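One way to get that organization is to partition in place: move active entities to the front once, then hand the active prefix to the hot loop. A sketch, assuming a hypothetical `Entity` with an `active` flag:

```zig
const std = @import("std");

// Hypothetical entity type; only the `active` flag matters here.
const Entity = struct { active: bool, value: u32 = 0 };

// Moves active entities to the front and returns how many there are.
fn partitionActive(entities: []Entity) usize {
    var active_count: usize = 0;
    for (entities, 0..) |entity, i| {
        if (entity.active) {
            std.mem.swap(Entity, &entities[active_count], &entities[i]);
            active_count += 1;
        }
    }
    return active_count;
}
```

After one pass, `updateActive(entities[0..partitionActive(entities)])` runs branch-free over a dense prefix.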

## Case Study 6: Batch Output

Printing inside a loop is often slow.

```zig
for (items) |item| {
    try stdout.print("item: {}\n", .{item.id});
}
```

Each print may trigger expensive output work, such as a write system call.

A better version buffers output:

```zig
var buffer = std.ArrayList(u8).init(allocator);
defer buffer.deinit();

try buffer.ensureTotalCapacity(items.len * 16);

for (items) |item| {
    try buffer.writer().print("item: {}\n", .{item.id});
}

try stdout.writeAll(buffer.items);
```

The program builds output in memory and writes it in one larger operation.

This reduces I/O overhead.

## Case Study 7: Stack Buffer Instead of Heap Allocation

Small fixed-size temporary memory often belongs on the stack.

Heap version:

```zig
fn formatId(allocator: std.mem.Allocator, id: u64) ![]u8 {
    return try std.fmt.allocPrint(allocator, "id={}", .{id});
}
```

Stack-buffer version:

```zig
fn formatId(buffer: []u8, id: u64) ![]u8 {
    return try std.fmt.bufPrint(buffer, "id={}", .{id});
}
```

Call it like this:

```zig
var buffer: [64]u8 = undefined;
const text = try formatId(buffer[0..], 1234);
```

No heap allocation is needed.

The caller owns the storage, and the function clearly says how memory is handled.

## Case Study 8: Arena for One Request

Suppose a server handles requests. Each request needs many temporary allocations.

Instead of freeing every object separately, use an arena per request.

```zig
fn handleRequest(parent_allocator: std.mem.Allocator, request: Request) !void {
    var arena = std.heap.ArenaAllocator.init(parent_allocator);
    defer arena.deinit();

    const allocator = arena.allocator();

    const parsed = try parseRequest(allocator, request.body);
    const result = try buildResponse(allocator, parsed);

    try sendResponse(result);
}
```

All temporary memory is freed together when the request ends.

This design is useful when many objects share the same lifetime.

The tradeoff is that memory is not released individually before the request finishes.
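When one thread serves many requests, the arena itself can be reused rather than rebuilt. Recent Zig versions provide `ArenaAllocator.reset`, which drops the allocations while optionally keeping the underlying memory. A sketch (the `handleRequestWith` helper is hypothetical, and the exact `reset` API may vary between Zig versions):

```zig
fn serve(parent_allocator: std.mem.Allocator, requests: []const Request) !void {
    var arena = std.heap.ArenaAllocator.init(parent_allocator);
    defer arena.deinit();

    for (requests) |request| {
        // Free every per-request allocation at once, but keep the pages,
        // so later requests rarely touch the parent allocator at all.
        defer _ = arena.reset(.retain_capacity);

        try handleRequestWith(arena.allocator(), request);
    }
}
```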

## Case Study 9: Compile-Time Configuration

Suppose logging is controlled by a runtime boolean:

```zig
fn process(debug: bool, value: i32) void {
    if (debug) {
        std.debug.print("value={}\n", .{value});
    }

    use(value);
}
```

If `debug` never changes during a build, make it compile-time:

```zig
fn process(comptime debug: bool, value: i32) void {
    if (debug) {
        std.debug.print("value={}\n", .{value});
    }

    use(value);
}
```

When called with `false`, the logging branch can disappear from generated code.

```zig
process(false, 42);
```

Compile-time knowledge can remove runtime cost.

## Case Study 10: Choosing the Right Representation

Suppose you represent a graph with heap-allocated nodes and pointers:

```zig
const Node = struct {
    value: u32,
    edges: []*Node,
};
```

This is flexible, but it may scatter memory across the heap.

A more compact representation uses arrays and indices:

```zig
const Edge = struct {
    to: usize,
};

const Node = struct {
    value: u32,
    first_edge: usize,
    edge_count: usize,
};
```

Edges live in one dense array.

```zig
const Graph = struct {
    nodes: []Node,
    edges: []Edge,
};
```

This representation can be faster for traversal because memory is contiguous.

It is also easier to serialize.

The tradeoff is that mutation may require more careful management.
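Traversal then becomes a scan over one contiguous range of the edge array. A sketch using the `Graph` layout above (`sumNeighbors` is a hypothetical example function):

```zig
const Edge = struct { to: usize };

const Node = struct {
    value: u32,
    first_edge: usize,
    edge_count: usize,
};

const Graph = struct {
    nodes: []Node,
    edges: []Edge,
};

// Visiting a node's neighbors is a linear scan over one dense slice,
// instead of chasing scattered heap pointers.
fn sumNeighbors(graph: Graph, node_index: usize) u32 {
    const node = graph.nodes[node_index];
    var total: u32 = 0;

    for (graph.edges[node.first_edge .. node.first_edge + node.edge_count]) |edge| {
        total += graph.nodes[edge.to].value;
    }

    return total;
}
```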

## How to Think Through a Case

Most performance improvements follow the same pattern.

First, identify the hot path.

Then ask what cost appears there:

| Cost | Typical Fix |
|---|---|
| Repeated allocation | Reuse buffers, reserve capacity, use arenas |
| Large copies | Use slices, pointers, or caller-provided output |
| Cache misses | Use contiguous data and compact hot fields |
| Branch misprediction | Group data or remove branches from hot loops |
| I/O overhead | Batch reads and writes |
| Runtime decisions | Move stable choices to `comptime` |

Do not apply every fix everywhere.

Apply the fix that matches the measured bottleneck.

## Final Rule

Performance is not one trick.

It is a habit of seeing cost.

In Zig, the most important costs are usually visible:

- Where does memory come from?
- How many times is allocation called?
- Is data copied or borrowed?
- Is data contiguous?
- Does this branch run millions of times?
- Can this decision happen at compile time?

Once you can answer those questions, Zig gives you the tools to improve the program directly.

