A Parallel File Scanner

This section combines the ideas from the previous sections into one program structure.

The program scans files in parallel.

Several worker threads receive file paths from a queue. Each worker opens a file, reads it, and updates shared statistics.

The design has four parts:

Component          Purpose
job queue          stores file paths
worker threads     process files
shared state       stores totals
synchronization    coordinates access

The shared state is small:

const State = struct {
    mutex: std.Thread.Mutex = .{},

    files_scanned: u64 = 0,
    bytes_read: u64 = 0,
};

The mutex protects the counters.

Workers update them through methods:

fn addFile(self: *State, bytes: u64) void {
    self.mutex.lock();
    defer self.mutex.unlock();

    self.files_scanned += 1;
    self.bytes_read += bytes;
}

The queue stores paths:

const JobQueue = struct {
    mutex: std.Thread.Mutex = .{},
    condition: std.Thread.Condition = .{},

    jobs: [128][]const u8 = undefined,

    head: usize = 0,
    tail: usize = 0,
    count: usize = 0,

    shutdown: bool = false,
};

The queue uses a ring buffer.

Workers remove paths from the queue:

fn pop(self: *JobQueue) ?[]const u8

The producer inserts paths:

fn push(self: *JobQueue, path: []const u8) void
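One way to fill in these two signatures, using the fields defined above (a sketch: this `push` assumes the buffer never fills; a complete version would also wait on a second condition while `count == jobs.len`):

```zig
fn pop(self: *JobQueue) ?[]const u8 {
    self.mutex.lock();
    defer self.mutex.unlock();

    // Sleep until a job arrives or shutdown is signaled.
    while (self.count == 0 and !self.shutdown) {
        self.condition.wait(&self.mutex);
    }

    // Shutdown was signaled and the queue is drained.
    if (self.count == 0) return null;

    const path = self.jobs[self.head];
    self.head = (self.head + 1) % self.jobs.len;
    self.count -= 1;
    return path;
}

fn push(self: *JobQueue, path: []const u8) void {
    self.mutex.lock();
    defer self.mutex.unlock();

    self.jobs[self.tail] = path;
    self.tail = (self.tail + 1) % self.jobs.len;
    self.count += 1;

    // Wake one waiting worker.
    self.condition.signal();
}
```

Note that `pop` keeps handing out remaining jobs even after `shutdown` is set; it only returns null once the queue is empty, so no pushed path is lost.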

Workers repeatedly:

  1. take a path
  2. open the file
  3. read the file
  4. update statistics

The worker loop looks like this:

fn worker(
    queue: *JobQueue,
    state: *State,
) void {
    while (true) {
        const path = queue.pop() orelse break;

        processFile(path, state);
    }
}

The worker exits when pop returns null.
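Starting the pool might look like this (a sketch; the worker count of 4 is arbitrary, and `queue` and `state` are the objects defined above):

```zig
var threads: [4]std.Thread = undefined;

// Spawn each worker with pointers to the shared queue and state.
for (&threads) |*t| {
    t.* = try std.Thread.spawn(.{}, worker, .{ &queue, &state });
}
```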

The file processing function does not hold the mutex while reading the file:

fn processFile(
    path: []const u8,
    state: *State,
) void {
    // Open the file; skip it on error.
    const file = std.fs.cwd().openFile(path, .{}) catch return;
    defer file.close();

    // Read the file in chunks, counting bytes.
    // No lock is held during this work.
    var buf: [4096]u8 = undefined;
    var size: u64 = 0;
    while (true) {
        const n = file.read(&buf) catch return;
        if (n == 0) break;
        size += n;
    }

    state.addFile(size);
}

Only the counter update is synchronized.

The expensive I/O happens outside the lock.

This is important.

If the mutex stayed locked during file reading, only one worker could read files at a time. Parallelism would disappear.
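For contrast, the anti-pattern looks like this (`processFileSerialized` is a hypothetical name, shown only to illustrate the mistake):

```zig
fn processFileSerialized(
    path: []const u8,
    state: *State,
) void {
    // BAD: the lock is held across all of the I/O,
    // so only one worker makes progress at a time.
    state.mutex.lock();
    defer state.mutex.unlock();

    const file = std.fs.cwd().openFile(path, .{}) catch return;
    defer file.close();

    const size = (file.stat() catch return).size;

    state.files_scanned += 1;
    state.bytes_read += size;
}
```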

The main thread usually acts as the producer.

It walks the filesystem and pushes paths into the queue:

while (try iterator.next()) |entry| {
    // If the iterator reuses its path buffer, the path must
    // be duplicated before it is pushed.
    queue.push(entry.path);
}

After all jobs are added:

queue.mutex.lock();
queue.shutdown = true;
queue.condition.broadcast();
queue.mutex.unlock();

This wakes all workers and tells them no more jobs will arrive.

Each worker eventually exits:

const path = queue.pop() orelse break;

The main thread joins all workers:

for (threads) |thread| thread.join();

Finally the totals are printed:

std.debug.print(
    "files = {d}, bytes = {d}\n",
    .{
        state.files_scanned,
        state.bytes_read,
    },
);

This design scales reasonably well because:

  • workers mostly operate independently
  • shared state is small
  • lock contention is low
  • expensive work happens outside locks

The queue is the synchronization point.

The statistics object is the aggregation point.

Everything else is local worker state.

This is a common concurrent structure:

producer -> queue -> workers -> shared results

The same model appears in:

  • search engine crawlers
  • compilers
  • build systems
  • backup tools
  • static analyzers
  • image processing pipelines

The details change, but the structure remains similar.

A good concurrent design usually has:

  • few shared objects
  • clear ownership
  • short lock durations
  • explicit shutdown rules
  • bounded queues
  • workers with independent local state

The threads themselves are not the important part.

The important part is how data moves through the system.

Exercise 18-35. Replace the fixed-size queue with a dynamically growing queue.

Exercise 18-36. Add a counter for failed file opens.

Exercise 18-37. Add recursive directory traversal.

Exercise 18-38. Add a second queue for completed results.

Exercise 18-39. Measure performance with one worker, two workers, and four workers.