A Parallel File Scanner

This section combines the ideas from the previous sections into one program structure.

The program scans files in parallel.

Several worker threads receive file paths from a queue. Each worker opens a file, reads it, and updates shared statistics.

The design has four parts:

Component          Purpose
job queue          stores file paths
worker threads     process files
shared state       stores totals
synchronization    coordinates access

The shared state is small:

const State = struct {
    mutex: std.Thread.Mutex = .{},

    files_scanned: u64 = 0,
    bytes_read: u64 = 0,
};

The mutex protects the counters.

Workers update them through methods:

fn addFile(self: *State, bytes: u64) void {
    self.mutex.lock();
    defer self.mutex.unlock();

    self.files_scanned += 1;
    self.bytes_read += bytes;
}

The queue stores paths:

const JobQueue = struct {
    mutex: std.Thread.Mutex = .{},
    condition: std.Thread.Condition = .{},

    jobs: [128][]const u8 = undefined,

    head: usize = 0,
    tail: usize = 0,
    count: usize = 0,

    shutdown: bool = false,
};

The queue uses a ring buffer.

Workers remove paths from the queue:

fn pop(self: *JobQueue) ?[]const u8

The producer inserts paths:

fn push(self: *JobQueue, path: []const u8) void
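One way to fill in these two signatures, using the fields defined above (a sketch: this `push` assumes the buffer never fills; a complete version would also wait on a second condition while `count == jobs.len`):

```zig
fn pop(self: *JobQueue) ?[]const u8 {
    self.mutex.lock();
    defer self.mutex.unlock();

    // Sleep until a job arrives or shutdown is signaled.
    while (self.count == 0 and !self.shutdown) {
        self.condition.wait(&self.mutex);
    }

    // Shutdown was signaled and the queue is drained.
    if (self.count == 0) return null;

    const path = self.jobs[self.head];
    self.head = (self.head + 1) % self.jobs.len;
    self.count -= 1;
    return path;
}

fn push(self: *JobQueue, path: []const u8) void {
    self.mutex.lock();
    defer self.mutex.unlock();

    self.jobs[self.tail] = path;
    self.tail = (self.tail + 1) % self.jobs.len;
    self.count += 1;

    // Wake one waiting worker.
    self.condition.signal();
}
```

Note that `pop` keeps handing out remaining jobs even after `shutdown` is set; it only returns null once the queue is empty, so no pushed path is lost.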

Workers repeatedly:

  1. take a path
  2. open the file
  3. read the file
  4. update statistics

The worker loop looks like this:

fn worker(
    queue: *JobQueue,
    state: *State,
) void {
    while (true) {
        const path = queue.pop() orelse break;

        processFile(path, state);
    }
}

The worker exits when pop returns null.
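Starting the pool might look like this (a sketch; the worker count of 4 is arbitrary, and `queue` and `state` are the objects defined above):

```zig
var threads: [4]std.Thread = undefined;

// Spawn each worker with pointers to the shared queue and state.
for (&threads) |*t| {
    t.* = try std.Thread.spawn(.{}, worker, .{ &queue, &state });
}
```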

The file processing function does not hold the mutex while reading the file:

fn processFile(
    path: []const u8,
    state: *State,
) void {
    // Open the file; skip it on error.
    const file = std.fs.cwd().openFile(path, .{}) catch return;
    defer file.close();

    // Read the file in chunks, counting bytes.
    // No lock is held during this work.
    var buf: [4096]u8 = undefined;
    var size: u64 = 0;
    while (true) {
        const n = file.read(&buf) catch return;
        if (n == 0) break;
        size += n;
    }

    state.addFile(size);
}

Only the counter update is synchronized.

The expensive I/O happens outside the lock.

This is important.

If the mutex stayed locked during file reading, only one worker could read files at a time. Parallelism would disappear.
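For contrast, the anti-pattern looks like this (`processFileSerialized` is a hypothetical name, shown only to illustrate the mistake):

```zig
fn processFileSerialized(
    path: []const u8,
    state: *State,
) void {
    // BAD: the lock is held across all of the I/O,
    // so only one worker makes progress at a time.
    state.mutex.lock();
    defer state.mutex.unlock();

    const file = std.fs.cwd().openFile(path, .{}) catch return;
    defer file.close();

    const size = (file.stat() catch return).size;

    state.files_scanned += 1;
    state.bytes_read += size;
}
```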

The main thread usually acts as the producer.

It walks the filesystem and pushes paths into the queue:

while (try iterator.next()) |entry| {
    // If the iterator reuses its path buffer, the path must
    // be duplicated before it is pushed.
    queue.push(entry.path);
}

After all jobs are added:

queue.mutex.lock();
queue.shutdown = true;
queue.condition.broadcast();
queue.mutex.unlock();

This wakes all workers and tells them no more jobs will arrive.

Each worker eventually exits:

const path = queue.pop() orelse break;

The main thread joins all workers:

for (threads) |thread| thread.join();

Finally the totals are printed:

std.debug.print(
    "files = {d}, bytes = {d}\n",
    .{
        state.files_scanned,
        state.bytes_read,
    },
);

This design scales reasonably well because:

  • workers mostly operate independently
  • shared state is small
  • lock contention is low
  • expensive work happens outside locks

The queue is the synchronization point.

The statistics object is the aggregation point.

Everything else is local worker state.

This is a common concurrent structure:

producer -> queue -> workers -> shared results

The same model appears in:

  • search engine crawlers
  • compilers
  • build systems
  • backup tools
  • static analyzers
  • image processing pipelines

The details change, but the structure remains similar.

A good concurrent design usually has:

  • few shared objects
  • clear ownership
  • short lock durations
  • explicit shutdown rules
  • bounded queues
  • workers with independent local state

The threads themselves are not the important part.

The important part is how data moves through the system.

Exercise 18-35. Replace the fixed-size queue with a dynamically growing queue.

Exercise 18-36. Add a counter for failed file opens.

Exercise 18-37. Add recursive directory traversal.

Exercise 18-38. Add a second queue for completed results.

Exercise 18-39. Measure performance with one worker, two workers, and four workers.