Thread Pools

Creating a thread is expensive.

A thread needs:

  • operating system resources
  • stack memory
  • scheduler state

Programs that create thousands of short-lived threads often spend more time managing threads than doing useful work.

A thread pool solves this problem.

A thread pool creates a fixed set of worker threads once. Work items are placed into a queue. Workers repeatedly take work from the queue and execute it.

The model looks like this:

jobs -> queue -> worker threads

The queue is shared.

Workers sleep while the queue is empty.

Workers wake when new jobs arrive.

Here is a small thread pool with a fixed-size queue.

const std = @import("std");

const Job = struct {
    value: u32,
};

const Queue = struct {
    mutex: std.Thread.Mutex = .{},
    condition: std.Thread.Condition = .{},

    jobs: [16]Job = undefined,
    head: usize = 0,
    tail: usize = 0,
    count: usize = 0,

    shutdown: bool = false,

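    // Add a job, blocking while the ring buffer is full.
    fn push(self: *Queue, job: Job) void {
        self.mutex.lock();
        defer self.mutex.unlock();

        // wait releases the mutex while sleeping and reacquires it
        // before returning. Re-check afterwards: the queue may still be full.
        while (self.count == self.jobs.len) {
            self.condition.wait(&self.mutex);
        }

        self.jobs[self.tail] = job;
        self.tail = (self.tail + 1) % self.jobs.len;
        self.count += 1;

        // Wake one thread waiting on the condition.
        self.condition.signal();
    }

    // Remove the oldest job, blocking while the queue is empty.
    // Returns null once shutdown is set and no jobs remain.
    fn pop(self: *Queue) ?Job {
        self.mutex.lock();
        defer self.mutex.unlock();

        while (self.count == 0 and !self.shutdown) {
            self.condition.wait(&self.mutex);
        }

        if (self.count == 0 and self.shutdown) {
            return null;
        }

        const job = self.jobs[self.head];

        self.head = (self.head + 1) % self.jobs.len;
        self.count -= 1;

        // A slot is now free: wake one thread waiting on the condition.
        self.condition.signal();

        return job;
    }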
};

fn worker(queue: *Queue, id: u32) void {
    while (true) {
        const job = queue.pop() orelse break;

        std.debug.print(
            "worker {d}: job {d}\n",
            .{ id, job.value },
        );
    }
}

pub fn main() !void {
    var queue = Queue{};

    const t1 = try std.Thread.spawn(
        .{},
        worker,
        .{ &queue, 1 },
    );

    const t2 = try std.Thread.spawn(
        .{},
        worker,
        .{ &queue, 2 },
    );

    var i: u32 = 1;

    while (i <= 10) : (i += 1) {
        queue.push(.{ .value = i });
    }

    queue.mutex.lock();
    queue.shutdown = true;
    queue.condition.broadcast();
    queue.mutex.unlock();

    t1.join();
    t2.join();
}

The queue stores jobs in a ring buffer:

jobs: [16]Job
head: usize
tail: usize
count: usize

head points to the next item to remove.

tail points to the next position to insert.

The queue wraps around using modulo arithmetic:

self.tail = (self.tail + 1) % self.jobs.len;
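
To see the wrap-around in isolation, here is a minimal single-threaded sketch of the same index arithmetic. The Ring type and its capacity of 4 are hypothetical, chosen only for illustration, and locking is left out on purpose.

```zig
const std = @import("std");

// Hypothetical single-threaded ring buffer, used only to show the index math.
const Ring = struct {
    items: [4]u32 = undefined,
    head: usize = 0, // next slot to remove from
    tail: usize = 0, // next slot to insert into
    count: usize = 0,

    // Returns false instead of blocking when the buffer is full.
    fn push(self: *Ring, value: u32) bool {
        if (self.count == self.items.len) return false;
        self.items[self.tail] = value;
        self.tail = (self.tail + 1) % self.items.len;
        self.count += 1;
        return true;
    }

    // Returns null when the buffer is empty.
    fn pop(self: *Ring) ?u32 {
        if (self.count == 0) return null;
        const value = self.items[self.head];
        self.head = (self.head + 1) % self.items.len;
        self.count -= 1;
        return value;
    }
};

pub fn main() void {
    var ring = Ring{};

    // Fill all four slots: tail advances 1, 2, 3, then wraps to 0.
    var i: u32 = 0;
    while (i < 4) : (i += 1) {
        _ = ring.push(i);
        std.debug.print("after push {d}: tail = {d}\n", .{ i, ring.tail });
    }

    // Free one slot, then insert again: the new item lands in slot 0.
    _ = ring.pop();
    _ = ring.push(99);
    std.debug.print(
        "head = {d}, tail = {d}, count = {d}\n",
        .{ ring.head, ring.tail, ring.count },
    );
}
```

The count field is what distinguishes a full buffer from an empty one. In both cases head equals tail, so the indices alone cannot tell them apart.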

The producer inserts jobs with push.

The workers remove jobs with pop.

If the queue is empty, workers wait:

while (self.count == 0 and !self.shutdown) {
    self.condition.wait(&self.mutex);
}

If the queue is full, producers wait:

while (self.count == self.jobs.len) {
    self.condition.wait(&self.mutex);
}

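Blocking on the condition variable keeps producers from overwriting unread jobs and lets idle threads sleep instead of spinning in a loop. Each wait sits inside a while loop because a thread can wake up without its condition being true, so it must check again before continuing.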
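
The two waits above share one condition variable. A common variation, sketched here as an alternative rather than as the chapter's method, gives each wait its own condition variable so that a wake-up always reaches a thread of the right kind. The names TwoCondQueue, not_empty, not_full, and consume are hypothetical, and shutdown handling is omitted to keep the sketch short.

```zig
const std = @import("std");

// Hypothetical variant of the chapter's Queue with one condition per wait.
const TwoCondQueue = struct {
    mutex: std.Thread.Mutex = .{},
    not_empty: std.Thread.Condition = .{}, // consumers wait here
    not_full: std.Thread.Condition = .{}, // producers wait here

    jobs: [4]u32 = undefined,
    head: usize = 0,
    tail: usize = 0,
    count: usize = 0,

    fn push(self: *TwoCondQueue, job: u32) void {
        self.mutex.lock();
        defer self.mutex.unlock();

        // Re-check in a loop: a wake-up does not guarantee free space.
        while (self.count == self.jobs.len) {
            self.not_full.wait(&self.mutex);
        }

        self.jobs[self.tail] = job;
        self.tail = (self.tail + 1) % self.jobs.len;
        self.count += 1;

        self.not_empty.signal(); // only consumers wait on not_empty
    }

    fn pop(self: *TwoCondQueue) u32 {
        self.mutex.lock();
        defer self.mutex.unlock();

        while (self.count == 0) {
            self.not_empty.wait(&self.mutex);
        }

        const job = self.jobs[self.head];
        self.head = (self.head + 1) % self.jobs.len;
        self.count -= 1;

        self.not_full.signal(); // only producers wait on not_full
        return job;
    }
};

// Pops n jobs and adds their values into sum.
fn consume(queue: *TwoCondQueue, n: u32, sum: *u32) void {
    var i: u32 = 0;
    while (i < n) : (i += 1) {
        sum.* += queue.pop();
    }
}

pub fn main() !void {
    var queue = TwoCondQueue{};
    var sum: u32 = 0;

    const consumer = try std.Thread.spawn(
        .{},
        consume,
        .{ &queue, @as(u32, 10), &sum },
    );

    // Ten jobs pass through four slots, so the producer may have to wait.
    var i: u32 = 1;
    while (i <= 10) : (i += 1) {
        queue.push(i);
    }

    consumer.join();
    std.debug.print("sum = {d}\n", .{sum});
}
```

The cost is one extra field. The benefit is that push never wakes another producer and pop never wakes another consumer.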

The shutdown flag tells workers to exit:

shutdown: bool = false

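When shutdown begins, main sets the flag and wakes the workers while holding the mutex:

queue.mutex.lock();
queue.shutdown = true;
queue.condition.broadcast();
queue.mutex.unlock();

Holding the mutex guarantees that no worker can test the flag, find it false, and then go to sleep after the broadcast has already been sent.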

broadcast wakes all waiting workers.

Each worker checks the shutdown condition:

if (self.count == 0 and self.shutdown) {
    return null;
}

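Returning null tells the worker loop to stop. Because the check also requires count == 0, workers finish any jobs still in the queue before they exit.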
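
The shutdown check can be exercised without any threads. The sketch below uses a trimmed copy of the queue that stores plain u32 values. Its push assumes the queue is never full, and close is a hypothetical helper that wraps the shutdown steps from main.

```zig
const std = @import("std");

// Trimmed copy of the chapter's Queue: just enough to exercise pop.
const Queue = struct {
    mutex: std.Thread.Mutex = .{},
    condition: std.Thread.Condition = .{},
    jobs: [16]u32 = undefined,
    head: usize = 0,
    tail: usize = 0,
    count: usize = 0,
    shutdown: bool = false,

    // Simplified push: assumes the queue is not full, so it never waits.
    fn push(self: *Queue, job: u32) void {
        self.mutex.lock();
        defer self.mutex.unlock();
        self.jobs[self.tail] = job;
        self.tail = (self.tail + 1) % self.jobs.len;
        self.count += 1;
        self.condition.signal();
    }

    fn pop(self: *Queue) ?u32 {
        self.mutex.lock();
        defer self.mutex.unlock();
        while (self.count == 0 and !self.shutdown) {
            self.condition.wait(&self.mutex);
        }
        if (self.count == 0 and self.shutdown) return null;
        const job = self.jobs[self.head];
        self.head = (self.head + 1) % self.jobs.len;
        self.count -= 1;
        self.condition.signal();
        return job;
    }

    // Hypothetical helper: set the flag and wake every waiting thread.
    fn close(self: *Queue) void {
        self.mutex.lock();
        defer self.mutex.unlock();
        self.shutdown = true;
        self.condition.broadcast();
    }
};

pub fn main() void {
    var queue = Queue{};
    queue.push(1);
    queue.push(2);
    queue.close();

    // Jobs queued before shutdown are still delivered.
    // pop returns null only once the queue is empty.
    while (queue.pop()) |job| {
        std.debug.print("drained job {d}\n", .{job});
    }
    std.debug.print("pop returned null\n", .{});
}
```

Neither call to pop blocks here: the first two find count greater than zero, and the third finds the queue empty with shutdown already set.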

The thread pool model is useful when:

  • many small jobs exist
  • jobs are independent
  • thread creation cost matters
  • work arrives continuously

Examples:

  • HTTP request handling
  • image processing
  • background indexing
  • file scanning
  • build systems

A pool should usually have a limited number of workers.

Too many workers can reduce performance because:

  • threads compete for CPU time
  • cache locality becomes worse
  • synchronization overhead increases

For CPU-bound work, the worker count is often close to the number of CPU cores.

For I/O-bound work, more workers may be reasonable because threads spend time waiting on I/O.
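
As a starting point for CPU-bound work, the core count can be queried at run time. This is a minimal sketch: workerCount is a hypothetical helper, and falling back to a single worker when the query fails is an assumption, not a rule.

```zig
const std = @import("std");

// Hypothetical helper: choose a worker count for CPU-bound jobs.
fn workerCount() usize {
    // getCpuCount returns an error on platforms where the count
    // cannot be determined; fall back to one worker in that case.
    return std.Thread.getCpuCount() catch 1;
}

pub fn main() void {
    std.debug.print("using {d} workers\n", .{workerCount()});
}
```

For I/O-bound work the same function could return a larger number, but the right value depends on how long each job waits and is best found by measurement.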

A thread pool is still explicit concurrency.

The queue is visible.

The synchronization is visible.

The worker lifetime is visible.

Nothing happens implicitly.

Exercise 18-21. Change the queue size from 16 to 64.

Exercise 18-22. Add a third worker thread.

Exercise 18-23. Add a second producer thread.

Exercise 18-24. Change the worker so it sleeps briefly after each job. Observe how the work distribution changes.

Exercise 18-25. Modify the job structure so each job contains two numbers. Make the worker print their sum.