Data Races

A data race happens when two or more threads access the same memory location concurrently, at least one of the accesses is a write, and nothing synchronizes them.

That means this combination is dangerous:

shared memory
multiple threads
at least one writer
no mutex, atomic, or other synchronization

Data races are one of the main reasons concurrent programs become unreliable.

A Small Race

This program has a race:

const std = @import("std");

var counter: u32 = 0;

fn worker() void {
    var i: u32 = 0;

    while (i < 1000) : (i += 1) {
        counter += 1;
    }
}

pub fn main() !void {
    const t1 = try std.Thread.spawn(.{}, worker, .{});
    const t2 = try std.Thread.spawn(.{}, worker, .{});

    t1.join();
    t2.join();

    std.debug.print("counter = {}\n", .{counter});
}

You may expect:

counter = 2000

But the program may print a smaller number.

The statement:

counter += 1;

is not a single indivisible operation. It reads the value, adds one, and writes the result back. If both threads read the same old value before either writes, one of the two increments is lost.
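To make the lost update concrete, here is a small single-threaded sketch that replays one unlucky interleaving by hand. The function name `lostUpdate` and the `a_reg` / `b_reg` variables are made up for illustration. They stand in for the temporary values each thread would hold between its read and its write.

```zig
const std = @import("std");

// Single-threaded replay of one possible interleaving of `counter += 1`
// executed by two threads. `lostUpdate`, `a_reg`, and `b_reg` are
// illustrative names, not part of the racy program above.
fn lostUpdate() u32 {
    var counter: u32 = 0;

    const a_reg = counter; // Thread A reads 0
    const b_reg = counter; // Thread B reads 0, before A has written back

    counter = a_reg + 1; // Thread A writes 1
    counter = b_reg + 1; // Thread B also writes 1, overwriting A's update

    return counter;
}

pub fn main() void {
    // Two increments ran, but only one is visible in the result.
    std.debug.print("counter = {}\n", .{lostUpdate()});
}
```

This prints counter = 1 even though two increments ran. In the real program the interleaving is chosen by the scheduler, so the loss happens only sometimes.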

Reads Can Race Too

A common beginner mistake is thinking only writes need protection.

This is also unsafe:

// Thread A
counter += 1;

// Thread B
std.debug.print("counter = {}\n", .{counter});

Thread B only reads, but Thread A writes at the same time. That is still a race.

The rule is:

If data is shared and mutable, every access must follow the same synchronization rule.

Fix with a Mutex

Use a mutex when the shared data belongs to a larger protected state.

const std = @import("std");

const Counter = struct {
    mutex: std.Thread.Mutex = .{},
    value: u32 = 0,

    fn increment(self: *Counter) void {
        self.mutex.lock();
        defer self.mutex.unlock();

        self.value += 1;
    }

    fn get(self: *Counter) u32 {
        self.mutex.lock();
        defer self.mutex.unlock();

        return self.value;
    }
};

fn worker(counter: *Counter) void {
    var i: u32 = 0;

    while (i < 1000) : (i += 1) {
        counter.increment();
    }
}

pub fn main() !void {
    var counter = Counter{};

    const t1 = try std.Thread.spawn(.{}, worker, .{&counter});
    const t2 = try std.Thread.spawn(.{}, worker, .{&counter});

    t1.join();
    t2.join();

    std.debug.print("counter = {}\n", .{counter.get()});
}

Now both writing and reading go through methods that lock the mutex.

Fix with an Atomic

Use an atomic when the shared value is small and independent.

const std = @import("std");

var counter = std.atomic.Value(u32).init(0);

fn worker() void {
    var i: u32 = 0;

    while (i < 1000) : (i += 1) {
        _ = counter.fetchAdd(1, .seq_cst);
    }
}

pub fn main() !void {
    const t1 = try std.Thread.spawn(.{}, worker, .{});
    const t2 = try std.Thread.spawn(.{}, worker, .{});

    t1.join();
    t2.join();

    std.debug.print("counter = {}\n", .{counter.load(.seq_cst)});
}

This is safe because fetchAdd performs the read, the add, and the write as one indivisible step. No other thread can slip in between them.

For one counter, an atomic is fine. For a structure with several related fields, use a mutex.

Data Race vs Logic Race

A data race is about unsafe memory access.

A logic race is about wrong ordering.

This code may be free of data races but still wrong:

const std = @import("std");

var ready = std.atomic.Value(bool).init(false);
var value = std.atomic.Value(u32).init(0);

fn producer() void {
    ready.store(true, .seq_cst);
    value.store(42, .seq_cst);
}

fn consumer() void {
    while (!ready.load(.seq_cst)) {}

    const x = value.load(.seq_cst);
    std.debug.print("value = {}\n", .{x});
}

There is no plain unsynchronized shared access. But the producer stores ready before storing value.

The consumer may see ready == true, then read the old value.

The ordering is wrong.

Better:

fn producer() void {
    value.store(42, .seq_cst);
    ready.store(true, .seq_cst);
}

Now ready means the value has already been stored.

Synchronization protects memory. It does not automatically make your protocol correct.
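The corrected ordering can be put into a complete program. This is a minimal sketch, assuming a Zig version that provides std.atomic.Value and the lowercase .seq_cst ordering used in the examples above. The `out` parameter and the `runHandoff` helper are hypothetical additions so the result can be returned instead of only printed.

```zig
const std = @import("std");

var ready = std.atomic.Value(bool).init(false);
var value = std.atomic.Value(u32).init(0);

fn producer() void {
    value.store(42, .seq_cst); // publish the data first
    ready.store(true, .seq_cst); // then raise the flag
}

fn consumer(out: *u32) void {
    // Spin until the flag is set. Acceptable for a demo, wasteful in real code.
    while (!ready.load(.seq_cst)) {
        std.atomic.spinLoopHint();
    }

    out.* = value.load(.seq_cst);
}

// Hypothetical helper: runs one producer and one consumer to completion
// and returns what the consumer saw.
fn runHandoff() !u32 {
    var seen: u32 = 0;

    const p = try std.Thread.spawn(.{}, producer, .{});
    defer p.join();

    const c = try std.Thread.spawn(.{}, consumer, .{&seen});
    c.join(); // after this, the consumer's write to `seen` is complete

    return seen;
}

pub fn main() !void {
    std.debug.print("value = {}\n", .{try runHandoff()});
}
```

Because the producer stores value before ready, a consumer that observes ready == true always reads 42. Swapping the two stores back would keep the program free of data races but would allow it to print 0.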

Shared Mutable State Is the Source

Most races come from shared mutable state.

This means:

more than one thread can reach the same data
and at least one thread can change it

There are three common ways to make this safe.

| Method | Idea |
|---|---|
| Ownership | Only one thread owns the data |
| Mutex | Threads share data, but lock before access |
| Atomic | Threads share one small value safely |

Ownership is often the simplest.

Instead of sharing one counter, give each thread its own counter, then combine the results after joining.

const std = @import("std");

fn count(result: *u32) void {
    var local: u32 = 0;

    var i: u32 = 0;
    while (i < 1000) : (i += 1) {
        local += 1;
    }

    result.* = local;
}

pub fn main() !void {
    var a: u32 = 0;
    var b: u32 = 0;

    const t1 = try std.Thread.spawn(.{}, count, .{&a});
    const t2 = try std.Thread.spawn(.{}, count, .{&b});

    t1.join();
    t2.join();

    const total = a + b;
    std.debug.print("total = {}\n", .{total});
}

Here, each thread writes to its own result. After both threads finish, the main thread reads both values.

No mutex is needed because there is no simultaneous access to the same variable.

Joining Creates a Boundary

After join returns, the worker thread has finished, and everything it wrote is visible to the thread that called join.

That means this is safe:

t1.join();
const result = a;

The main thread reads a only after the worker is done writing it.

Before join, reading the same value would be unsafe unless protected.

const result = a; // unsafe if worker may still write
t1.join();

Thread lifetime and memory access are connected.

Slices Can Hide Sharing

A slice does not own memory. It points to memory.

So two different slices may refer to the same array.

var buffer = [_]u8{ 0, 0, 0, 0 };

const left = buffer[0..2];
const also_left = buffer[0..2];

If two threads receive overlapping slices and one writes, they may race.

const t1 = try std.Thread.spawn(.{}, writeSlice, .{left});
const t2 = try std.Thread.spawn(.{}, writeSlice, .{also_left});

This is unsafe if both write to the same memory.

Separate slices are safe only if they do not overlap, or if access is synchronized.
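As a sketch of the non-overlapping case, the buffer can be split at a midpoint so that each thread gets its own half. The names `fillSlice` and `fillHalves` are hypothetical and chosen only for this example.

```zig
const std = @import("std");

// Hypothetical worker: overwrites every byte of its slice with `byte`.
fn fillSlice(part: []u8, byte: u8) void {
    for (part) |*b| {
        b.* = byte;
    }
}

// Splits `buffer` into two halves that share no bytes and fills each
// half on its own thread. No lock is needed because the halves do not overlap.
fn fillHalves(buffer: []u8) !void {
    const mid = buffer.len / 2;
    const left = buffer[0..mid];
    const right = buffer[mid..];

    const t1 = try std.Thread.spawn(.{}, fillSlice, .{ left, @as(u8, 1) });
    defer t1.join();

    const t2 = try std.Thread.spawn(.{}, fillSlice, .{ right, @as(u8, 2) });
    t2.join();
}

pub fn main() !void {
    var buffer = [_]u8{ 0, 0, 0, 0 };

    try fillHalves(&buffer);

    // Both threads have been joined, so reading the whole buffer is safe here.
    std.debug.print("{any}\n", .{buffer});
}
```

The split point is the only thing that keeps this safe. If the two ranges shared even one byte, the two writers would race on it.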

Global Variables Are Risky

Global mutable variables are easy to reach from many threads.

var global_state: u32 = 0;

That makes ownership unclear.

Prefer passing state explicitly:

fn worker(state: *State) void {
    // use state
}

This does not automatically make the program safe, but it makes sharing visible.

A reader can see that several threads receive the same pointer:

const t1 = try std.Thread.spawn(.{}, worker, .{&state});
const t2 = try std.Thread.spawn(.{}, worker, .{&state});

Visible sharing is easier to review.

Avoid Mixed Access

Do not sometimes use a lock and sometimes skip it.

Bad:

fn increment(self: *Counter) void {
    self.mutex.lock();
    defer self.mutex.unlock();

    self.value += 1;
}

fn reset(self: *Counter) void {
    self.value = 0; // no lock
}

The reset method breaks the rule.

Every access to value must use the same protection.

Better:

fn reset(self: *Counter) void {
    self.mutex.lock();
    defer self.mutex.unlock();

    self.value = 0;
}

A mutex is only useful if all code agrees to use it.

Keep Invariants Protected

A race can break relationships between fields.

Suppose this structure stores a queue:

const Queue = struct {
    items: []Job,
    len: usize,
};

You may have this invariant:

len is the number of valid items

If one thread updates items and len while another thread reads them without a lock, the reader may see the new len with the old items, or the other way around.

Use one mutex for the whole invariant:

const Queue = struct {
    mutex: std.Thread.Mutex = .{},
    items: []Job,
    len: usize,
};

The mutex protects the relationship, not just individual fields.
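One way this might look with methods is sketched below. The `Job` type, the fixed-size backing storage, and the `push` and `pop` methods are assumptions made for illustration. For brevity `pop` removes the most recently pushed job, so this behaves like a stack rather than a first-in, first-out queue.

```zig
const std = @import("std");

const Job = struct { id: u32 }; // hypothetical job type

const Queue = struct {
    mutex: std.Thread.Mutex = .{},
    items: []Job,
    len: usize = 0,

    // Returns false when the backing storage is full.
    fn push(self: *Queue, job: Job) bool {
        self.mutex.lock();
        defer self.mutex.unlock();

        if (self.len == self.items.len) return false;

        // items and len change together while the lock is held,
        // so no other thread can observe one without the other.
        self.items[self.len] = job;
        self.len += 1;
        return true;
    }

    // Returns null when there is nothing to take.
    fn pop(self: *Queue) ?Job {
        self.mutex.lock();
        defer self.mutex.unlock();

        if (self.len == 0) return null;

        self.len -= 1;
        return self.items[self.len];
    }
};

pub fn main() void {
    var storage: [4]Job = undefined;
    var queue = Queue{ .items = &storage };

    _ = queue.push(.{ .id = 1 });
    _ = queue.push(.{ .id = 2 });

    while (queue.pop()) |job| {
        std.debug.print("job {}\n", .{job.id});
    }
}
```

Each method takes the lock once and touches both fields before releasing it. Callers never see len and items separately, which is what keeps the invariant intact.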

Data Races Are Often Intermittent

A race may not fail every time.

The program may work on your machine today and fail tomorrow.

It may fail only with more CPU cores.

It may fail only in release mode.

It may fail only under load.

That is why “it worked once” means very little for concurrent code.

A race depends on timing, and timing changes constantly.

Simple Review Checklist

Before sharing data between threads, ask:

Who owns this data?
Can more than one thread reach it?
Can any thread write it?
What protects every access?
When does the data stop being used?

For every shared mutable value, you should be able to point to one clear rule:

This field is protected by this mutex.
This counter is atomic.
This buffer is owned by this worker until join.
This queue owns its own synchronization.

If there is no clear rule, assume the code is unsafe.

The Main Rule

A data race is not a performance issue. It is a correctness issue.

Do not fix races by adding sleeps. Do not rely on output order. Do not assume a small test proves safety.

Use ownership when possible.

Use mutexes for shared state.

Use atomics for small independent values.

Use condition variables and queues to coordinate work.
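As a rough sketch of coordination with a condition variable, here is a one-slot mailbox. It assumes a Zig version where std.Thread.Mutex and std.Thread.Condition are available. The `Mailbox` type, its `put` and `take` methods, and the `sender` function are hypothetical names for this example.

```zig
const std = @import("std");

// Hypothetical one-slot mailbox: the mutex guards `value`, and the
// condition variable lets the receiver sleep until a value arrives.
const Mailbox = struct {
    mutex: std.Thread.Mutex = .{},
    cond: std.Thread.Condition = .{},
    value: ?u32 = null,

    fn put(self: *Mailbox, v: u32) void {
        self.mutex.lock();
        defer self.mutex.unlock();

        self.value = v;
        self.cond.signal(); // wake one waiting receiver
    }

    fn take(self: *Mailbox) u32 {
        self.mutex.lock();
        defer self.mutex.unlock();

        // Re-check in a loop: wait() may return before a value is ready.
        while (self.value == null) {
            self.cond.wait(&self.mutex);
        }

        const v = self.value.?;
        self.value = null;
        return v;
    }
};

fn sender(box: *Mailbox) void {
    box.put(42);
}

pub fn main() !void {
    var box = Mailbox{};

    const t = try std.Thread.spawn(.{}, sender, .{&box});
    const got = box.take(); // sleeps until the sender has stored a value
    t.join();

    std.debug.print("got = {}\n", .{got});
}
```

The receiver sleeps instead of spinning, and the value is only ever touched while the mutex is held. This sketch holds a single value, so a second put before a take would overwrite the first.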

The best concurrent code reduces sharing first, then synchronizes the sharing that remains.