# Data Races

A data race happens when two or more threads access the same memory location concurrently, at least one of those accesses is a write, and nothing synchronizes them.

That means this combination is dangerous:

```text
shared memory
multiple threads
at least one writer
no mutex, atomic, or other synchronization
```

Data races are one of the main reasons concurrent programs become unreliable.

#### A Small Race

This program has a race:

```zig
const std = @import("std");

var counter: u32 = 0;

fn worker() void {
    var i: u32 = 0;

    while (i < 1000) : (i += 1) {
        counter += 1;
    }
}

pub fn main() !void {
    const t1 = try std.Thread.spawn(.{}, worker, .{});
    const t2 = try std.Thread.spawn(.{}, worker, .{});

    t1.join();
    t2.join();

    std.debug.print("counter = {}\n", .{counter});
}
```

You may expect:

```text
counter = 2000
```

But the program may print a smaller number.

The statement:

```zig
counter += 1;
```

is not a single indivisible operation. It loads the value, adds one, and stores the result back. If two threads load the same value before either one stores, both store the same result and one increment is lost.
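The load, add, store sequence can be replayed by hand to see where an increment goes missing. The sketch below is single-threaded on purpose, so the outcome is deterministic; `lostUpdate` and the `a_loaded` / `b_loaded` names are made up for illustration and only stand in for what two real threads might do.

```zig
const std = @import("std");

// Replays one unlucky interleaving of two `counter += 1` statements
// by hand, on a single thread, so the result is deterministic.
fn lostUpdate() u32 {
    var counter: u32 = 0;

    const a_loaded = counter; // "thread A" loads 0
    const b_loaded = counter; // "thread B" loads 0, before A stores

    counter = a_loaded + 1; // A stores 1
    counter = b_loaded + 1; // B stores 1, overwriting A's update

    return counter;
}

pub fn main() void {
    // Two increments ran, but the counter only reached 1.
    std.debug.print("after two increments: {}\n", .{lostUpdate()});
}
```

With real threads this interleaving is only one of many possible ones, which is why the racy program sometimes prints the expected total and sometimes does not.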

#### Reads Can Race Too

A common beginner mistake is thinking only writes need protection.

This is also unsafe:

```zig
// Thread A
counter += 1;

// Thread B
std.debug.print("counter = {}\n", .{counter});
```

Thread B only reads, but Thread A writes at the same time. That is still a race.

The rule is:

If data is shared and mutable, every access must follow the same synchronization rule.

#### Fix with a Mutex

Use a mutex when the shared data is part of a larger piece of state that must be protected as a unit.

```zig
const std = @import("std");

const Counter = struct {
    mutex: std.Thread.Mutex = .{},
    value: u32 = 0,

    fn increment(self: *Counter) void {
        self.mutex.lock();
        defer self.mutex.unlock();

        self.value += 1;
    }

    fn get(self: *Counter) u32 {
        self.mutex.lock();
        defer self.mutex.unlock();

        return self.value;
    }
};

fn worker(counter: *Counter) void {
    var i: u32 = 0;

    while (i < 1000) : (i += 1) {
        counter.increment();
    }
}

pub fn main() !void {
    var counter = Counter{};

    const t1 = try std.Thread.spawn(.{}, worker, .{&counter});
    const t2 = try std.Thread.spawn(.{}, worker, .{&counter});

    t1.join();
    t2.join();

    std.debug.print("counter = {}\n", .{counter.get()});
}
```

Now both writing and reading go through methods that lock the mutex.

#### Fix with an Atomic

Use an atomic when the shared value is small and independent.

```zig
const std = @import("std");

var counter = std.atomic.Value(u32).init(0);

fn worker() void {
    var i: u32 = 0;

    while (i < 1000) : (i += 1) {
        _ = counter.fetchAdd(1, .seq_cst);
    }
}

pub fn main() !void {
    const t1 = try std.Thread.spawn(.{}, worker, .{});
    const t2 = try std.Thread.spawn(.{}, worker, .{});

    t1.join();
    t2.join();

    std.debug.print("counter = {}\n", .{counter.load(.seq_cst)});
}
```

This is safe because `fetchAdd` performs the read, the add, and the write as one indivisible step, so no increment can be lost.

For one counter, an atomic is fine. For a structure with several related fields, use a mutex.
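As a rough illustration of the "several related fields" case, the sketch below keeps a running `sum` and `count` behind one mutex. The `Stats` type and its methods are hypothetical and not part of the standard library. Two separate atomics would not help here, because a reader could still observe a new `sum` paired with an old `count`.

```zig
const std = @import("std");

// Hypothetical example type: `sum` and `count` must change together,
// otherwise a reader could compute an average from mismatched values.
const Stats = struct {
    mutex: std.Thread.Mutex = .{},
    sum: u64 = 0,
    count: u64 = 0,

    fn add(self: *Stats, sample: u64) void {
        self.mutex.lock();
        defer self.mutex.unlock();

        self.sum += sample;
        self.count += 1;
    }

    // Returns null until at least one sample has been added.
    fn average(self: *Stats) ?u64 {
        self.mutex.lock();
        defer self.mutex.unlock();

        if (self.count == 0) return null;
        return self.sum / self.count;
    }
};

fn worker(stats: *Stats) void {
    var i: u32 = 0;
    while (i < 1000) : (i += 1) {
        stats.add(10);
    }
}

pub fn main() !void {
    var stats = Stats{};

    const t1 = try std.Thread.spawn(.{}, worker, .{&stats});
    const t2 = try std.Thread.spawn(.{}, worker, .{&stats});

    t1.join();
    t2.join();

    if (stats.average()) |avg| {
        std.debug.print("average = {}\n", .{avg});
    }
}
```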

#### Data Race vs Logic Race

A data race is about unsafe memory access.

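A logic race, often called a race condition, is about operations happening in the wrong order, even when every individual access is synchronized.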

This code may be free of data races but still wrong:

```zig
const std = @import("std");

var ready = std.atomic.Value(bool).init(false);
var value = std.atomic.Value(u32).init(0);

fn producer() void {
    ready.store(true, .seq_cst);
    value.store(42, .seq_cst);
}

fn consumer() void {
    while (!ready.load(.seq_cst)) {}

    const x = value.load(.seq_cst);
    std.debug.print("value = {}\n", .{x});
}
```

There is no plain unsynchronized shared access. But the producer stores `ready` before storing `value`.

The consumer may see `ready == true` and then still read the old value, `0`, because the producer has not stored `42` yet.

The ordering is wrong.

Better:

```zig
fn producer() void {
    value.store(42, .seq_cst);
    ready.store(true, .seq_cst);
}
```

Now `ready` means the value has already been stored.

Synchronization protects memory. It does not automatically make your protocol correct.
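The corrected ordering can be tried end to end with a small harness. This is a sketch, not a recommended design: `runOnce` is a hypothetical helper added for the example, and the busy-wait loop is kept only because the snippets above use one. `std.atomic.spinLoopHint()` is added as a courtesy to the CPU while spinning.

```zig
const std = @import("std");

var ready = std.atomic.Value(bool).init(false);
var value = std.atomic.Value(u32).init(0);

fn producer() void {
    value.store(42, .seq_cst); // store the data first
    ready.store(true, .seq_cst); // then announce it
}

fn consumer(out: *u32) void {
    // Spin until the flag is set; the hint marks this loop as a busy-wait.
    while (!ready.load(.seq_cst)) {
        std.atomic.spinLoopHint();
    }
    out.* = value.load(.seq_cst);
}

// Hypothetical helper: runs one producer and one consumer to completion.
fn runOnce() !u32 {
    var seen: u32 = 0;

    const p = try std.Thread.spawn(.{}, producer, .{});
    defer p.join();

    const c = try std.Thread.spawn(.{}, consumer, .{&seen});
    c.join();

    return seen; // safe to read: the consumer has been joined
}

pub fn main() !void {
    std.debug.print("value = {}\n", .{try runOnce()});
}
```

Because the flag is stored after the data, a consumer that sees `ready == true` always reads `42`.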

#### Shared Mutable State Is the Source

Most races come from shared mutable state.

This means:

```text
more than one thread can reach the same data
and at least one thread can change it
```

There are three common ways to make this safe.

| Method | Idea |
|---|---|
| Ownership | Only one thread owns the data |
| Mutex | Threads share data, but lock before access |
| Atomic | Threads share one small value safely |

Ownership is often the simplest.

Instead of sharing one counter, give each thread its own counter, then combine the results after joining.

```zig
const std = @import("std");

fn count(result: *u32) void {
    var local: u32 = 0;

    var i: u32 = 0;
    while (i < 1000) : (i += 1) {
        local += 1;
    }

    result.* = local;
}

pub fn main() !void {
    var a: u32 = 0;
    var b: u32 = 0;

    const t1 = try std.Thread.spawn(.{}, count, .{&a});
    const t2 = try std.Thread.spawn(.{}, count, .{&b});

    t1.join();
    t2.join();

    const total = a + b;
    std.debug.print("total = {}\n", .{total});
}
```

Here, each thread writes to its own result. After both threads finish, the main thread reads both values.

No mutex is needed because there is no simultaneous access to the same variable.

#### Joining Creates a Boundary

After `join` returns, the worker thread has finished, and everything it wrote is visible to the thread that joined it.

That means this is safe:

```zig
t1.join();
const result = a;
```

The main thread reads `a` only after the worker is done writing it.

Before `join`, reading the same value would be unsafe unless protected.

```zig
const result = a; // unsafe if worker may still write
t1.join();
```

Thread lifetime and memory access are connected.

#### Slices Can Hide Sharing

A slice does not own memory. It points to memory.

So two different slices may refer to the same array.

```zig
var buffer = [_]u8{ 0, 0, 0, 0 };

const left: []u8 = buffer[0..2];
const also_left: []u8 = buffer[0..2]; // same two bytes as `left`
```

If two threads receive overlapping slices and one writes, they may race.

```zig
const t1 = try std.Thread.spawn(.{}, writeSlice, .{left});
const t2 = try std.Thread.spawn(.{}, writeSlice, .{also_left});
```

This is unsafe if both write to the same memory.

Separate slices are safe only if they do not overlap, or if access is synchronized.
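A minimal sketch of the non-overlapping case, assuming two hypothetical helpers, `fill` and `fillHalves`: each worker receives its own half of one buffer, and the main thread reads the buffer only after both workers have been joined.

```zig
const std = @import("std");

// Fills every byte of the slice it was given with `byte`.
fn fill(part: []u8, byte: u8) void {
    for (part) |*b| b.* = byte;
}

// Hypothetical helper: splits one buffer into two disjoint halves
// and lets a separate thread write each half.
fn fillHalves(buffer: *[8]u8) !void {
    const left: []u8 = buffer[0..4];
    const right: []u8 = buffer[4..8];

    const t1 = try std.Thread.spawn(.{}, fill, .{ left, @as(u8, 1) });
    defer t1.join();

    const t2 = try std.Thread.spawn(.{}, fill, .{ right, @as(u8, 2) });
    t2.join();
}

pub fn main() !void {
    var buffer = [_]u8{0} ** 8;

    try fillHalves(&buffer);

    // Safe: both workers have been joined, so only main touches the buffer.
    std.debug.print("{any}\n", .{buffer});
}
```

If the ranges were changed to `buffer[0..5]` and `buffer[4..8]`, both workers would write index 4 and the program would have a data race again.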

#### Global Variables Are Risky

Global mutable variables are easy to reach from many threads.

```zig
var global_state: u32 = 0;
```

That makes ownership unclear.

Prefer passing state explicitly:

```zig
fn worker(state: *State) void {
    // use state
}
```

This does not automatically make the program safe, but it makes sharing visible.

A reader can see that several threads receive the same pointer:

```zig
const t1 = try std.Thread.spawn(.{}, worker, .{&state});
const t2 = try std.Thread.spawn(.{}, worker, .{&state});
```

Visible sharing is easier to review.

#### Avoid Mixed Access

Do not sometimes use a lock and sometimes skip it.

Bad:

```zig
fn increment(self: *Counter) void {
    self.mutex.lock();
    defer self.mutex.unlock();

    self.value += 1;
}

fn reset(self: *Counter) void {
    self.value = 0; // no lock
}
```

The `reset` method breaks the rule: it writes `value` without holding the mutex, so it can race with `increment`.

Every access to `value` must use the same protection.

Better:

```zig
fn reset(self: *Counter) void {
    self.mutex.lock();
    defer self.mutex.unlock();

    self.value = 0;
}
```

A mutex is only useful if all code agrees to use it.

#### Keep Invariants Protected

A race can break relationships between fields.

Suppose this structure stores a queue:

```zig
const Queue = struct {
    items: []Job,
    len: usize,
};
```

You may have this invariant:

```text
len is the number of valid items
```

If one thread changes `items` while another reads `len`, the reader may see a `len` that no longer matches the contents of `items`.

Use one mutex for the whole invariant:

```zig
const Queue = struct {
    mutex: std.Thread.Mutex = .{},
    items: []Job,
    len: usize,
};
```

The mutex protects the relationship, not just individual fields.
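One way to keep that relationship intact is to make every operation on the structure take the lock itself. The sketch below is an assumption-heavy illustration: `Job` is a hypothetical type, `push` and `pop` are invented method names, and `pop` returns the most recently pushed job (last in, first out) purely to keep the code short.

```zig
const std = @import("std");

// Hypothetical job type, used only for this sketch.
const Job = struct { id: u32 };

const Queue = struct {
    mutex: std.Thread.Mutex = .{},
    items: []Job,
    len: usize = 0,

    // Returns false when the backing storage is full.
    fn push(self: *Queue, job: Job) bool {
        self.mutex.lock();
        defer self.mutex.unlock();

        if (self.len == self.items.len) return false;
        self.items[self.len] = job; // write the slot...
        self.len += 1; // ...and update len under the same lock
        return true;
    }

    // Removes the most recently pushed job, or returns null when empty.
    fn pop(self: *Queue) ?Job {
        self.mutex.lock();
        defer self.mutex.unlock();

        if (self.len == 0) return null;
        self.len -= 1;
        return self.items[self.len];
    }
};

pub fn main() void {
    var storage: [4]Job = undefined;
    var queue = Queue{ .items = &storage };

    _ = queue.push(.{ .id = 1 });
    _ = queue.push(.{ .id = 2 });

    while (queue.pop()) |job| {
        std.debug.print("job {}\n", .{job.id});
    }
}
```

Because `items` and `len` are only ever touched inside `push` and `pop`, no thread can observe the moment between writing a slot and updating `len`.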

#### Data Races Are Often Intermittent

A race may not fail every time.

The program may work on your machine today and fail tomorrow.

It may fail only with more CPU cores.

It may fail only in release mode.

It may fail only under load.

That is why “it worked once” means very little for concurrent code.

A race depends on timing, and timing changes constantly.

#### Simple Review Checklist

Before sharing data between threads, ask:

```text
Who owns this data?
Can more than one thread reach it?
Can any thread write it?
What protects every access?
When does the data stop being used?
```

For every shared mutable value, you should be able to point to one clear rule:

```text
This field is protected by this mutex.
This counter is atomic.
This buffer is owned by this worker until join.
This queue owns its own synchronization.
```

If there is no clear rule, assume the code is unsafe.

#### The Main Rule

A data race is not a performance issue. It is a correctness issue.

Do not fix races by adding sleeps. Do not rely on output order. Do not assume a small test proves safety.

Use ownership when possible.

Use mutexes for shared state.

Use atomics for small independent values.

Use condition variables and queues to coordinate work.
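As a small sketch of coordinating with a condition variable instead of a sleep or a busy-wait, the example below uses `std.Thread.Condition` together with a mutex. The `Mailbox` type and its `put` / `take` methods are hypothetical names for this illustration. It holds a single value and assumes one producer that calls `put` once.

```zig
const std = @import("std");

// Hypothetical one-slot mailbox: the consumer sleeps until a value arrives.
const Mailbox = struct {
    mutex: std.Thread.Mutex = .{},
    cond: std.Thread.Condition = .{},
    value: ?u32 = null,

    fn put(self: *Mailbox, v: u32) void {
        self.mutex.lock();
        defer self.mutex.unlock();

        self.value = v;
        self.cond.signal(); // wake one waiting thread, if any
    }

    fn take(self: *Mailbox) u32 {
        self.mutex.lock();
        defer self.mutex.unlock();

        // Re-check in a loop: a waiter can wake up without a matching signal.
        while (self.value == null) {
            self.cond.wait(&self.mutex);
        }

        const v = self.value.?;
        self.value = null;
        return v;
    }
};

fn producer(box: *Mailbox) void {
    box.put(42);
}

pub fn main() !void {
    var box = Mailbox{};

    const t = try std.Thread.spawn(.{}, producer, .{&box});
    const got = box.take(); // blocks until the producer has called put
    t.join();

    std.debug.print("got {}\n", .{got});
}
```

The waiting thread releases the mutex while it sleeps inside `wait` and holds it again when `wait` returns, so `value` is only ever read or written under the lock.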

The best concurrent code reduces sharing first, then synchronizes the sharing that remains.

