# Parsing Numbers and Text

Parsing means turning text into data.

For example, this text:

```text
123
```

can become the integer value:

```zig
123
```

This text:

```text
3.14
```

can become a floating point value:

```zig
3.14
```

This text:

```text
true
```

can become a boolean value:

```zig
true
```

Programs parse text all the time. Command-line arguments and environment variables are text. Configuration files, logs, CSV files, JSON files, and source code are text formats, and many network protocols are text-based.

Zig keeps parsing explicit. Text does not automatically become a number. You ask for a specific type, and parsing can fail.

#### Parsing Integers

Use `std.fmt.parseInt` to parse an integer.

```zig
const std = @import("std");

pub fn main() !void {
    const text = "123";

    const value = try std.fmt.parseInt(u32, text, 10);

    std.debug.print("value = {}\n", .{value});
}
```

Output:

```text
value = 123
```

This call has three important parts:

```zig
std.fmt.parseInt(u32, text, 10)
```

`u32` is the integer type you want.

`text` is the input byte slice.

`10` is the base, also called the radix.

Base 10 means normal decimal numbers.

#### The Result Type Matters

This parses into `u8`:

```zig
const value = try std.fmt.parseInt(u8, "255", 10);
```

That works because `255` fits inside `u8`.

This fails with `error.Overflow`:

```zig
const value = try std.fmt.parseInt(u8, "256", 10);
```

A `u8` can store values from `0` to `255`. The text `"256"` is too large.

Parsing checks this. Zig does not silently wrap the number.

#### Signed and Unsigned Integers

Unsigned integer types cannot store negative values.

```zig
const value = try std.fmt.parseInt(u32, "-1", 10);
```

This fails with `error.Overflow` because `u32` cannot represent `-1`.

Use a signed type when negative values are valid:

```zig
const value = try std.fmt.parseInt(i32, "-1", 10);
```

Now parsing succeeds.

Choose the type based on the meaning of the value.

A count might be `usize`.

An ID might be `u64`.

A temperature might be `i32`.

A small byte value might be `u8`.

#### Different Bases

The third argument to `parseInt` is the base.

Decimal:

```zig
const a = try std.fmt.parseInt(u32, "255", 10);
```

Hexadecimal:

```zig
const b = try std.fmt.parseInt(u32, "ff", 16);
```

Binary:

```zig
const c = try std.fmt.parseInt(u32, "11111111", 2);
```

All three produce the number `255`.

This is useful when working with file formats, byte flags, colors, permissions, and machine-level data.

Example:

```zig
const std = @import("std");

pub fn main() !void {
    const dec = try std.fmt.parseInt(u32, "255", 10);
    const hex = try std.fmt.parseInt(u32, "ff", 16);
    const bin = try std.fmt.parseInt(u32, "11111111", 2);

    std.debug.print("{} {} {}\n", .{ dec, hex, bin });
}
```

Output:

```text
255 255 255
```
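As a side note, `parseInt` also accepts a base of `0`. In that mode it picks the base from a `0x`, `0o`, or `0b` prefix and falls back to decimal when there is no prefix. The sketch below wraps that behavior in a hypothetical helper named `parseAutoBase`; the name is only for illustration.

```zig
const std = @import("std");

// Hypothetical helper: base 0 asks parseInt to detect the base from the prefix.
fn parseAutoBase(text: []const u8) !u32 {
    return std.fmt.parseInt(u32, text, 0);
}

pub fn main() !void {
    // The same number written in decimal, hexadecimal, octal, and binary.
    const inputs = [_][]const u8{ "255", "0xff", "0o377", "0b11111111" };

    for (inputs) |text| {
        const value = try parseAutoBase(text);
        std.debug.print("{s} -> {}\n", .{ text, value });
    }
}
```

With an explicit base such as `16`, the prefix is not part of the number, so `"0xff"` is rejected and only `"ff"` parses. Accepting prefixes is a policy decision, so use base `0` only when your format allows them.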

#### Handling Parse Errors

Parsing can fail.

The text might contain invalid characters:

```text
12x3
```

The number might be too large:

```text
999999999999999999999999999999
```

The text might be empty:

```text

```

You can handle errors with `catch`.

```zig
const std = @import("std");

pub fn main() void {
    const text = "12x3";

    const value = std.fmt.parseInt(u32, text, 10) catch |err| {
        std.debug.print("could not parse integer: {}\n", .{err});
        return;
    };

    std.debug.print("value = {}\n", .{value});
}
```

Output:

```text
could not parse integer: error.InvalidCharacter
```

In real tools, you often print a clearer message:

```zig
std.debug.print("expected a decimal number, got {s}\n", .{text});
```

The raw error is useful for debugging. A human message is better for users.
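One way to produce those human messages is to switch on the error value. `parseInt` can return `error.InvalidCharacter` or `error.Overflow`, so the switch below covers both. The helper name `describeParseError` and the message wording are assumptions made for this sketch.

```zig
const std = @import("std");

// Hypothetical helper: maps each parseInt error to a message for users.
fn describeParseError(err: std.fmt.ParseIntError) []const u8 {
    return switch (err) {
        error.InvalidCharacter => "expected a decimal number",
        error.Overflow => "number is out of range for this field",
    };
}

pub fn main() void {
    // Invalid character, too large for u32, empty, and valid.
    const inputs = [_][]const u8{ "12x3", "999999999999999999999999999999", "", "42" };

    for (inputs) |text| {
        if (std.fmt.parseInt(u32, text, 10)) |value| {
            std.debug.print("\"{s}\": ok, value = {}\n", .{ text, value });
        } else |err| {
            std.debug.print("\"{s}\": {s}\n", .{ text, describeParseError(err) });
        }
    }
}
```

Because the switch has no `else` branch, the compiler will report an error if the error set ever gains a case this function does not handle.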

#### Parsing Floating Point Numbers

Use `std.fmt.parseFloat` for floating point values.

```zig
const std = @import("std");

pub fn main() !void {
    const text = "3.14";

    const value = try std.fmt.parseFloat(f64, text);

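    // {d} requests decimal notation; in some Zig releases a plain {} prints floats in scientific notation.
    std.debug.print("value = {d}\n", .{value});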
}
```

Output:

```text
value = 3.14
```

The first argument is the float type:

```zig
f32
```

or:

```zig
f64
```

Use `f64` by default unless you have a reason to use `f32`.
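`parseFloat` accepts more than plain decimals. It also parses scientific notation such as `1e3`, and the special values `inf` and `nan`. If your format should not allow those special values, you can check the result after parsing. The following is a minimal sketch; the helper name `parseFiniteFloat` and the error name `NotFinite` are assumptions for illustration.

```zig
const std = @import("std");

// Hypothetical helper: parses a float but rejects inf and nan,
// which parseFloat itself accepts as valid input.
fn parseFiniteFloat(text: []const u8) !f64 {
    const value = try std.fmt.parseFloat(f64, text);
    if (!std.math.isFinite(value)) return error.NotFinite;
    return value;
}

pub fn main() !void {
    const a = try parseFiniteFloat("1e3");
    const b = try parseFiniteFloat("-0.5");

    std.debug.print("{d} {d}\n", .{ a, b });
}
```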

#### Parsing Text with Spaces

`parseInt` expects the input slice to contain only the number, with no surrounding whitespace.

This fails with `error.InvalidCharacter`:

```zig
const value = try std.fmt.parseInt(u32, " 123 ", 10);
```

The spaces are part of the input.

Trim the text first:

```zig
const std = @import("std");

pub fn main() !void {
    const raw = " 123 \n";

    const text = std.mem.trim(u8, raw, " \n\r\t");

    const value = try std.fmt.parseInt(u32, text, 10);

    std.debug.print("value = {}\n", .{value});
}
```

`std.mem.trim` removes matching bytes from both ends.

This call:

```zig
std.mem.trim(u8, raw, " \n\r\t")
```

removes spaces, newlines, carriage returns, and tabs.
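If you would rather not spell out the whitespace set each time, the standard library provides `std.ascii.whitespace`, an array of the ASCII whitespace bytes. The sketch below combines trimming and parsing in one hypothetical helper, `parseTrimmedU32`.

```zig
const std = @import("std");

// Hypothetical helper: trims ASCII whitespace from both ends,
// then parses what is left as a decimal u32.
fn parseTrimmedU32(raw: []const u8) !u32 {
    const text = std.mem.trim(u8, raw, &std.ascii.whitespace);
    return std.fmt.parseInt(u32, text, 10);
}

pub fn main() !void {
    const value = try parseTrimmedU32(" 123 \n");

    std.debug.print("value = {}\n", .{value});
}
```

Trimming only touches the ends of the slice. Input such as `"1 23"` still fails, because the space in the middle remains part of the text.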

#### Splitting Text

Many text formats contain separators.

This line:

```text
alice,30
```

has two fields separated by a comma.

You can split it:

```zig
const std = @import("std");

pub fn main() !void {
    const line = "alice,30";

    var it = std.mem.splitScalar(u8, line, ',');

    const name = it.next() orelse return error.InvalidInput;
    const age_text = it.next() orelse return error.InvalidInput;

    const age = try std.fmt.parseInt(u32, age_text, 10);

    std.debug.print("name={s}, age={}\n", .{ name, age });
}
```

Output:

```text
name=alice, age=30
```

The iterator gives one part at a time.

First call:

```zig
it.next()
```

returns `"alice"`.

Second call returns `"30"`.

If the line has no comma, the second call returns `null` and we return an error.

#### Checking for Extra Fields

The input:

```text
alice,30,admin
```

has an extra field.

Sometimes that should be an error.

```zig
if (it.next() != null) {
    return error.InvalidInput;
}
```

Now the parser accepts exactly two fields.

Complete example:

```zig
const std = @import("std");

pub fn main() !void {
    const line = "alice,30";

    var it = std.mem.splitScalar(u8, line, ',');

    const name = it.next() orelse return error.InvalidInput;
    const age_text = it.next() orelse return error.InvalidInput;

    if (it.next() != null) {
        return error.InvalidInput;
    }

    const age = try std.fmt.parseInt(u32, age_text, 10);

    std.debug.print("name={s}, age={}\n", .{ name, age });
}
```

This habit matters. A parser should define what it accepts and what it rejects.
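To make the accept and reject rules easy to check, it can help to return the parsed data instead of printing it. Here is a minimal sketch of that idea. The `Record` type, the `parseRecord` name, and the rule that an empty name is invalid are all assumptions added for illustration.

```zig
const std = @import("std");

// Hypothetical record type for the "name,age" format used above.
const Record = struct {
    name: []const u8,
    age: u32,
};

fn parseRecord(line: []const u8) !Record {
    var it = std.mem.splitScalar(u8, line, ',');

    const name = it.next() orelse return error.InvalidInput;
    const age_text = it.next() orelse return error.InvalidInput;
    if (it.next() != null) return error.InvalidInput;

    // Assumption for this sketch: an empty name is not a valid record.
    if (name.len == 0) return error.InvalidInput;

    return Record{
        .name = name,
        .age = try std.fmt.parseInt(u32, age_text, 10),
    };
}

pub fn main() void {
    const inputs = [_][]const u8{ "alice,30", "alice", "alice,30,admin", ",30", "alice,thirty" };

    for (inputs) |line| {
        if (parseRecord(line)) |record| {
            std.debug.print("accepted: name={s}, age={}\n", .{ record.name, record.age });
        } else |err| {
            std.debug.print("rejected \"{s}\": {s}\n", .{ line, @errorName(err) });
        }
    }
}
```

Only the first input is accepted. Each of the others breaks one rule: too few fields, too many fields, an empty name, or an age that is not a number.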

#### Parsing Lines

A file often contains many lines.

```text
alice,30
bob,25
charlie,40
```

You can split by newline first, then parse each line.

```zig
const std = @import("std");

fn parseLine(line: []const u8) !void {
    var it = std.mem.splitScalar(u8, line, ',');

    const name = it.next() orelse return error.InvalidInput;
    const age_text = it.next() orelse return error.InvalidInput;

    if (it.next() != null) {
        return error.InvalidInput;
    }

    const age = try std.fmt.parseInt(u32, age_text, 10);

    std.debug.print("name={s}, age={}\n", .{ name, age });
}

pub fn main() !void {
    const text =
        \\alice,30
        \\bob,25
        \\charlie,40
    ;

    var lines = std.mem.splitScalar(u8, text, '\n');

    while (lines.next()) |line| {
        if (line.len == 0) continue;
        try parseLine(line);
    }
}
```

Output:

```text
name=alice, age=30
name=bob, age=25
name=charlie, age=40
```

This example uses a multiline string:

```zig
const text =
    \\alice,30
    \\bob,25
    \\charlie,40
;
```

Each line of the literal begins with `\\`. A newline separates consecutive lines, but none follows the last one.
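Text read from a real file may contain blank lines or Windows-style `\r\n` line endings. One possible way to handle both is `std.mem.tokenizeScalar`, which skips empty tokens, combined with trimming the leftover `\r`. The sketch below assumes a hypothetical helper, `sumAges`, that adds up the age column instead of printing each line.

```zig
const std = @import("std");

// Hypothetical helper: sums the age column of "name,age" lines.
fn sumAges(text: []const u8) !u32 {
    var total: u32 = 0;

    // tokenizeScalar skips empty tokens, so completely empty lines are never returned.
    var lines = std.mem.tokenizeScalar(u8, text, '\n');

    while (lines.next()) |raw_line| {
        // Drop the '\r' left behind by "\r\n" line endings.
        const line = std.mem.trim(u8, raw_line, "\r");
        if (line.len == 0) continue;

        var fields = std.mem.splitScalar(u8, line, ',');
        _ = fields.next() orelse return error.InvalidInput; // name, not needed here
        const age_text = fields.next() orelse return error.InvalidInput;
        if (fields.next() != null) return error.InvalidInput;

        const age = try std.fmt.parseInt(u32, age_text, 10);

        // std.math.add returns error.Overflow instead of overflowing the sum.
        total = try std.math.add(u32, total, age);
    }

    return total;
}

pub fn main() !void {
    const text = "alice,30\r\n\r\nbob,25\ncharlie,40\n";

    std.debug.print("total age = {}\n", .{try sumAges(text)});
}
```

The fields inside a line still use `splitScalar`. There an empty field should stay visible to the parser, so that `alice,,30` counts as three fields and is rejected.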

#### Trimming Each Field

Real input often has spaces:

```text
alice, 30
```

The age field is:

```text
 30
```

Trim it before parsing:

```zig
const age_clean = std.mem.trim(u8, age_text, " \t\r\n");
const age = try std.fmt.parseInt(u32, age_clean, 10);
```

You may also trim the name:

```zig
const name_clean = std.mem.trim(u8, name, " \t\r\n");
```

Full parser:

```zig
const std = @import("std");

fn parseLine(line: []const u8) !void {
    var it = std.mem.splitScalar(u8, line, ',');

    const raw_name = it.next() orelse return error.InvalidInput;
    const raw_age = it.next() orelse return error.InvalidInput;

    if (it.next() != null) {
        return error.InvalidInput;
    }

    const name = std.mem.trim(u8, raw_name, " \t\r\n");
    const age_text = std.mem.trim(u8, raw_age, " \t\r\n");

    const age = try std.fmt.parseInt(u32, age_text, 10);

    std.debug.print("name={s}, age={}\n", .{ name, age });
}

pub fn main() !void {
    try parseLine("alice, 30");
}
```

#### Parsing Booleans

The standard library has helpers for many kinds of parsing, but boolean parsing is simple enough to write directly.

```zig
const std = @import("std");

fn parseBool(text: []const u8) !bool {
    if (std.mem.eql(u8, text, "true")) return true;
    if (std.mem.eql(u8, text, "false")) return false;
    return error.InvalidBoolean;
}

pub fn main() !void {
    const value = try parseBool("true");

    std.debug.print("{}\n", .{value});
}
```

This function accepts exactly:

```text
true
```

and:

```text
false
```

It rejects everything else.

That is good parser design. Be clear about accepted input.

#### Parsing Key-Value Text

Many simple config formats look like this:

```text
host=localhost
port=8080
debug=true
```

A line parser can split on `=`.

```zig
const std = @import("std");

fn parseBool(text: []const u8) !bool {
    if (std.mem.eql(u8, text, "true")) return true;
    if (std.mem.eql(u8, text, "false")) return false;
    return error.InvalidBoolean;
}

pub fn main() !void {
    const line = "port=8080";

    var it = std.mem.splitScalar(u8, line, '=');

    const key = it.next() orelse return error.InvalidInput;
    const value_text = it.next() orelse return error.InvalidInput;

    if (it.next() != null) {
        return error.InvalidInput;
    }

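    if (std.mem.eql(u8, key, "port")) {
        const port = try std.fmt.parseInt(u16, value_text, 10);
        std.debug.print("port = {}\n", .{port});
    } else if (std.mem.eql(u8, key, "debug")) {
        // parseBool is the helper defined above.
        const debug = try parseBool(value_text);
        std.debug.print("debug = {}\n", .{debug});
    }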
}
```

This is not a full configuration parser. It is the beginning of one.

The core idea is simple:

1. Split the text.
2. Validate the number of fields.
3. Trim fields if needed.
4. Compare keys.
5. Parse values into specific types.
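Splitting on every `=` and rejecting a third part means a value can never contain `=`. If your format needs that, one alternative policy is to split on the first `=` only. The sketch below uses `std.mem.indexOfScalar` for this; the `KeyValue` type and the `splitKeyValue` name are assumptions for illustration.

```zig
const std = @import("std");

// Hypothetical pair type for one "key=value" line.
const KeyValue = struct {
    key: []const u8,
    value: []const u8,
};

// Splits on the first '=' only, so the value may itself contain '='.
fn splitKeyValue(line: []const u8) !KeyValue {
    const eq = std.mem.indexOfScalar(u8, line, '=') orelse return error.InvalidInput;

    const key = std.mem.trim(u8, line[0..eq], " \t");
    const value = std.mem.trim(u8, line[eq + 1 ..], " \t");

    // Assumption for this sketch: a line must have a non-empty key.
    if (key.len == 0) return error.InvalidInput;

    return KeyValue{ .key = key, .value = value };
}

pub fn main() !void {
    const kv = try splitKeyValue("url = http://localhost/?a=b");

    std.debug.print("key={s} value={s}\n", .{ kv.key, kv.value });
}
```

Whichever policy you choose, write it down. Rejecting a second `=` and allowing it inside values are both reasonable, but they are different formats.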

#### Case Sensitivity

`std.mem.eql` compares bytes exactly, so the comparison is case-sensitive.

```zig
std.mem.eql(u8, "true", "true")
```

is true.

```zig
std.mem.eql(u8, "true", "True")
```

is false.

That may be exactly what you want. Strict formats are easier to test and document.

If you want to accept multiple spellings, write that policy explicitly:

```zig
if (std.mem.eql(u8, text, "true")) return true;
if (std.mem.eql(u8, text, "True")) return true;
if (std.mem.eql(u8, text, "1")) return true;
```

Do not let accepted input grow accidentally. A parser is part of your program’s interface.
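If the policy really is "any letter case", listing every spelling by hand gets tedious. A possible alternative is `std.ascii.eqlIgnoreCase`, which compares two slices while ignoring ASCII letter case. The helper below, `parseBoolLoose`, is a hypothetical variant written for this sketch, and its accepted set is an assumption: `true` and `false` in any case, plus `1` and `0`.

```zig
const std = @import("std");

// Hypothetical variant of parseBool with an explicit, documented policy:
// "true"/"false" in any ASCII letter case, plus "1" and "0".
fn parseBoolLoose(text: []const u8) !bool {
    if (std.ascii.eqlIgnoreCase(text, "true")) return true;
    if (std.ascii.eqlIgnoreCase(text, "false")) return false;
    if (std.mem.eql(u8, text, "1")) return true;
    if (std.mem.eql(u8, text, "0")) return false;
    return error.InvalidBoolean;
}

pub fn main() !void {
    const a = try parseBoolLoose("TRUE");
    const b = try parseBoolLoose("0");

    std.debug.print("{} {}\n", .{ a, b });
}
```

The policy still lives in one place, so anything outside it, such as `yes` or `on`, is rejected rather than guessed at.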

#### Avoid Silent Defaults

A tempting mistake is to return a default value when parsing fails.

Bad idea:

```zig
fn parsePort(text: []const u8) u16 {
    return std.fmt.parseInt(u16, text, 10) catch 8080;
}
```

This hides bad input.

If the user writes:

```text
port=eighty
```

the program silently uses `8080`.

That can be dangerous.

Prefer returning an error:

```zig
fn parsePort(text: []const u8) !u16 {
    return std.fmt.parseInt(u16, text, 10);
}
```

Then the caller can decide what to do.

Defaults are fine when they are intentional, but do not use defaults to hide parse failures.
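One way to keep a default intentional is to separate "the setting is missing" from "the setting is present but malformed". The sketch below models a missing setting as `null`. The helper name `portOrDefault` and the constant `default_port` are assumptions for illustration.

```zig
const std = @import("std");

const default_port: u16 = 8080;

// Hypothetical helper: a missing setting uses the default,
// but a setting that is present and malformed is still an error.
fn portOrDefault(maybe_text: ?[]const u8) !u16 {
    const text = maybe_text orelse return default_port;
    return std.fmt.parseInt(u16, text, 10);
}

pub fn main() !void {
    const a = try portOrDefault(null);
    const b = try portOrDefault("9000");

    std.debug.print("{} {}\n", .{ a, b });
}
```

With this shape, `port=eighty` still fails loudly, while a configuration that never mentions the port gets `8080` on purpose.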

#### A Small Complete Parser

Here is a small parser for this input:

```text
name=alice
age=30
debug=true
```

It fills this struct:

```zig
const Config = struct {
    name: []const u8,
    age: u32,
    debug: bool,
};
```

Full code:

```zig
const std = @import("std");

const Config = struct {
    name: []const u8,
    age: u32,
    debug: bool,
};

fn parseBool(text: []const u8) !bool {
    if (std.mem.eql(u8, text, "true")) return true;
    if (std.mem.eql(u8, text, "false")) return false;
    return error.InvalidBoolean;
}

fn parseConfig(text: []const u8) !Config {
    var config = Config{
        .name = "",
        .age = 0,
        .debug = false,
    };

    var lines = std.mem.splitScalar(u8, text, '\n');

    while (lines.next()) |raw_line| {
        const line = std.mem.trim(u8, raw_line, " \t\r\n");
        if (line.len == 0) continue;

        var parts = std.mem.splitScalar(u8, line, '=');

        const raw_key = parts.next() orelse return error.InvalidInput;
        const raw_value = parts.next() orelse return error.InvalidInput;

        if (parts.next() != null) {
            return error.InvalidInput;
        }

        const key = std.mem.trim(u8, raw_key, " \t\r\n");
        const value = std.mem.trim(u8, raw_value, " \t\r\n");

        if (std.mem.eql(u8, key, "name")) {
            config.name = value;
        } else if (std.mem.eql(u8, key, "age")) {
            config.age = try std.fmt.parseInt(u32, value, 10);
        } else if (std.mem.eql(u8, key, "debug")) {
            config.debug = try parseBool(value);
        } else {
            return error.UnknownKey;
        }
    }

    return config;
}

pub fn main() !void {
    const text =
        \\name=alice
        \\age=30
        \\debug=true
    ;

    const config = try parseConfig(text);

    std.debug.print("name={s}\n", .{config.name});
    std.debug.print("age={}\n", .{config.age});
    std.debug.print("debug={}\n", .{config.debug});
}
```

Output:

```text
name=alice
age=30
debug=true
```

This parser is small, but it shows the right habits.

It trims whitespace.

It rejects malformed lines.

It rejects unknown keys.

It parses values into specific types.

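It returns errors instead of guessing, with one exception: a key that never appears in the input keeps its initial value (`""`, `0`, or `false`).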
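The parser above does not notice missing or repeated keys: a missing key keeps its initial value, and a repeated key overwrites the earlier one. A stricter variant might track each field as an optional and require every key exactly once. The following is a sketch under that assumption. The function name `parseConfigStrict` and the error names `MissingKey` and `DuplicateKey` are hypothetical.

```zig
const std = @import("std");

const Config = struct {
    name: []const u8,
    age: u32,
    debug: bool,
};

fn parseBool(text: []const u8) !bool {
    if (std.mem.eql(u8, text, "true")) return true;
    if (std.mem.eql(u8, text, "false")) return false;
    return error.InvalidBoolean;
}

// Hypothetical stricter variant: every key must appear exactly once.
fn parseConfigStrict(text: []const u8) !Config {
    // null means "this key has not been seen yet".
    var name: ?[]const u8 = null;
    var age: ?u32 = null;
    var debug: ?bool = null;

    var lines = std.mem.splitScalar(u8, text, '\n');

    while (lines.next()) |raw_line| {
        const line = std.mem.trim(u8, raw_line, " \t\r\n");
        if (line.len == 0) continue;

        var parts = std.mem.splitScalar(u8, line, '=');
        const raw_key = parts.next() orelse return error.InvalidInput;
        const raw_value = parts.next() orelse return error.InvalidInput;
        if (parts.next() != null) return error.InvalidInput;

        const key = std.mem.trim(u8, raw_key, " \t");
        const value = std.mem.trim(u8, raw_value, " \t");

        if (std.mem.eql(u8, key, "name")) {
            if (name != null) return error.DuplicateKey;
            name = value;
        } else if (std.mem.eql(u8, key, "age")) {
            if (age != null) return error.DuplicateKey;
            age = try std.fmt.parseInt(u32, value, 10);
        } else if (std.mem.eql(u8, key, "debug")) {
            if (debug != null) return error.DuplicateKey;
            debug = try parseBool(value);
        } else {
            return error.UnknownKey;
        }
    }

    // Any field still null was never set by the input.
    return Config{
        .name = name orelse return error.MissingKey,
        .age = age orelse return error.MissingKey,
        .debug = debug orelse return error.MissingKey,
    };
}

pub fn main() !void {
    const config = try parseConfigStrict("name=alice\nage=30\ndebug=true");

    std.debug.print("name={s} age={} debug={}\n", .{ config.name, config.age, config.debug });
}
```

Note that `config.name` is a slice into the input text, not a copy, so the input must stay alive for as long as the `Config` is used.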

#### What You Should Remember

Parsing turns text into typed data.

`std.fmt.parseInt` parses integers.

`std.fmt.parseFloat` parses floating point numbers.

Parsing can fail, so use `try` or `catch`.

Choose the output type deliberately.

Use `std.mem.trim` to remove whitespace.

Use `std.mem.splitScalar` to split simple text formats.

Validate the number of fields.

Reject invalid input clearly.

Do not silently replace bad input with defaults.

Good parsing code is strict, explicit, and easy to test.