# Strings Are Bytes

A Zig string is a sequence of bytes.

This is a string literal:

```zig
const s = "hello";
```

It has five visible characters:

```text
h e l l o
```

It is also followed by a zero sentinel byte, which is not counted in its length, so the literal can be used where sentinel-terminated data is required, such as C functions that expect null-terminated strings.
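The sentinel can be observed directly. The following is a minimal sketch, assuming a recent Zig compiler: the length reports only the visible bytes, while indexing the literal at `s.len` is permitted and reads the terminating zero.

```zig
const std = @import("std");

pub fn main() void {
    const s = "hello";

    // The length counts only the five visible bytes.
    std.debug.print("{d}\n", .{s.len});

    // Index s.len is the sentinel slot; it holds zero.
    std.debug.print("{d}\n", .{s[s.len]});
}
```

This should print `5` followed by `0`.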

The bytes can be printed one by one:

```zig
const std = @import("std");

pub fn main() void {
    const s = "hello";

    for (s) |b| {
        std.debug.print("{d}\n", .{b});
    }
}
```

The output is:

```text
104
101
108
108
111
```

These are byte values. The letter `h` is byte 104. The letter `e` is byte 101.

To print them as characters, use `{c}`:

```zig
const std = @import("std");

pub fn main() void {
    const s = "hello";

    for (s) |b| {
        std.debug.print("{c}\n", .{b});
    }
}
```

The output is:

```text
h
e
l
l
o
```

A string literal is not a special string object. Zig has no hidden string class. A string literal is a pointer to a constant sentinel-terminated array of bytes; for `"hello"` the type is `*const [5:0]u8`.
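The type of a literal can be inspected with the builtins `@TypeOf` and `@typeName`. This is only an illustrative sketch; the literal `"hello"` is an arbitrary choice.

```zig
const std = @import("std");

pub fn main() void {
    const s = "hello";

    // @typeName turns the type of `s` into printable text.
    std.debug.print("{s}\n", .{@typeName(@TypeOf(s))});
}
```

The printed type should be `*const [5:0]u8`: a constant pointer to an array of five bytes with a zero sentinel.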

In ordinary code, it is usually used as a slice; the pointer coerces to `[]const u8` automatically:

```zig
const s: []const u8 = "hello";
```

The type `[]const u8` means a read-only slice of bytes.

This is the most common string type in Zig.

```zig
const std = @import("std");

fn printString(s: []const u8) void {
    std.debug.print("{s}\n", .{s});
}

pub fn main() void {
    printString("zig");
    printString("language");
}
```

The output is:

```text
zig
language
```

The `{s}` format prints a byte slice as a string.

Since strings are bytes, `s.len` gives the number of bytes, not the number of human characters.

```zig
const std = @import("std");

pub fn main() void {
    const s = "hello";

    std.debug.print("{d}\n", .{s.len});
}
```

The output is:

```text
5
```

For plain ASCII text, the number of bytes and the number of characters are the same.

For UTF-8 text that contains non-ASCII characters, they differ.

```zig
const std = @import("std");

pub fn main() void {
    const s = "é";

    std.debug.print("{d}\n", .{s.len});
}
```

The output is:

```text
2
```

The character `é` is encoded as two bytes in UTF-8.

This matters because indexing a string gives a byte, not a character.

```zig
const std = @import("std");

pub fn main() void {
    const s = "é";

    std.debug.print("{d}\n", .{s[0]});
    std.debug.print("{d}\n", .{s[1]});
}
```

The output is:

```text
195
169
```

These are the two UTF-8 bytes for `é`.

For byte-oriented work, this is exactly what you want. Files, network protocols, and memory buffers are byte sequences.

For text-oriented work, you must decode UTF-8 deliberately.
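One way to decode deliberately is the standard library's `std.unicode` module. The sketch below assumes `std.unicode.Utf8View` and `std.unicode.utf8CountCodepoints` as found in recent Zig releases; the sample text `"héllo"` is chosen only for illustration. Note that a code point is still not always what a reader perceives as one character.

```zig
const std = @import("std");

pub fn main() !void {
    const s = "héllo";

    // Utf8View.init validates the bytes and returns an error on invalid UTF-8.
    const view = try std.unicode.Utf8View.init(s);
    var it = view.iterator();

    // Each step decodes one code point, consuming one to four bytes.
    while (it.nextCodepoint()) |cp| {
        std.debug.print("{d}\n", .{cp});
    }

    // Six bytes, but only five code points.
    std.debug.print("{d} {d}\n", .{ s.len, try std.unicode.utf8CountCodepoints(s) });
}
```

The loop should print `104`, `233`, `108`, `108`, `111`, one per line, followed by `6 5`. The value 233 is the code point of `é`, decoded from the bytes 195 and 169.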

String literals may contain escapes:

```zig
const newline = "first\nsecond";
const tab = "a\tb";
const quote = "he said \"zig\"";
const slash = "c:\\tmp\\file.txt";
```
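An escape is written with two characters in the source but becomes a single byte in the string. A small sketch, with the strings chosen only for illustration:

```zig
const std = @import("std");

pub fn main() void {
    const tab = "a\tb";
    const slash = "c:\\tmp";

    // Each escape sequence becomes one byte in the compiled string.
    std.debug.print("{d}\n", .{tab.len}); // 3 bytes: 'a', tab, 'b'
    std.debug.print("{d}\n", .{tab[1]}); // 9, the ASCII code for tab
    std.debug.print("{d}\n", .{slash.len}); // 6 bytes: c : \ t m p
}
```

This should print `3`, `9`, and `6`.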

A string may also be written across several lines as a multi-line string literal, in which each line begins with `\\`. Escapes are not processed inside it:

```zig
const text =
    \\first line
    \\second line
    \\third line
;
```

The lines are joined with newline bytes, with none after the last line, producing:

```text
first line
second line
third line
```

Multi-line strings are useful for help text, generated source, and test data.
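To see how a multi-line string is put together, it can be compared with an ordinary literal using `std.mem.eql`. This is a sketch with two arbitrary lines of text:

```zig
const std = @import("std");

pub fn main() void {
    const multi =
        \\first line
        \\second line
    ;
    const escaped = "first line\nsecond line";

    // Lines are joined with '\n'; no newline follows the last line.
    std.debug.print("{}\n", .{std.mem.eql(u8, multi, escaped)});
}
```

If the two forms hold the same bytes, this prints `true`.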

A mutable string needs mutable storage. A string literal is constant, and the compiler rejects any assignment to its bytes.

```zig
var buf = [_]u8{ 'h', 'e', 'l', 'l', 'o' };

buf[0] = 'H';
```

Now `buf` contains:

```text
Hello
```

To pass it to a function that expects a string slice, slice it with `buf[0..]` (or take its address with `&buf`):

```zig
const std = @import("std");

pub fn main() void {
    var buf = [_]u8{ 'h', 'e', 'l', 'l', 'o' };

    buf[0] = 'H';

    std.debug.print("{s}\n", .{buf[0..]});
}
```

The output is:

```text
Hello
```

Use `[]const u8` for read-only strings. Use `[]u8` for mutable byte buffers.
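The distinction shows up in function signatures. In the sketch below, `countByte` and `upperInPlace` are hypothetical helpers written for this example; `std.ascii.toUpper` comes from the standard library. A function that only reads takes `[]const u8` and accepts both literals and buffers. A function that writes takes `[]u8`, and passing it a literal would not compile.

```zig
const std = @import("std");

// Reads only, so it accepts literals and mutable buffers alike.
fn countByte(s: []const u8, target: u8) usize {
    var n: usize = 0;
    for (s) |b| {
        if (b == target) n += 1;
    }
    return n;
}

// Writes through the slice, so it requires mutable storage.
fn upperInPlace(buf: []u8) void {
    for (buf) |*b| {
        b.* = std.ascii.toUpper(b.*);
    }
}

pub fn main() void {
    var buf = [_]u8{ 'h', 'e', 'l', 'l', 'o' };

    upperInPlace(&buf);
    std.debug.print("{s}\n", .{buf[0..]});
    std.debug.print("{d}\n", .{countByte("hello", 'l')});
    std.debug.print("{d}\n", .{countByte(&buf, 'L')});
}
```

This should print `HELLO`, then `2`, then `2`.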

Exercises.

Exercise 6-17. Write a program that prints the byte values of `"zig"`.

Exercise 6-18. Write a function that takes `[]const u8` and prints each byte as a character.

Exercise 6-19. Print the `.len` of `"hello"` and `"é"`.

Exercise 6-20. Create a mutable byte array containing `hello`, change it to `Hello`, and print it.

