# A Line Filter

A line filter reads text, changes or selects some lines, and writes the result. Many Unix programs have this shape.

The program in this section prints only the lines of a file that contain a given search text. It is run with the search text and the file name as arguments.

```text
filter needle input.txt
```

Here is a first version.

```zig
const std = @import("std");

pub fn main() !void {
    var args = std.process.args();

    // The first argument is the program name; skip it.
    _ = args.next();

    const needle = args.next() orelse {
        std.debug.print("missing search text\n", .{});
        return;
    };

    const path = args.next() orelse {
        std.debug.print("missing input file\n", .{});
        return;
    };

    const cwd = std.fs.cwd();

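    // `const` is enough: the handle is never reassigned, and newer
    // compilers reject a `var` that is never mutated.
    const file = try cwd.openFile(path, .{});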
    defer file.close();

    var buffer: [4096]u8 = undefined;

    while (try file.reader().readUntilDelimiterOrEof(&buffer, '\n')) |line| {
        if (std.mem.indexOf(u8, line, needle) != null) {
            try std.io.getStdOut().writer().print("{s}\n", .{line});
        }
    }
}
```

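After skipping the program name, the program reads two arguments. The first is the text to search for. The second is the file name. If either is missing, the `orelse` block prints a message with `std.debug.print`, which writes to standard error, and returns from `main`.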

```zig
const needle = args.next() orelse {
    std.debug.print("missing search text\n", .{});
    return;
};

const path = args.next() orelse {
    std.debug.print("missing input file\n", .{});
    return;
};
```
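The `orelse` pattern can be tried in isolation. The following is a minimal sketch, not part of the program: `needleOrMessage` is a hypothetical helper that takes an optional value directly instead of reading it from the argument iterator.

```zig
const std = @import("std");

/// Hypothetical helper: returns the argument, or a message if it is absent.
fn needleOrMessage(maybe_needle: ?[]const u8) []const u8 {
    const needle = maybe_needle orelse {
        // This block runs only when the value is null.
        // Here it leaves the function early, as `main` does above.
        return "missing search text";
    };
    return needle;
}

test "orelse supplies the fallback only for null" {
    try std.testing.expectEqualStrings("zig", needleOrMessage("zig"));
    try std.testing.expectEqualStrings("missing search text", needleOrMessage(null));
}
```

When the value is present, the block is skipped and `needle` is an ordinary slice, with no optional left to unwrap.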

The path is resolved relative to the current working directory, and `defer` closes the file when `main` exits.

```zig
const file = try cwd.openFile(path, .{});
defer file.close();
```

The buffer holds one line at a time.

```zig
var buffer: [4096]u8 = undefined;
```

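This means a line that does not fit in the buffer cannot be read by this version: the read fails with `error.StreamTooLong` rather than returning part of the line, and `try` passes that error out of `main`. That is an intentional limit. Small programs should make their limits visible.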
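The limit can be observed without a file. The sketch below assumes the same standard library version as the program above, in which `std.io.fixedBufferStream` provides a reader with `readUntilDelimiterOrEof`. The helper `firstLine` is hypothetical and exists only for this illustration.

```zig
const std = @import("std");

/// Hypothetical helper: reads the first line of `text` into `buffer`
/// with the same reader call the program uses.
fn firstLine(text: []const u8, buffer: []u8) !?[]u8 {
    var stream = std.io.fixedBufferStream(text);
    return stream.reader().readUntilDelimiterOrEof(buffer, '\n');
}

test "a line that fits is returned without its newline" {
    var buffer: [16]u8 = undefined;
    const line = (try firstLine("short\nrest", &buffer)).?;
    try std.testing.expectEqualStrings("short", line);
}

test "a line that does not fit is an error" {
    var buffer: [4]u8 = undefined;
    try std.testing.expectError(
        error.StreamTooLong,
        firstLine("much too long\n", &buffer),
    );
}
```

Run with `zig test`, the second test shows that the over-long line is reported as an error instead of being silently cut.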

The loop reads one line on each pass.

```zig
while (try file.reader().readUntilDelimiterOrEof(&buffer, '\n')) |line| {
    // ...
}
```

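The call returns an error union that wraps an optional slice, and `try` unwraps the error part. If a line is read, the loop body receives it as `line`, a slice of `buffer` that does not include the newline. At end of file, the value is `null` and the loop stops.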
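The same loop shape works for any function that returns an error union around an optional. As a minimal sketch, the hypothetical `Countdown` type below stands in for the reader; its `next` declares an error set but never actually fails.

```zig
const std = @import("std");

/// Hypothetical source: yields `remaining`, `remaining - 1`, ... down to 1,
/// then null. It plays the role of the file reader.
const Countdown = struct {
    remaining: u32,

    fn next(self: *Countdown) error{Broken}!?u32 {
        if (self.remaining == 0) return null;
        const value = self.remaining;
        self.remaining -= 1;
        return value;
    }
};

fn sumAll(start: u32) !u32 {
    var source = Countdown{ .remaining = start };
    var total: u32 = 0;
    // `try` removes the error union; the capture |value| removes the optional.
    while (try source.next()) |value| {
        total += value;
    }
    return total;
}

test "the loop ends when next returns null" {
    try std.testing.expectEqual(@as(u32, 6), try sumAll(3));
    try std.testing.expectEqual(@as(u32, 0), try sumAll(0));
}
```

Inside the loop body, `value` is a plain `u32`, just as `line` is a plain slice in the filter.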

The test is a substring search.

```zig
if (std.mem.indexOf(u8, line, needle) != null) {
    // ...
}
```

`std.mem.indexOf` returns an optional index. If the result is not `null`, the line contains the search text.
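A short test makes the return values concrete. This is only an illustration; the sample text is made up.

```zig
const std = @import("std");

test "indexOf returns the position of the first match, or null" {
    const line = "one two one";

    // The first "one" starts at byte 0, even though another follows.
    try std.testing.expectEqual(@as(?usize, 0), std.mem.indexOf(u8, line, "one"));

    // "two" starts at byte 4.
    try std.testing.expectEqual(@as(?usize, 4), std.mem.indexOf(u8, line, "two"));

    // No match gives null rather than a sentinel value.
    try std.testing.expectEqual(@as(?usize, null), std.mem.indexOf(u8, line, "three"));
}
```

The filter only needs to know whether a match exists, so it compares the result with `null` and ignores the index.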

The output is written to standard output.

```zig
try std.io.getStdOut().writer().print("{s}\n", .{line});
```

This program is useful, but it has two rough edges. It builds a new reader and a new writer on every pass through the loop, and it always adds a newline, even if the last line in the file had none.

We can clean up the first point by naming the reader and writer.

```zig
const std = @import("std");

pub fn main() !void {
    var args = std.process.args();

    // The first argument is the program name; skip it.
    _ = args.next();

    const needle = args.next() orelse return error.MissingNeedle;
    const path = args.next() orelse return error.MissingPath;

    const cwd = std.fs.cwd();

    const file = try cwd.openFile(path, .{});
    defer file.close();

    // Name the reader and writer once, outside the loop.
    const reader = file.reader();
    const out = std.io.getStdOut().writer();

    var buffer: [4096]u8 = undefined;

    while (try reader.readUntilDelimiterOrEof(&buffer, '\n')) |line| {
        if (std.mem.indexOf(u8, line, needle) != null) {
            try out.print("{s}\n", .{line});
        }
    }
}
```

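The output is the same as before, but the main loop is easier to read. One behavior did change: a missing argument now makes `main` return an error instead of printing a message.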

A line filter often has this structure:

```zig
while (try reader.readUntilDelimiterOrEof(&buffer, '\n')) |line| {
    if (keep(line)) {
        try write(line);
    }
}
```

The work is divided into three parts: read a line, decide whether to keep it, and write it.

The decision can be moved into a function.

```zig
fn contains(line: []const u8, needle: []const u8) bool {
    return std.mem.indexOf(u8, line, needle) != null;
}
```

Then the loop says exactly what it does.

```zig
while (try reader.readUntilDelimiterOrEof(&buffer, '\n')) |line| {
    if (contains(line, needle)) {
        try out.print("{s}\n", .{line});
    }
}
```

This is a good habit. Keep I/O code near the edge of the program. Put simple decisions in small functions.
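One payoff of a small decision function is that it can be tested without opening a file. The tests below are a sketch of that idea; the sample strings are made up.

```zig
const std = @import("std");

fn contains(line: []const u8, needle: []const u8) bool {
    return std.mem.indexOf(u8, line, needle) != null;
}

test "contains finds the needle anywhere in the line" {
    try std.testing.expect(contains("a needle in hay", "needle"));
    try std.testing.expect(contains("needle", "needle"));
    try std.testing.expect(!contains("haystack", "needle"));
}

test "contains compares bytes, so case matters" {
    try std.testing.expect(!contains("Needle", "needle"));
}
```

Running `zig test` on the file exercises the decision alone, while the reading and writing stay in `main`.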

Exercise 20-11. Make the match case-insensitive.

Exercise 20-12. Add a `-v` option that prints lines that do not match.

Exercise 20-13. Print line numbers before matching lines.

Exercise 20-14. Return an error when a line is too long.

Exercise 20-15. Read from standard input when no file name is given.

