# Build a Bytecode VM

A bytecode VM is a small machine inside your program.

It does not run source code directly. It runs simple instructions called bytecode.

For example, instead of running this text:

```text
1 + 2
```

A VM might run these instructions:

```text
push 1
push 2
add
print
```

The VM reads one instruction at a time and changes its internal state.

#### The Goal

We will build a tiny stack-based VM.

It will support:

```text
push integer
add
subtract
multiply
divide
print
halt
```

The program:

```text
push 1
push 2
add
print
halt
```

will print:

```text
3
```

#### Instructions

Start with an enum:

```zig
const OpCode = enum(u8) {
    push,
    add,
    sub,
    mul,
    div,
    print,
    halt,
};
```

Each opcode is one instruction.

Some instructions need extra data. `push` needs a number.

So we define an instruction struct:

```zig
const Instruction = struct {
    op: OpCode,
    value: i64 = 0,
};
```

For `push`, `value` matters.

For every other opcode, `value` is ignored and keeps its default of `0`.
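
As a small illustration of the default field value, the sketch below builds two instructions and prints them. It only exercises the `OpCode` and `Instruction` types from this section; the `main` function is just a throwaway demonstration, not part of the VM.

```zig
const std = @import("std");

const OpCode = enum(u8) { push, add, sub, mul, div, print, halt };

const Instruction = struct {
    op: OpCode,
    value: i64 = 0,
};

pub fn main() void {
    const program = [_]Instruction{
        .{ .op = .push, .value = 7 }, // operand given explicitly
        .{ .op = .add }, // value falls back to the default, 0
    };

    for (program) |instruction| {
        // @tagName returns the enum field name as a string.
        std.debug.print("{s} {d}\n", .{ @tagName(instruction.op), instruction.value });
    }
}
```

Because `value` has a default, instructions without an operand can be written without mentioning it.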

#### The VM State

A stack VM needs a stack.

```zig
const VM = struct {
    stack: [256]i64,
    stack_top: usize,
    instructions: []const Instruction,
    ip: usize,
};
```

The fields mean:

```text
stack         temporary values
stack_top     next free stack slot
instructions  program bytecode
ip            instruction pointer
```

The instruction pointer tells the VM which instruction to run next.

#### Initialize the VM

```zig
fn init(instructions: []const Instruction) VM {
    return .{
        .stack = undefined,
        .stack_top = 0,
        .instructions = instructions,
        .ip = 0,
    };
}
```

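At the beginning, the stack is empty and `ip` points to instruction 0. The stack array itself is left `undefined` (uninitialized). That is safe here because `stack_top` is 0, so `push` always writes a slot before `pop` can read it.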

#### Stack Operations

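The VM needs `push` and `pop`. Like `init`, they are declared inside the `VM` struct, as the complete program later in this chapter shows.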

```zig
fn push(self: *VM, value: i64) !void {
    if (self.stack_top >= self.stack.len) {
        return error.StackOverflow;
    }

    self.stack[self.stack_top] = value;
    self.stack_top += 1;
}
```

This stores the value and moves `stack_top` forward.

Now `pop`:

```zig
fn pop(self: *VM) !i64 {
    if (self.stack_top == 0) {
        return error.StackUnderflow;
    }

    self.stack_top -= 1;
    return self.stack[self.stack_top];
}
```

The last pushed value is the first value returned.

That is why this is called a stack.
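
To see the last-in, first-out behavior and the underflow error in isolation, here is a minimal sketch. The standalone `Stack` type is a hypothetical stand-in for the VM's `stack` and `stack_top` fields, shrunk to four slots so it can run on its own.

```zig
const std = @import("std");

// Hypothetical standalone stack that mirrors the VM's stack fields.
const Stack = struct {
    items: [4]i64 = undefined,
    top: usize = 0,

    fn push(self: *Stack, value: i64) !void {
        if (self.top >= self.items.len) return error.StackOverflow;
        self.items[self.top] = value;
        self.top += 1;
    }

    fn pop(self: *Stack) !i64 {
        if (self.top == 0) return error.StackUnderflow;
        self.top -= 1;
        return self.items[self.top];
    }
};

pub fn main() !void {
    var stack = Stack{};
    try stack.push(1);
    try stack.push(2);

    // Values come back in reverse order: 2 first, then 1.
    std.debug.print("{d}\n", .{try stack.pop()});
    std.debug.print("{d}\n", .{try stack.pop()});

    // The stack is now empty, so a third pop reports an error.
    if (stack.pop()) |value| {
        std.debug.print("unexpected value {d}\n", .{value});
    } else |err| {
        std.debug.print("{s}\n", .{@errorName(err)});
    }
}
```

This prints `2`, then `1`, then `StackUnderflow`.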

#### Running Instructions

The VM runs a loop:

```text
fetch instruction
execute instruction
repeat
```

Add this method:

```zig
fn run(self: *VM) !void {
    while (self.ip < self.instructions.len) {
        const instruction = self.instructions[self.ip];
        self.ip += 1;

        switch (instruction.op) {
            .push => try self.push(instruction.value),

            .add => {
                const b = try self.pop();
                const a = try self.pop();
                try self.push(a + b);
            },

            .sub => {
                const b = try self.pop();
                const a = try self.pop();
                try self.push(a - b);
            },

            .mul => {
                const b = try self.pop();
                const a = try self.pop();
                try self.push(a * b);
            },

            .div => {
                const b = try self.pop();
                const a = try self.pop();

                if (b == 0) {
                    return error.DivisionByZero;
                }

                try self.push(@divTrunc(a, b));
            },

            .print => {
                const value = try self.pop();
                std.debug.print("{d}\n", .{value});
            },

            .halt => return,
        }
    }
}
```

Notice the order in subtraction and division:

```zig
const b = try self.pop();
const a = try self.pop();
```

The right operand is popped first, because it was pushed last.

For:

```text
push 10
push 3
sub
```

After the two pushes, the stack holds `10` with `3` on top, so `b` is `3` and `a` is `10`.

`sub` computes:

```text
10 - 3
```

not:

```text
3 - 10
```
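
The loop checks for division by zero but not for integer overflow. In Zig, `a + b` on `i64` panics in safety-checked build modes when the result does not fit, and is undefined behavior in unchecked ones; `@divTrunc` has the same problem when the smallest `i64` is divided by `-1`. One possible way to turn these cases into ordinary VM errors is to use the checked functions in `std.math`. The `BinaryOp` enum and `checkedBinary` helper below are hypothetical names for this sketch, not part of the program in this chapter.

```zig
const std = @import("std");

// Hypothetical subset of opcodes that take two operands.
const BinaryOp = enum { add, sub, mul, div };

// Hypothetical helper: performs one binary operation and returns
// error.Overflow or error.DivisionByZero instead of panicking.
fn checkedBinary(op: BinaryOp, a: i64, b: i64) !i64 {
    switch (op) {
        .add => return std.math.add(i64, a, b),
        .sub => return std.math.sub(i64, a, b),
        .mul => return std.math.mul(i64, a, b),
        // std.math.divTrunc checks for a zero divisor and for overflow.
        .div => return std.math.divTrunc(i64, a, b),
    }
}

pub fn main() void {
    const max = std.math.maxInt(i64);

    if (checkedBinary(.add, max, 1)) |value| {
        std.debug.print("result: {d}\n", .{value});
    } else |err| {
        std.debug.print("error: {s}\n", .{@errorName(err)});
    }
}
```

In the VM, each arithmetic arm could then push `try checkedBinary(...)` so that an overflow stops the program with an error the caller can handle.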

#### Complete Program

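Put this in `src/main.zig`. The `zig build run` command used afterwards assumes the project also has a `build.zig`; without one, `zig run src/main.zig` runs the file directly.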

```zig
const std = @import("std");

const OpCode = enum(u8) {
    push,
    add,
    sub,
    mul,
    div,
    print,
    halt,
};

const Instruction = struct {
    op: OpCode,
    value: i64 = 0,
};

const VM = struct {
    stack: [256]i64,
    stack_top: usize,
    instructions: []const Instruction,
    ip: usize,

    fn init(instructions: []const Instruction) VM {
        return .{
            .stack = undefined,
            .stack_top = 0,
            .instructions = instructions,
            .ip = 0,
        };
    }

    fn push(self: *VM, value: i64) !void {
        if (self.stack_top >= self.stack.len) {
            return error.StackOverflow;
        }

        self.stack[self.stack_top] = value;
        self.stack_top += 1;
    }

    fn pop(self: *VM) !i64 {
        if (self.stack_top == 0) {
            return error.StackUnderflow;
        }

        self.stack_top -= 1;
        return self.stack[self.stack_top];
    }

    fn run(self: *VM) !void {
        while (self.ip < self.instructions.len) {
            const instruction = self.instructions[self.ip];
            self.ip += 1;

            switch (instruction.op) {
                .push => try self.push(instruction.value),

                .add => {
                    const b = try self.pop();
                    const a = try self.pop();
                    try self.push(a + b);
                },

                .sub => {
                    const b = try self.pop();
                    const a = try self.pop();
                    try self.push(a - b);
                },

                .mul => {
                    const b = try self.pop();
                    const a = try self.pop();
                    try self.push(a * b);
                },

                .div => {
                    const b = try self.pop();
                    const a = try self.pop();

                    if (b == 0) {
                        return error.DivisionByZero;
                    }

                    try self.push(@divTrunc(a, b));
                },

                .print => {
                    const value = try self.pop();
                    std.debug.print("{d}\n", .{value});
                },

                .halt => return,
            }
        }
    }
};

pub fn main() !void {
    const program = [_]Instruction{
        .{ .op = .push, .value = 1 },
        .{ .op = .push, .value = 2 },
        .{ .op = .add },
        .{ .op = .print },
        .{ .op = .halt },
    };

    var vm = VM.init(&program);
    try vm.run();
}
```

Run:

```bash
zig build run
```

Output (printed to standard error, because `std.debug.print` writes there):

```text
3
```

#### A More Interesting Program

Try this:

```zig
const program = [_]Instruction{
    .{ .op = .push, .value = 10 },
    .{ .op = .push, .value = 3 },
    .{ .op = .sub },
    .{ .op = .push, .value = 4 },
    .{ .op = .mul },
    .{ .op = .print },
    .{ .op = .halt },
};
```

This means:

```text
(10 - 3) * 4
```

Output:

```text
28
```

The VM evaluates the expression using the stack.

#### What the Stack Looks Like

For this program:

```text
push 1
push 2
add
print
```

The stack changes like this:

```text
start:  []

push 1: [1]

push 2: [1, 2]

add:    [3]

print:  []
```

The `add` instruction removes two values and pushes one result.

That pattern appears often in stack VMs.
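
A trace like the one above can be produced by the program itself. The sketch below is a cut-down interpreter, not the VM from this chapter: `traceRun` is a hypothetical helper that handles only four opcodes, uses a 16-slot stack, and prints the live part of the stack after each instruction.

```zig
const std = @import("std");

const OpCode = enum(u8) { push, add, print, halt };

const Instruction = struct {
    op: OpCode,
    value: i64 = 0,
};

// Hypothetical helper: runs a program, printing the stack after each step.
// Returns how many values are left on the stack at the end.
fn traceRun(program: []const Instruction) !usize {
    var stack: [16]i64 = undefined;
    var top: usize = 0;

    for (program) |instruction| {
        switch (instruction.op) {
            .push => {
                if (top == stack.len) return error.StackOverflow;
                stack[top] = instruction.value;
                top += 1;
            },
            .add => {
                if (top < 2) return error.StackUnderflow;
                // Two values are consumed and one result is produced.
                stack[top - 2] = stack[top - 2] + stack[top - 1];
                top -= 1;
            },
            .print => {
                // In this sketch, print only pops; the trace line shows the effect.
                if (top == 0) return error.StackUnderflow;
                top -= 1;
            },
            .halt => break,
        }

        // Show the live part of the stack, bottom first.
        std.debug.print("{s}: [", .{@tagName(instruction.op)});
        var i: usize = 0;
        while (i < top) : (i += 1) {
            if (i > 0) std.debug.print(", ", .{});
            std.debug.print("{d}", .{stack[i]});
        }
        std.debug.print("]\n", .{});
    }
    return top;
}

pub fn main() !void {
    const program = [_]Instruction{
        .{ .op = .push, .value = 1 },
        .{ .op = .push, .value = 2 },
        .{ .op = .add },
        .{ .op = .print },
    };
    _ = try traceRun(&program);
}
```

A trace of this kind is a cheap debugging aid when a program leaves the wrong value on the stack.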

#### Add Tests

Add these tests:

```zig
test "push and pop" {
    const program = [_]Instruction{};
    var vm = VM.init(&program);

    try vm.push(42);
    const value = try vm.pop();

    try std.testing.expectEqual(@as(i64, 42), value);
}

test "addition program leaves result on stack" {
    const program = [_]Instruction{
        .{ .op = .push, .value = 1 },
        .{ .op = .push, .value = 2 },
        .{ .op = .add },
        .{ .op = .halt },
    };

    var vm = VM.init(&program);
    try vm.run();

    const result = try vm.pop();
    try std.testing.expectEqual(@as(i64, 3), result);
}

test "division by zero fails" {
    const program = [_]Instruction{
        .{ .op = .push, .value = 1 },
        .{ .op = .push, .value = 0 },
        .{ .op = .div },
        .{ .op = .halt },
    };

    var vm = VM.init(&program);

    try std.testing.expectError(error.DivisionByZero, vm.run());
}
```

Run:

```bash
zig build test
```

#### Why This Is Called Bytecode

Our `Instruction` struct is easy to read, but it is not compact.

Real bytecode often stores instructions in a byte array:

```text
opcode byte
optional operand bytes
opcode byte
optional operand bytes
```

Example:

```text
01 00 00 00 2a
```

This might mean:

```text
push 42
```

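Here the first byte, `01`, is the opcode, and the next four bytes, `00 00 00 2a`, hold the operand 42 as a 32-bit big-endian integer. The opcode numbering and operand width are only illustrative. They differ from our `OpCode` enum, where `push` is 0 and `value` is an `i64`.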

Our version uses a Zig struct so beginners can see the idea clearly before dealing with binary encoding.
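
For a feel of what decoding a byte array involves, here is a minimal sketch. The encoding is an assumption made up for this example: `0x01` is push followed by a 32-bit big-endian unsigned operand, and `0x02` is add with no operand. The names `readU32` and `runBytes` are hypothetical.

```zig
const std = @import("std");

// Assumed opcode numbers for this sketch only.
const op_push: u8 = 0x01;
const op_add: u8 = 0x02;

// Hypothetical helper: reads four bytes as a big-endian unsigned integer.
fn readU32(bytes: []const u8) u32 {
    return (@as(u32, bytes[0]) << 24) |
        (@as(u32, bytes[1]) << 16) |
        (@as(u32, bytes[2]) << 8) |
        @as(u32, bytes[3]);
}

// Hypothetical helper: runs the byte program and returns the top of the stack.
fn runBytes(code: []const u8) !i64 {
    var stack: [16]i64 = undefined;
    var top: usize = 0;
    var ip: usize = 0;

    while (ip < code.len) {
        const op = code[ip];
        ip += 1;

        switch (op) {
            op_push => {
                if (ip + 4 > code.len) return error.TruncatedOperand;
                if (top == stack.len) return error.StackOverflow;
                stack[top] = readU32(code[ip .. ip + 4]);
                top += 1;
                ip += 4; // skip the operand bytes
            },
            op_add => {
                if (top < 2) return error.StackUnderflow;
                stack[top - 2] += stack[top - 1];
                top -= 1;
            },
            else => return error.UnknownOpcode,
        }
    }

    if (top == 0) return error.StackUnderflow;
    return stack[top - 1];
}

pub fn main() !void {
    // push 42, push 1, add
    const code = [_]u8{ 0x01, 0x00, 0x00, 0x00, 0x2a, 0x01, 0x00, 0x00, 0x00, 0x01, 0x02 };
    std.debug.print("{d}\n", .{try runBytes(&code)});
}
```

The main difference from the struct version is that `ip` now advances by a variable amount: one byte for the opcode, plus however many operand bytes that opcode carries.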

#### Why Stack VMs Are Popular

A stack VM is simple.

Instructions do not need to name registers.

For example, `add` just means:

```text
pop two values
add them
push the result
```

A register VM might say:

```text
r3 = r1 + r2
```

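Register VMs often need fewer instructions for the same work, which can make them faster, but each instruction is larger and the compiler has to assign registers. Stack VMs are easier to implement first.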

Many language implementations begin with a stack VM because the architecture is small and teachable.
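
For contrast, a register-style `add` might look like the following sketch. The `RegInstruction` layout, the four-register file, and the `execAdd` helper are all assumptions made for illustration; real register VMs differ in how they encode operands.

```zig
const std = @import("std");

// Hypothetical register-style instruction: every operand is named.
const RegInstruction = struct {
    dst: u8,
    lhs: u8,
    rhs: u8,
};

// Hypothetical helper: executes one add on a four-register file.
// Register indexes are assumed to be in range (0 to 3).
fn execAdd(regs: *[4]i64, instruction: RegInstruction) void {
    regs[instruction.dst] = regs[instruction.lhs] + regs[instruction.rhs];
}

pub fn main() void {
    var regs = [_]i64{ 0, 1, 2, 0 }; // r0, r1, r2, r3

    // r3 = r1 + r2
    execAdd(&regs, .{ .dst = 3, .lhs = 1, .rhs = 2 });

    std.debug.print("r3 = {d}\n", .{regs[3]});
}
```

This prints `r3 = 3`. Note that the instruction carries three operand fields, whereas the stack version of `add` carries none.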

#### What You Learned

You built a tiny bytecode virtual machine.

You defined opcodes.

You represented instructions.

You stored VM state.

You implemented a stack.

You wrote the fetch-execute loop.

You handled runtime errors like stack underflow and division by zero.

This is the core of many interpreters. A real language VM adds variables, functions, jumps, objects, strings, closures, garbage collection, and debugging support. The center is still the same: read an instruction, execute it, move to the next one.

