The VM

Why build a VM instead of using wat2wasm and Node.js?

Three reasons.

First, speed of testing. The agreement test runs in-process -- compile, execute, compare -- all from a single zig build-exe && ./lang. No spawned processes, no temp files, no parsing stdout. A hundred tests run in milliseconds.

Second, understanding. Writing the VM forces you to understand what every WAT instruction does. You can't fake i64.eqz -- you have to implement it. And when the compiler emits the wrong instructions, the VM often fails in a way that reveals exactly what went wrong. The VM is a debugging tool as much as a verification tool.
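To make that concrete, here is a minimal sketch of what implementing i64.eqz might look like in a stack VM. It is written in Python for brevity rather than in the Zig host, and the function name exec_i64_eqz and the list-as-stack representation are illustrative assumptions, not the actual VM's code. The instruction semantics (pop one value, push 1 if it was zero, otherwise 0) follow the WebAssembly spec.

```python
def exec_i64_eqz(stack):
    """Execute i64.eqz on a Python list used as the operand stack.

    Hypothetical helper for illustration only.
    """
    if not stack:
        # A compiler bug (forgetting to push the operand) surfaces right here,
        # with the name of the instruction that could not run.
        raise RuntimeError("i64.eqz: operand stack is empty")
    value = stack.pop()
    stack.append(1 if value == 0 else 0)


stack = [0]
exec_i64_eqz(stack)
print(stack)  # [1], because the operand was zero

stack = [42]
exec_i64_eqz(stack)
print(stack)  # [0], because the operand was non-zero
```

The explicit underflow check is the kind of failure the paragraph above describes: a bad instruction sequence stops at the exact instruction that could not be executed.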

Third, self-hosting. Eventually we'll write the compiler in its own language. The compiled output will be WAT. To verify it, we'll need to run it. If the VM is part of our Zig host, we can run the self-hosted compiler's output directly. No external tools in the loop.

What we have now: a complete pipeline. Source text goes into the interpreter and the compiler. The interpreter produces a number. The compiler produces WAT. The VM executes the WAT and produces a number. The numbers match. For fib(10), both say 89.

The interpreter tells you what the program means. The compiler encodes that meaning as instructions. The VM follows those instructions. Agreement -- across every input and every feature we test -- is our evidence that the encoding is faithful.
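The shape of that agreement check can be sketched with a toy version of the pipeline. This is an assumption-heavy miniature, in Python rather than Zig: it skips parsing and starts from a hand-built tuple AST, the "compiler" emits a flat list of instruction strings rather than full WAT, and the names interpret, compile_expr, run and agree are hypothetical. It also ignores 64-bit wraparound.

```python
def interpret(node):
    """Evaluate the AST directly: this defines what the program means."""
    kind = node[0]
    if kind == "num":
        return node[1]
    left, right = interpret(node[1]), interpret(node[2])
    return left + right if kind == "add" else left * right


def compile_expr(node):
    """Encode the same meaning as stack instructions (post-order walk)."""
    kind = node[0]
    if kind == "num":
        return [f"i64.const {node[1]}"]
    op = "i64.add" if kind == "add" else "i64.mul"
    return compile_expr(node[1]) + compile_expr(node[2]) + [op]


def run(instructions):
    """Follow the instructions on an operand stack and return the result."""
    stack = []
    for ins in instructions:
        if ins.startswith("i64.const "):
            stack.append(int(ins.split()[1]))
        elif ins in ("i64.add", "i64.mul"):
            b, a = stack.pop(), stack.pop()
            stack.append(a + b if ins == "i64.add" else a * b)
        else:
            raise ValueError(f"unknown instruction: {ins}")
    return stack.pop()


def agree(node):
    """The agreement test: interpreter and compiler+VM must give one answer."""
    return interpret(node) == run(compile_expr(node))


# (2 + 3) * 4
expr = ("mul", ("add", ("num", 2), ("num", 3)), ("num", 4))
print(compile_expr(expr))
print(interpret(expr), run(compile_expr(expr)), agree(expr))  # 20 20 True
```

Everything happens in one process, so running the check over a whole list of expressions is just a loop over agree.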

Next up: strings, output buffers, and reading stdin. The features the compiler needs to compile itself.

## Part 8: Feeding the Snake

We have a compiler. It reads source text and emits WAT. We have a VM that runs the WAT and gets the same answers as the interpreter. Everything agrees. So what's left?

The compiler is written in Zig. We want it written in our language. For that, we need to look at what the compiler does and ask: can our language do that?

The compiler does three things:
1. Reads bytes from source text: source[pos], cur(), skip()
2. Compares strings to recognize keywords: streq(word, "var")
3. Writes bytes to an output buffer: emit_byte('('), emit_str("i64.const ")

Our language has integers, variables, if/else, while, and functions. It has neither byte-level memory access nor string literals. Without those, a compiler written in our language can't read its input or write its output.

Time to add them.

### Memory

WebAssembly has linear memory -- a flat byte array that programs can read from and write to. Our language will have the same thing: a big array of bytes, accessed by address.

We'll add two builtin functions:
- load8(addr) -- read one byte from address addr
- store8(addr, val) -- write byte val to address addr

That's the entire memory model. One byte at a time. It's enough to build string comparison, output buffers, and everything else the compiler needs.
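As a rough illustration of why two byte-level builtins are enough, the sketch below models linear memory as a Python bytearray and builds a string comparison and an output buffer on top of load8 and store8. Several things here are assumptions for the example rather than the real design: the memory size, the OUT_BASE address, the write_bytes test helper, and bytes_equal, which stands in for streq but takes an explicit length because the actual string representation has not been introduced yet.

```python
MEMORY_SIZE = 65536          # assumed size, chosen for the example
OUT_BASE = 32768             # assumed start address of the output buffer

memory = bytearray(MEMORY_SIZE)
out_pos = OUT_BASE           # next free byte in the output buffer


def load8(addr):
    """Read one byte from address addr."""
    return memory[addr]


def store8(addr, val):
    """Write byte val to address addr, keeping only the low 8 bits."""
    memory[addr] = val & 0xFF


def write_bytes(addr, data):
    """Test helper (hypothetical): copy a bytes object into memory."""
    for i, b in enumerate(data):
        store8(addr + i, b)


def bytes_equal(a, b, length):
    """Compare length bytes at addresses a and b; return 1 if equal, else 0."""
    i = 0
    while i < length:
        if load8(a + i) != load8(b + i):
            return 0
        i += 1
    return 1


def emit_byte(b):
    """Append one byte to the output buffer."""
    global out_pos
    store8(out_pos, b)
    out_pos += 1


def emit_bytes(addr, length):
    """Append length bytes starting at addr to the output buffer."""
    for i in range(length):
        emit_byte(load8(addr + i))
```

For example, after write_bytes(0, b"var") and write_bytes(100, b"var"), the call bytes_equal(0, 100, 3) returns 1, and emit_byte(ord("(")) leaves the byte for "(" at OUT_BASE. The comparison and the buffer are written with only integers, a while loop, and the two builtins, which are exactly the features our language will have.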