Write emit_num in our language -- a function that prints an arbitrary i64 as decimal digits into the output buffer:
check(
\\var out_len: i64 = 0;
\\fn emit_byte(b: i64) i64 {
\\    store8(50000 + out_len, b);
\\    out_len = out_len + 1;
\\    return 0;
\\}
\\fn emit_num(n: i64) i64 {
\\    if (n < 0) {
\\        emit_byte(45);
\\        emit_num(0 - n);
\\        return 0;
\\    }
\\    if (n > 9) {
\\        emit_num(n / 10);
\\    }
\\    emit_byte(48 + n - n / 10 * 10);
\\    return 0;
\\}
\\emit_num(12345);
\\out_len
, 5);
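Five bytes written: '1', '2', '3', '4', '5'. The function is recursive -- it first calls itself on n / 10 to emit the higher digits, then emits the last digit. n - n / 10 * 10 is modulo without the % operator (which our language doesn't have, keeping things simple for the self-hosted compiler to handle). Integer division truncates, so for n = 12345, n / 10 * 10 is 12340 and the difference is 5; adding 48 (ASCII '0') turns that into the byte for '5'.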
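To check the recursion outside the compiler, here is a rough Python model of the same two functions. It is only a sketch under stated assumptions: the `bytearray` named `out` stands in for the memory region at address 50000, `render` is a helper added for illustration, and Python's `//` agrees with the language's integer division here only because the digit path is reached with non-negative `n`.

```python
# Python model of emit_byte / emit_num from the listing above.
# Assumption: `out` stands in for the output buffer at address 50000.
out = bytearray()

def emit_byte(b: int) -> int:
    out.append(b)                     # store8(50000 + out_len, b); out_len += 1
    return 0

def emit_num(n: int) -> int:
    if n < 0:
        emit_byte(45)                 # 45 is '-'
        emit_num(0 - n)
        return 0
    if n > 9:
        emit_num(n // 10)             # higher digits first
    emit_byte(48 + n - n // 10 * 10)  # 48 is '0'; the rest is n mod 10
    return 0

def render(n: int) -> str:
    """Illustrative helper: reset the buffer, emit n, return the text."""
    out.clear()
    emit_num(n)
    return out.decode("ascii")

print(render(12345))  # 12345
print(render(-907))   # -907
```

One thing the model cannot show, because Python integers never overflow: assuming subtraction compiles to WebAssembly's wrapping `i64.sub`, `0 - n` for the smallest i64 is that same negative value again, so that single input would keep recursing instead of printing.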
That's the last piece of the emit puzzle. emit_byte, emit_s, and emit_num -- in our language -- can write any WAT instruction the compiler produces.
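As a rough illustration of how the three pieces combine, the Python sketch below writes a few lines of WAT. It rests on assumptions: `emit_s` is modeled as taking a string and emitting it byte by byte (its real signature is not shown in this section), and `emit_i64_const` is a hypothetical helper name, not a function from the compiler.

```python
# Python model: composing emit_byte, emit_s and emit_num into WAT text.
out = bytearray()

def emit_byte(b: int) -> None:
    out.append(b)

def emit_s(s: str) -> None:
    # Assumed behavior: emit each byte of an ASCII string.
    for ch in s.encode("ascii"):
        emit_byte(ch)

def emit_num(n: int) -> None:
    if n < 0:
        emit_byte(45)          # '-'
        emit_num(-n)
        return
    if n > 9:
        emit_num(n // 10)
    emit_byte(48 + n % 10)     # Python has %, the toy language does not

def emit_i64_const(n: int) -> None:
    # Hypothetical helper: one WAT instruction followed by a newline (10).
    emit_s("i64.const ")
    emit_num(n)
    emit_byte(10)

emit_i64_const(42)
emit_i64_const(-7)
emit_s("i64.add\n")
print(out.decode("ascii"), end="")
```

Fixed instruction text goes through `emit_s`, and the only variable part, the numeric operand, goes through `emit_num`.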
We've crossed a threshold. The language now has everything the compiler needs:
| Compiler needs | Language feature |
|---------------|-----------------|
| Read source bytes | load8(addr) |
| Check character class | cur() >= 48 together with cur() <= 57 (is digit?) |
| Compare keywords | streq_mem(word, "var") |
| Write WAT output | emit_byte, emit_s, emit_num |
| Recursive descent | Functions with return |
| Track position | var pos: i64 = 0; |
| Skip/scan blocks | while with brace counting |
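
To make the last rows of the table concrete, here is a small Python sketch of skipping a brace-delimited block using only a byte load, a position, and a `while` loop with a depth counter. The names `mem` and `skip_block` are hypothetical, and the real compiler's memory layout may differ.

```python
# Python model of "while with brace counting" over source bytes.
# Assumption: `mem` stands in for the memory holding the source text.
mem = bytearray(b"{ a { b } c } rest")

def load8(addr: int) -> int:
    return mem[addr]

def skip_block(pos: int) -> int:
    """With pos at a '{', return the position just past its matching '}'."""
    depth = 0
    while pos < len(mem):
        c = load8(pos)
        pos = pos + 1
        if c == 123:           # 123 is '{'
            depth = depth + 1
        if c == 125:           # 125 is '}'
            depth = depth - 1
            if depth == 0:
                return pos
    return pos                 # unbalanced input: stop at the end

end = skip_block(0)
print(repr(mem[end:].decode("ascii")))  # ' rest'
```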
Every function in our Zig compiler -- c_factor, c_term, c_expression, c_stmt, c_program -- uses only these features. The translation from Zig to our language is mechanical: change the syntax, replace character literals with ASCII codes, use load8/store8 instead of array indexing.
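A hedged sketch of what that translation looks like for one small job, scanning a decimal number. A Zig comparison against a character literal such as '0' (an illustrative example, not a line quoted from the compiler) becomes a comparison against 48, and array indexing becomes `load8`. The Python below deliberately restricts itself to the features from the table; `src`, `is_digit`, and `parse_number` are hypothetical names, and returning 0 past the end of the input is an assumption.

```python
# Python model written in the style of the toy language:
# ASCII codes instead of character literals, load8 instead of indexing.
src = bytearray(b"407+5")
pos = 0

def load8(addr: int) -> int:
    # Assumption: reads past the end of the source yield 0.
    return src[addr] if addr < len(src) else 0

def cur() -> int:
    return load8(pos)

def is_digit(c: int) -> int:
    if c >= 48:                # 48 is '0'
        if c <= 57:            # 57 is '9'
            return 1
    return 0

def parse_number() -> int:
    global pos
    value = 0
    while is_digit(cur()) == 1:
        value = value * 10 + (cur() - 48)
        pos = pos + 1
    return value

n = parse_number()
print(n, pos, cur())  # 407 3 43
```

After the call, `pos` rests on the '+' (ASCII 43), which is where a function like c_expression would pick up.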
The self-hosted compiler is within reach. It's a big program -- several hundred lines in our language -- but it's not a fundamentally different program. It's the same compiler we already wrote, in a different language. The one we built.