Bootstrap is a small VM (< 20 ops) with an ASCII encoding. The goal of this project is to create a readable and auditable bootstrapping process to generate C binaries for this virtual platform or any other.
- Trusted compilation - every program involved in compiling a given C program can be audited (combined with diverse double-compilation by running the VM on multiple platforms)
- Longevity - the VM spec is small enough that it can be contained in the executables it produces, allowing them to be run decades into the future
Each bootstrap stage should do just enough to compile the next stage. Our goal is to hit a level of C89/C99 support that will allow us to compile and run arbitrary software for the VM, and to build the entire software tower underneath to get us there.
The stages should be easy to understand in isolation, and small enough to hold in your head one at a time.
In some cases we may define useful compilation utilities in earlier stages that are re-used later in the bootstrap chain, for example linkers and shell-style utilities.
Status: complete ✅
bootstrap0.bin: A basic assembler written in pure VM ASCII. The goal of this stage is to get a slightly more readable bootstrap1.bin compiled by ignoring any control character bytes (< 0x20).
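The filtering rule described above can be sketched in a few lines of Python. This only illustrates the "ignore bytes below 0x20" behaviour; the function name `strip_control_bytes` is hypothetical, and the real stage is written in VM ASCII, not Python.

```python
def strip_control_bytes(source: bytes) -> bytes:
    """Return `source` with every byte below 0x20 removed.

    Tabs, newlines and carriage returns are all below 0x20, so the input
    can be laid out across lines for readability without changing the
    bytes that reach the assembler.
    """
    return bytes(b for b in source if b >= 0x20)
```

For example, `strip_control_bytes(b"AB\n\tCD\r\n")` yields `b"ABCD"`.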
Status: complete ✅
bootstrap1.s: A basic assembler that skips comments (lines starting with #) and allows the use of a colon address (e.g. :ABCD) to seek the output file to a given hex offset. All "assembled" lines must start with a tab character.
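As a rough model of the three line types described above, the following Python sketch classifies each input line and builds an output image. Several details are assumptions made for illustration: the function name `assemble`, copying the payload of a tab-prefixed line through as ASCII, ignoring lines that match none of the three forms, and padding gaps with zero bytes.

```python
def assemble(lines):
    """Build an output image from bootstrap1-style lines (illustrative only)."""
    image = bytearray()
    offset = 0
    for line in lines:
        if line.startswith("#"):
            # Comment line: skipped entirely.
            continue
        if line.startswith(":"):
            # Colon address such as ":ABCD": seek to the given hex offset.
            offset = int(line[1:5], 16)
            continue
        if line.startswith("\t"):
            # Assembled line. Assumption: the payload is copied through as-is.
            payload = line[1:].rstrip("\n").encode("ascii")
            end = offset + len(payload)
            if len(image) < end:
                # Assumption: gaps created by seeking are zero-filled.
                image.extend(b"\x00" * (end - len(image)))
            image[offset:end] = payload
            offset = end
        # Anything else is ignored in this sketch.
    return bytes(image)
```

Seeking forward leaves a gap in the image, so `assemble([":0004\n", "\tAB\n"])` produces four padding bytes followed by `AB`.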
Status: complete ✅
bootstrap2.s: A more complex assembler with support for two-level symbols (e.g. :global__ + .local___) and two-pass symbol resolution. Also supports constant-style symbols that can be defined via =symbol__ ABCD. Note that all symbols MUST be exactly eight characters long. Includes a few hard-coded stack manipulation macros in this stage to make nested function calls simpler.
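Two-pass resolution is what lets a symbol be referenced before it is defined. The Python sketch below shows the idea under stated assumptions: the `@name` and `@.name` reference forms are hypothetical (the real reference syntax is not described here), every non-definition line is assumed to occupy one address unit, and the names `resolve` and `check_symbol` are made up for this example.

```python
SYMBOL_LEN = 8  # all symbols must be exactly eight characters long


def check_symbol(name):
    if len(name) != SYMBOL_LEN:
        raise ValueError(f"symbol {name!r} must be exactly {SYMBOL_LEN} characters")
    return name


def resolve(lines):
    symbols = {}

    # Pass 1: record the address of every global, local and constant symbol.
    address = 0
    scope = ""
    for line in lines:
        if line.startswith(":"):
            scope = check_symbol(line[1:])
            symbols[scope] = address
        elif line.startswith("."):
            # Locals are stored under the most recent global symbol.
            symbols[scope + "." + check_symbol(line[1:])] = address
        elif line.startswith("="):
            name, value = line[1:].split()
            symbols[check_symbol(name)] = int(value, 16)
        else:
            address += 1  # assumption: one address unit per emitted line

    # Pass 2: emit lines, replacing (hypothetical) references with values.
    out = []
    scope = ""
    for line in lines:
        if line.startswith(":"):
            scope = line[1:]
        elif line.startswith(".") or line.startswith("="):
            continue
        elif line.startswith("@."):
            out.append(symbols[scope + "." + line[2:]])
        elif line.startswith("@"):
            out.append(symbols[line[1:]])
        else:
            out.append(line)
    return out
```

A single pass would fail on a forward reference such as `@.done____` appearing before `.done____`; collecting all definitions first avoids that.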
Status: complete ✅
bootstrap3: A "complete" assembler that allows input from multiple files, linked together to create an output executable. This assembler has a more natural, Intel-like syntax.
The output for a given opcode from this assembler may or may not correspond to a single VM opcode. The assembler takes over one of the VM registers as a "compiler temporary", allowing us to create some CISC-style ops that drastically reduce instruction counts for various types of operations.
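To make the idea of a reserved temporary concrete, here is a small Python sketch of how one assembler-level op might expand into several primitive ops. Everything in it is hypothetical: the register name `rt`, the `load` and `add` mnemonics, and the bracketed memory-operand syntax are invented for illustration and are not taken from the VM spec.

```python
TEMP = "rt"  # hypothetical name for the reserved "compiler temporary" register


def expand(op, dst, src):
    """Expand one assembler-level op into a list of primitive ops."""
    if src.startswith("[") and src.endswith("]"):
        # Memory operand: load it into the temporary first, then apply the
        # op register-to-register. One source line becomes two VM ops.
        return [("load", TEMP, src[1:-1]), (op, dst, TEMP)]
    # Register operand: maps directly onto a single op.
    return [(op, dst, src)]
```

Because the temporary is never available to user code, an expansion like this can clobber it freely without saving or restoring anything.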
This assembler also allows for more complex macros that make procedure calls, arguments and locals much simpler. As part of this functionality, the assembler defines a calling convention that determines which registers are caller- or callee-saved.
Status: work in progress 🚧
bootstrap4: This is the first-stage C compiler, which compiles a (very reduced) subset of C. Currently a work-in-progress.
There are multiple stages inside bootstrap4 to build a basic compiler: compiler0, which builds a barebones C compiler and allows us to escape from the world of assembly, and compiler1, a much more familiar C program that is used to compile bootstrap5.
Status: proof-of-concept
A full C85 compiler, written in a simpler subset of C, that can compile a full CXX compiler (as long as it conforms to C85). Currently a work-in-progress.