This repository contains all the various parts needed to bootstrap the following:
- mescc-tools (https://github.com/oriansj/mescc-tools), containing:
  - M1
  - blood-elf
  - get_machine
  - hex2
  - kaem
- M2-Planet (https://github.com/oriansj/M2-Planet)
- mes-m2 (https://github.com/oriansj/mes-m2)

It bootstraps all of these from a single 357 byte seed.

The ultimate goal is for this to bootstrap all the way up to GCC. This will happen when mes-m2 is finished.

There are only two "missing" parts that are not source code: a shell/kaem, and a kernel.
Kaem is a very basic build tool that evaluates a very simple script. A seed kaem, compiled in a highly optimized way and stripped right down, is available as kaem-optional-seed in the root of this repository. Otherwise, you can use a shell you trust. The kernel issue is not yet solved, and at the moment the kernel is trusted.

This repository currently supports the AMD64 and x86 (i386) architectures.

To run the entire bootstrap process in the safest way, cd into the directory for your architecture (AMD64 for amd64, x86 for x86/i386), then run `../kaem-optional-seed --verbose --strict`. This uses the kaem seed rather than relying on your shell. `--strict` makes sure that the result is as intended and that nothing breaks. (If something works without `--strict` but not with `--strict`, please file an issue or come seek help on #bootstrappable on freenode.) `--verbose` shows you the commands it is running as it goes.

The bootstrappable effort is all about trust. You should verify each of these programs, from the hex0 monitor up to mes-m2, along with the kaem seed and the kaem.run files if you can.
There are some efforts to make these binaries easier to verify. This is done primarily by re-writing the lowest level programs in assembly, so that you can recompile them and check that the hashes match.
If they do, you only need to verify the higher-level source, since you know that it contains the same instructions as the lower-level source.

This repository utilises submodules, so you need to clone it using `git clone --recursive`. If you have already cloned it, run `git submodule update --init`, or after a pull be sure to run `git submodule update --recursive`.

Note that this README may not answer all your questions. If you are still left wondering things like "What is a kaem.run?", see the READMEs of the other repositories, which might answer some more tool-specific questions.

We hang out on the freenode IRC network in the #bootstrappable channel.

A full summary of all of the tools can be found here:
https://github.com/oriansj/talk-notes/blob/master/bootstrappable.org

|-----------------------------|
| How does this process work? |
|-----------------------------|

It is highly recommended that after reading this you go through the kaem.run for your architecture and see each of these steps in action.
Note that the kaem.run is split into two kaem files to make it simpler to grasp. These two files are mescc-tools.kaem for Phases 0-12 and mes-m2.kaem for Phase 13, contained in the same folder as kaem.run.

Most of these steps have a NASM version in the NASM/ subdirectory of the folder for the architecture.

Phase 0: Rebuild hex0 from the hex0 seed.
This is done to ensure that the hex0 seed is untainted, and that the hex0 seed matches the compiled hex0 source. You should check these are identical!

Phase 1: Build hex1 from the Phase 0 hex0.
hex1 is a more advanced version of hex0 with support for single character labels and a single size of relative jumps (hex0 has no support for labels or calculated relative jumps).

Phase 1b: Build catm from the Phase 0 hex0.
catm is a program that removes the need for cat or redirection by implementing equivalent functionality; e.g. `cat input1 input2 ... inputN > output_file` would be replaced by `catm output_file input1 input2 ... inputN`.

Phase 2: Build hex2-0 from hex1.
hex2 is the final version of the hex* series, adding support for long labels and absolute addresses. This allows it to function as a linker for later parts of the bootstrap. However, for now we are only building a basic version to make the process simpler, hence the -0 on the end of the name: this hex2 only works for the single host architecture it was built upon.

Phase 3: Build M0 from the Phase 2 hex2-0.
M0 is an architecture specific version of M1, which will come later. It is simply a temporary binary that avoids the need to write a cross-architecture assembler in hex2, as M0 supports just enough functionality to build the next few stages.

Phase 4: Build cc_* from M0.
cc_architecture is a per-architecture C compiler written in the same architecture's M0. E.g., there is cc_amd64 for amd64 and cc_x86 for x86. It implements only an extremely basic form of C that is used to bootstrap the next phase.

Phase 5: Build M2-Planet from cc_*.
M2-Planet is another C compiler that implements a slightly larger subset of C. However, this is not an easily debuggable version, and it is replaced towards the end.

Phase 6: Build blood-elf-0 using M2-Planet.
blood-elf adds DWARF stubs to an M1 program, allowing us to create more easily debuggable programs. However, this version is itself not debuggable (as it is built without DWARF stubs), which is indicated by the -0 on the end of the name.

From here on, the remaining phases do not produce intermediate binaries; their outputs are used as results. Note that we have been using hex2-0 the whole time up until now. Also note that from now on all binaries are debuggable, can generate stack traces, etc., thanks to blood-elf.

Phase 7: Build the hex2 implementation in M2-Planet.
This version of hex2 is cross-platform and has a number of outstanding features which are out of scope here. This is a useful linker that is used in the next stage of the bootstrap process. Note that from now on we are not using hex2-0; hex2-0 is replaced with hex2.

Phase 8: Build the M1 implementation in M2-Planet.
M1 is a cross-platform version of M0, as well as being much more powerful and faster. Note that from now on we no longer need catm, as M1 has support for multiple inputs, and that we no longer use M0; it is replaced with M1.

Phase 9: Build the blood-elf implementation in M2-Planet.
blood-elf was discussed earlier and can now be used properly to create debuggable programs with ELF headers.

Phase 10: Build kaem.
kaem is the tool that has been running the kaem.run scripts, and it is useful for later stages of the bootstrap process outside this repository.

Phase 11: Build get_machine.
get_machine finds the architecture of the system it is running on, and is used by architecture dependent scripts later in the bootstrap process.

Phase 12: Build M2-Planet from M2-Planet.
This is the same M2-Planet as discussed earlier; it is simply built using itself, and so is going to work more quickly and reliably.

Phase 13: Build Mes-M2 using M2-Planet.
Mes-M2 is a re-implementation of Mes (https://www.gnu.org/software/mes/) designed to make mes part of the bootstrap process. After this is complete, we will be able to bootstrap our way up, through MesCC and TinyCC, to GCC.
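
Phase 0 asks you to check that the rebuilt hex0 is identical to the hex0 seed, and the verification notes above rely on hashes matching. The following is a minimal Python sketch of that comparison, not part of this repository: the function names `sha256_of` and `binaries_match` are hypothetical, and SHA-256 is an assumed choice of hash.

```python
import hashlib


def sha256_of(path, chunk_size=65536):
    """Return the hex SHA-256 digest of the file at path."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # Read in chunks so large binaries need not fit in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def binaries_match(seed_path, rebuilt_path):
    """True when both files hash to the same SHA-256 digest."""
    return sha256_of(seed_path) == sha256_of(rebuilt_path)
```

You would call `binaries_match` with the path of the seed binary and the path of the binary you rebuilt from source; which paths those are depends on your architecture directory and is not assumed here.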
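
To give a feel for what the hex0 stage in Phases 0 and 1 does, here is a rough Python sketch of a hex0-style converter that turns pairs of hex digits into bytes. It is an illustration only: the function name `hex0_to_bytes` is hypothetical, and the comment rules (`#` and `;` start a comment that runs to the end of the line, every other non-hex character is ignored) are assumptions that you should check against the hex0 source for your architecture.

```python
def hex0_to_bytes(source):
    """Convert hex0-style text into raw bytes (assumed rules, see above)."""
    out = bytearray()
    high = None          # pending high nibble, if any
    in_comment = False
    for ch in source:
        if in_comment:
            # Assumed: a comment ends at the next newline.
            if ch == "\n":
                in_comment = False
            continue
        if ch in "#;":
            in_comment = True
            continue
        if ch in "0123456789abcdefABCDEF":
            value = int(ch, 16)
            if high is None:
                high = value
            else:
                out.append(high * 16 + value)
                high = None
        # Any other character (spaces, newlines, ...) is skipped.
    return bytes(out)
```

There is no label handling at all in this sketch, which mirrors the note in Phase 1 that hex0 supports neither labels nor calculated jumps.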
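
The behaviour described for catm in Phase 1b (output file first, then the inputs in order) can be sketched in Python as follows. This only models the behaviour described above; it is not the real catm, which is built from hex0 source, and the function name is chosen here for illustration.

```python
def catm(output_path, *input_paths):
    """Write the contents of each input file, in order, to output_path."""
    with open(output_path, "wb") as out:
        for path in input_paths:
            with open(path, "rb") as src:
                out.write(src.read())
```

Calling `catm("output_file", "input1", "input2")` corresponds to the `catm output_file input1 input2` invocation described in Phase 1b.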