Introduction
=============

Almost every programmer dreams of building their own OS and compiler, among other things, and I am no exception. But I didn't study computer science, so the finer arts of parsing, lexing, and AST generation eluded me for a long time.

Sometime in 2007, I started learning about compiler techniques and using them incrementally at work. I now understand [F]lex and its clones, Yacc/Bison and their clones, and AST generation. This project is meant to help me crystallize that understanding into something that is a toy, but fully functional.

Therefore, I aim to design a small toy language called 'my', in the spirit of mySQL, mySpace, and other 'my'-infected software ;)

My will initially be designed from the back end to the front end, and the first back end will be a small, simple bytecode interpreter. Obviously, it will be very naive, and I will not expand its scope much, except to accommodate the humble features of the language 'My'. There will also be an assembler for this interpreter, and then eventually the My compiler, which shall compile the My language into the My vm format.

If you wish to learn along with me, by all means, clone away. I am building this in C, to keep to the true spirit of the compiler writers of days of yore, and I am striving to make my code understandable first (performance is not even in the picture here). Hopefully, this will be useful to more people than 'My'-self (ahem! no pun intended :P)

Vm Design
============

Myvm is a very, very simple virtual machine. It has only 6 registers, 'A' to 'F', and an instruction pointer. All programming will be done using these registers.
To make matters worse, each register is 1 byte wide, making it effectively useless for real work, but easy to implement :)

Byte Code Design
================

The binary format recognized by the Myvm interpreter breaks down as follows:

    HEADER               = MVM (3 bytes)
    VERSION              = 01 (1 byte)
    Instruction Count    = 4 bytes
    Code Segment Offset  = 4 bytes
    Unused               = 4 bytes
    Code Segment         = remaining bytes

Executable Code Design
======================

In the spirit of keeping things easy, I have decided to limit the instructions even further. Each instruction is just 3 bytes wide and has a fixed format:

    OPCODE - 1 byte
    SOURCE - 1 byte
    TARGET - 1 byte

E.g. 0x000100 (opcode 0x00, source 0x01, target 0x00)

So the code segment of a Myvm binary is a sequence of 3-byte blocks, which makes it easier to code for.

Status
=======

Currently, Myvm has been implemented with very few instructions, but enough to implement simple comparisons and loops. See below for more info.

It can currently execute binaries in the format described above, and the examples/ folder contains some examples that were hand-assembled with a hex editor. They are a good place to start learning from.

The assembler is currently in the works.

More Info
==========

Look in include/myvm.h for the instruction opcodes and register addresses. All currently implemented instructions are in ops.c.
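
To make the layout from the Byte Code Design and Executable Code Design sections concrete, here is a minimal sketch in C of how a loader might read the header and fetch one instruction. It is not the code from this repository: the names mvm_header, mvm_instr, read_u32_be, mvm_parse_header and mvm_fetch are made up for illustration. It also assumes that the 4-byte fields are stored big-endian and that the code segment offset is counted from the start of the file, neither of which is stated above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define MVM_HEADER_SIZE 16u /* 3 (magic) + 1 (version) + 4 + 4 + 4 */
#define MVM_INSTR_SIZE  3u  /* opcode, source, target */

/* Hypothetical in-memory view of the header; not taken from myvm.h. */
typedef struct {
    uint8_t  version;
    uint32_t instruction_count;
    uint32_t code_offset;
} mvm_header;

/* Hypothetical decoded instruction: one byte per field. */
typedef struct {
    uint8_t opcode;
    uint8_t source;
    uint8_t target;
} mvm_instr;

/* ASSUMPTION: 4-byte fields are big-endian; the README does not say. */
static uint32_t read_u32_be(const uint8_t *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8)  |  (uint32_t)p[3];
}

/* Returns 0 on success, -1 if buf does not look like a Myvm binary. */
static int mvm_parse_header(const uint8_t *buf, size_t len, mvm_header *out)
{
    assert(buf != NULL && out != NULL);

    /* Need the full 16-byte header, starting with the magic "MVM". */
    if (len < MVM_HEADER_SIZE || memcmp(buf, "MVM", 3) != 0)
        return -1;

    out->version           = buf[3];
    out->instruction_count = read_u32_be(buf + 4);
    out->code_offset       = read_u32_be(buf + 8);
    /* Bytes 12..15 are the unused field and are skipped. */

    /* ASSUMPTION: the offset is measured from the start of the file. */
    if (out->code_offset < MVM_HEADER_SIZE || out->code_offset > len)
        return -1;

    /* All declared instructions must fit in the remaining bytes. */
    if (out->instruction_count > (len - out->code_offset) / MVM_INSTR_SIZE)
        return -1;

    return 0;
}

/* Fetch the instruction at the given index from the code segment. */
static mvm_instr mvm_fetch(const uint8_t *buf, const mvm_header *h,
                           uint32_t index)
{
    const uint8_t *p;
    mvm_instr in;

    assert(buf != NULL && h != NULL && index < h->instruction_count);

    p = buf + h->code_offset + (size_t)index * MVM_INSTR_SIZE;
    in.opcode = p[0];
    in.source = p[1];
    in.target = p[2];
    return in;
}
```

Under these assumptions, a file made of the 16-byte header (instruction count 1, code segment offset 16) followed by the bytes 00 01 00 parses successfully, and mvm_fetch(buf, &h, 0) yields opcode 0x00, source 0x01 and target 0x00. Because every instruction has the same width, fetching instruction N is a single multiplication, which is the convenience the fixed 3-byte format buys.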