This is a compiler made for the Computer Architecture I CPU project. The intention is for the program to read a given C file, parse it, and create a new assembly code file. This assembly code is custom made, so it probably won’t make sense without the manual. Note that the processor is stack based, so many of the commands are really simple.
This program takes an input of a c file on the command line (doesn’t necessarily need the .c suffix) and will output a new file appending .asm to the input file’s name.
There are some debug print statements that can be enabled at compile-time using a -D DEBUG
tag when using gcc. Note that by default this is turned off. There is also a compile option -D CLEAN
to format the assembly code in a less pretty, if easier to parse, fashion. Note that this is turned on by default.
- Write a C file that follows the known limitations (see below).
- Compile it using your favorite C compiler (mine is gcc) and debug it there. This program won’t check for your mistakes.
- By default, this program will compile with debugging outputs off and clean (and difficult to read). If you want to change this, open the Makefile. On the CFLAGS line, add
-D DEBUG
to the line if you want debug output, and remove-D CLEAN
if you want fancifully formatted output. - In a shell,
cd
into this directory and typemake
. ./JALACompiler <filename>
will run the program and output a new<filename>.asm
file. Enjoy!
- This program is not a syntax checker. This program assumes that the code you have provided can be compiled without errors using gcc or the like. If you pass it an invalid C file, the program may either crash or quietly generate a non-functioning assembly program. Please, compile with gcc before sending the file into this program.
- No ++ or – operators. Sorry, but the weird things you can do with those make it too weird to justify implementing
- Each expression must be on its own line. You are not allowed, for instance, to do something like this:
//ERROR: Only one instruction is allowed per line. int i = 0; i++;
- Each block header (if, function, for loop, etc.) must also be on their own line. Examples:
//ERROR: Can't put an instruction on the same line as a block header. if(i == 0) { i= 2; } //OK if(i == 0) { i = 2; }
- You must use curly braces for each code block. Even if the block only lasts for one line, you have to have the curly braces.
Also, closing curly braces ‘}’ must be either on their own line or before an else statement.
//Valid if ( i ==0){ i = 2; } else { i = 3; } //Invalid: Curly brace on line with instruction if ( i ==0){ i = 2; }
- if-else-if statements don’t work. This may be ammended in the future. Please use nested if statements inside the else statement:
//Valid if(a == 1) { a = 2; } else { if(a == 2) { a = 3; } } //Invalid if(a == 1) { a = 2; } else if(a == 2) { a = 3; }
- No function calls on their own. Since pointers don’t work, nor do print statements, there is really no reason to call a void function nor to ignore the output of an integer function. Examples:
//Invalid func(a, b, c); //Valid int q = func(a, b, c);
- Logical statements (&&, ||, !) might not work. I haven’t chosen whether or not to do them. If I choose any one in particular to do, I’ll probably do ||, since that is the one which would be hardest to do without.
- The only legal variable type is
int
. The processor will only be able to handle these, so there really is no reason to allow any others. Functions can, however, return eitherint
orvoid
. Maybeint*
will be implemented, but don’t hold your breath. - All code structures except switch statements should work eventually, but I’ll try to get them working in order of functions, if’s, while’s, for’s, and finally do-while’s.
- Block comments (
/**/
) won’t work, but single-line comments (//
) will work fine. - Function previews are not allowed. This means that every function must be defined once and only once and only above the earliest time it is called in the code. This means definitions like:
//Not allowed. You need to define the whole function. int add(int, int);
- Preprocessor instructions (a.k.a. # instructions) will not be available. If there is time near the end, I may handle #define, #include “additionalFile.c”, and #ifdef #endif in that order. There will be no including library files, since those are way too complicated.
- Variable names have to be entirely alphanumeric, meaning no underscores or dashes. Just like standard C, they also cannot start with a number. I may expand this eventually, but it’s just easier this way.
- Multiplication, division, and modulus are not valid operands, since we really don’t have a way to deal with them effectively in assembly.