FL is a project that aims to create a simple programming language and its interpreter. The project is written in C# and it uses ANTLR to generate a parser for a language. After the parser checks for syntax errors, then it goes and checks type checking. The language then generates instructions that are then processed through a virtual machine to get you an output. The language can do basic arithmetic operations, conditional statements, and loops.
Detailed assignment there
The project will be composed of the following steps:
- Using ANTLR, implement a parser for the language specified below. If there is at least one syntax error, report this error (or errors) and stop the computations.
- If there are no syntactic errors, perform the type checking. If there are some type of errors, report all these errors and stop the computation.
- If there are no type errors, generate the appropriate target code. It will be a text file composed of stack-based instructions that are defined below.
- Implement an interpreter, that gets a text file with these instructions and evaluates them.
The program consists of a sequence of commands. Commands are written with free formatting. Comments, spaces, tabs, and line breaks serve only as delimiters and do not affect the meaning of the program. Comments are bounded by two slashes and the end of the line. Keywords are reserved. Identifiers and keywords are case sensitive.
There are the following literals:
- integers -
int
- sequence of digits. - floating point numbers -
float
- sequence of digits containing a'.'
character. - booleans -
bool
- values:true
andfalse
. - strings -
string
- text given in quotation marks:"text"
. Escape sequences are optional in our strings.
Variable identifiers are composed of letters and digits, and it must start with a letter. Each variable must be declared before it is used. Repeated declaration of a variable with the same name is an error. Variables must have one of the following types: int
, float
, bool
or string
. After the variables are declared, they have initial values: 0
, 0.0
, ""
respectively false
.
The following statements are defined:
;
- empty command.type variable, variable, ... ;
- declaration of variables, all these variables have the same typetype
. It can be one of:int
,float
,bool
,String
expression ;
- it evaluates given expression, the resulting value of the expression is ignored. Note, there can be some side effects like an assignment to a variable.read variable, variable, ... ;
- it reads values from standard input and then these values are assigned to corresponding variables. Each of these input values is on a separate line and it is verified, that have an appropriate type.write expression, expression, ... ;
- it writes values of expressions to standard output. The"\n"
character is written after the value of the last expression.{ statement statement ... }
- block of statements.if (condition) statement [else statement]
- conditional statement - condition is an expression with a type:bool
. The else part of the statement is optional.while (condition) statement
- a cycle - condition must be abool
expression. This cycle repeats the given statement while the condition holds (it istrue
).
Lists in expression trees are literals or variables. Types of operands must preserve the type of the operator. If necessary, int
values are automatically cast to float
. In other words, the type of 5 + 5.5
is float
, and the number 5
which type int
is automatically converted to float
. There is no conversion from float
to int
!
The following table defines operators in our expressions. Operator Signature is defined using letters: 'I, R, B, S' which corresponds to types: int
, float
, bool
, string
.
Description | Operator | Operator's Signature |
---|---|---|
unary minus | - |
I → I ∨ F → F |
binary arithmetic operators | +, -, *, / |
I × I → I ∨ F × F → F |
modulo | % |
I × I → I |
concatenation of strings | . |
S × S → S |
relational operators | < > |
x × x → B, where x ∈ {I, F} |
comparison | == != |
x × x → B, where x ∈ {I, F, S} |
logic and, or | && || |
B × B → B |
logic not | ! |
B → B |
assignment | = |
x × x → x, where x ∈ {I, F, S, B} |
In the assignment, the left operand is strictly a variable and the right operand is an expression. The type of the variable is the type of the left operand. A side effect is storing the value on the right side into the variable. The automatic conversion cannot change the type of the variable, i.e., it is impossible to store float
value in int
variable.
We can use parentheses in expressions.
All operators (except =
) have left associativity (=
have right associativity), and their priority is (from lowest to highest):
=
||
&&
== !=
< >
+ - .
* / %
!
unary -
All instructions are stack-based. The main memory is a stack and while evaluating the instructions, the input data are taken from the stack and the results are put also in the stack.
Instruction | Description |
---|---|
add |
binary + |
sub |
binary - |
mul |
binary * |
div |
binary / |
mod |
binary % |
uminus |
unary - |
concat |
binary . - a concatenation of strings |
and |
binary && |
or |
binary || |
gt |
binary > |
lt |
binary < |
eq |
binary == - compares two values |
not |
unary ! - negating boolean value |
itof |
Instruction takes int value from the stack, converts it to float, and returns it to stack. |
push T x |
Instruction pushes the value x of type T . Where T represents I - int , F - float , S - string , B - bool . Example: push I 10, push B true, push S "A B C " |
pop |
Instruction takes one value from the stack and discards it. |
load id |
Instruction loads value of variable id on the stack. |
save id |
Instruction takes value from the top of the stack and stores it into the variable with the name id |
label n |
Instruction marks the spot in source code with the unique number n |
jmp n |
Instruction jumps to the label defined by unique number n |
fjmp n |
Instruction takes a boolean value from the stack and if it is false , it will perform a jump to a label with the unique number n |
print n |
Instruction takes n values from the stack and prints them on standard output |
read T |
Instruction reads the value of type T (I - int , F - float , S - string , B - bool ) from standard input and stores it on the stack |
- Lexer and parser - The lexer and parser are made with ANTLR after you define your grammar, which is done in the FL.g4 file. Using it, appropriate classes are generated into the project and then used in Program.cs for checking the syntax errors. It's trying to parse input which you can specify in Program.cs.
- Type checking - Type checking is done in the EvalVisitor class, using the parse tree generated by the parser and checking if the types of operands in the expressions are correct. It also checks for any bad assignment to a variable. SymbolTable is used for storing information about variables and checking if they are already declared or not declared at all.
- Making instructions - Making instructions is done in the EvalCompute.cs class. It takes a parse tree, which was type-checked, and generates stack-based instructions that are executed by an virtual machine.
- Virtual machine - The Virtual machine reads generated instructions and executes them using a stack and dictionary which acts as a memory.
Code written | Instructions | Output in terminal |
---|---|---|
float a; a=1+2*3.3; write a; if(4>5) { write "4>5"; } else { write "4<5"; } int b; while(b<5) { write "b=", b; b=b+1; } |
push F 0.0 save a push I 1 itof push I 2 itof push F 3.3 mul F add I save a load a pop load a print 1 push I 4 push I 5 gt fjmp 8 push S "4>5" print 1 jmp 9 label 8 push S "4<5" print 1 label 9 push I 0 save b label 10 load b push I 5 lt fjmp 11 push S "b=" load b print 2 load b push I 1 add save b load b pop jmp 10 label 11 |
[Info] | Parsing: Input_files/input.txt [Info] | No syntax and type-check errors found. [Info] | Generating instructions... [Info] | Instructions generated. [Info] | Running virtual machine... 7.6 4<5 b= 0 b= 1 b= 2 b= 3 b= 4 [Info] | Virtual machine finished. Exiting... |