A compiler is a program that translates source code written in a high-level programming language into low-level object code (machine code) that the processor can execute.
The compilation process is a sequence of phases. Each phase takes its input from the previous phase, has its own representation of the source program, and feeds its output to the next phase.
This project provides code for lexical analysis using Flex, parsing using Bison, and parse tree generation in GNU C. It also provides a graphical user interface for interacting with the compiler and viewing the output of the intermediate phases of compilation.
Flex is a tool for generating scanners: programs that perform the lexical analysis phase. Flex reads the given input files for a description of the scanner to generate. The description takes the form of pairs of regular expressions and C code, called rules. Flex generates as output a C source file, lex.yy.c, which defines a routine yylex(). Compile this file and link it with the -lfl library to produce an executable. When the executable runs, it analyzes its input for occurrences of the regular expressions. Whenever it finds one, it executes the corresponding C code.
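The rule format described above (a regular expression paired with an action) can be pictured with a small sketch. It is written in Python rather than Flex so that it runs on its own, and it is not the project's scanner: the patterns, the RULES table, and the scan function are assumptions made for illustration. Only the token names (ID, NUM, DNUM, INT, LE, and so on) are borrowed from the grammar in this README.

```python
import re

# Each rule pairs a regular expression with an action, mirroring the
# pattern/action layout of a Flex specification. The patterns are
# assumptions for illustration, not the project's actual rules.
RULES = [
    (r"[ \t\n]+",               None),                    # whitespace: no token
    (r"\d+\.\d+",               lambda s: ("DNUM", s)),
    (r"\d+",                    lambda s: ("NUM", s)),
    (r"\b(?:int|float|void)\b", lambda s: (s.upper(), s)),
    (r"\b(?:if|else|while)\b",  lambda s: (s.upper(), s)),
    (r"[A-Za-z_]\w*",           lambda s: ("ID", s)),
    (r"<=",                     lambda s: ("LE", s)),
    (r"<",                      lambda s: ("LT", s)),
    (r"=",                      lambda s: ("ASGN", s)),
    (r"[-+*/(){};,]",           lambda s: (s, s)),        # literal characters
]
COMPILED = [(re.compile(pattern), action) for pattern, action in RULES]


def scan(text):
    """Return the list of (token, lexeme) pairs found in text."""
    tokens, pos = [], 0
    while pos < len(text):
        # Prefer the longest match; on a tie the earliest rule wins,
        # which is how keywords take priority over identifiers here.
        best, best_action = None, None
        for regex, action in COMPILED:
            match = regex.match(text, pos)
            if match and (best is None or match.end() > best.end()):
                best, best_action = match, action
        if best is None:
            raise SyntaxError(f"unexpected character {text[pos]!r} at {pos}")
        if best_action is not None:
            tokens.append(best_action(best.group()))
        pos = best.end()
    return tokens


if __name__ == "__main__":
    print(scan("int x; x = 42;"))
```

Listing the keyword rules before the identifier rule matters: both match "int" with the same length, and the tie is resolved in favour of the rule that appears first.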
In our project we used Flex to generate the tokens consumed by Bison. Performing lexical analysis and generating tokens with Flex is much simpler than writing a scanner by hand.
Bison is a general-purpose parser generator that converts a grammar description (a Bison grammar file) for an LALR(1) context-free grammar into a C program that parses that grammar, i.e. performs syntax analysis. The Bison parser is a bottom-up parser: it tries, by shifts and reductions, to reduce the entire input to a single grouping whose symbol is the grammar's start symbol.
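To make "shifts and reductions" concrete, here is a minimal Python sketch of a shift-reduce recognizer for a four-rule subset of the EXP grammar. It is deliberately simplified: it reduces greedily whenever the top of the stack matches a right-hand side, whereas Bison drives the same two operations from generated LALR(1) tables with lookahead and precedence. The names GRAMMAR, reduce_all, and shift_reduce are hypothetical.

```python
# Productions as (left-hand side, right-hand side) pairs: a small subset
# of the EXP rules, chosen so that greedy reduction is sufficient.
GRAMMAR = [
    ("EXP", ("EXP", "+", "EXP")),
    ("EXP", ("(", "EXP", ")")),
    ("EXP", ("NUM",)),
    ("EXP", ("ID",)),
]
START = "EXP"


def reduce_all(stack, trace):
    """Apply reductions until no right-hand side matches the stack top."""
    reduced = True
    while reduced:
        reduced = False
        for lhs, rhs in GRAMMAR:
            if tuple(stack[-len(rhs):]) == rhs:
                del stack[-len(rhs):]      # pop the matched symbols
                stack.append(lhs)          # push the grouping's symbol
                trace.append(f"reduce {lhs} -> {' '.join(rhs)}")
                reduced = True
                break


def shift_reduce(tokens):
    """Return (accepted, trace) for a sequence of token names."""
    stack, trace = [], []
    for token in tokens:
        stack.append(token)                # shift
        trace.append(f"shift {token}")
        reduce_all(stack, trace)
    # Accept only if the whole input collapsed into the start symbol.
    return stack == [START], trace


if __name__ == "__main__":
    accepted, steps = shift_reduce(["NUM", "+", "ID"])
    print(accepted)
    print("\n".join(steps))
```

Because this sketch never looks ahead, it would group `NUM + NUM * NUM` incorrectly if a multiplication rule were added; resolving such cases is what Bison's precedence declarations and lookahead are for.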
Bison can also be used for semantic analysis by building an abstract syntax tree (AST) and evaluating its nodes' values in C/C++ action code.
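The idea of evaluating node values over an AST can be sketched as follows. This is an illustrative Python model, not the project's C code: the Num and BinOp classes, the OPS table, and the evaluate function are assumptions, and only a few of the operators from the EXP rules are covered.

```python
import operator
from dataclasses import dataclass
from typing import Union


@dataclass
class Num:
    """Leaf node holding a numeric literal (NUM or DNUM in the grammar)."""
    value: float


@dataclass
class BinOp:
    """Interior node for a binary expression such as EXP '+' EXP."""
    op: str
    left: "Node"
    right: "Node"


Node = Union[Num, BinOp]

# Operator table; comparisons yield 1 or 0, following the C convention.
OPS = {
    "+": operator.add,
    "-": operator.sub,
    "*": operator.mul,
    "/": operator.truediv,
    "<": lambda a, b: int(a < b),
}


def evaluate(node):
    """Compute a node's value bottom-up from its children's values."""
    if isinstance(node, Num):
        return node.value
    return OPS[node.op](evaluate(node.left), evaluate(node.right))


if __name__ == "__main__":
    # AST for 1 + 2 * 3
    tree = BinOp("+", Num(1), BinOp("*", Num(2), Num(3)))
    print(evaluate(tree))
```

In a Bison grammar the equivalent work would typically happen in the action attached to each rule, with each action building or evaluating the node for that rule.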
In our project we used Bison to:
- Define our grammar
- Parse input programs according to the grammar
- Print the parse tree generated for the input
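As a rough picture of the last step, the sketch below builds a parse tree by hand for the assignment `x = 1;` using the STMT_ASSGN rule from this README's grammar and renders it with one level of indentation per tree level. The ParseNode class, the render function, and the indented layout are assumptions; the project's actual output format is not described here.

```python
class ParseNode:
    """A parse tree node: a grammar symbol plus its child nodes."""

    def __init__(self, symbol, children=()):
        self.symbol = symbol
        self.children = list(children)


def render(node, depth=0):
    """Return the tree as a list of lines, indented two spaces per level."""
    lines = ["  " * depth + node.symbol]
    for child in node.children:
        lines.extend(render(child, depth + 1))
    return lines


if __name__ == "__main__":
    # Hand-built tree for: STMT_ASSGN : ID ASGN EXP ';' with EXP : NUM
    tree = ParseNode("STMT_ASSGN", [
        ParseNode("ID"),
        ParseNode("ASGN"),
        ParseNode("EXP", [ParseNode("NUM")]),
        ParseNode(";"),
    ])
    print("\n".join(render(tree)))
```

A parser would create one such node per reduction, attaching the nodes for the right-hand side symbols as children.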
Our grammar is as follows:
pgmstart : TYPE ID '(' ')' STMTS
;
STMTS : '{' STMT1 '}'
;
STMT1 : STMT STMT1
| /* empty */
;
STMT : STMT_DECLARE //all types of statements
| STMT_ASSGN
| STMT_IF
| STMT_WHILE
| STMT_SWITCH
| ';'
;
STMT_DECLARE : TYPE ID IDS // sets the type for this declaration
;
STMT_ASSGN : ID ASGN EXP ';'
;
STMT_IF : IF EXP STMT %prec IF
| IF EXP STMT ELSE_STMT
| IF EXP STMTS ELSE_STMT
;
ELSE_STMT : ELSE STMT
| ELSE STMTS
| /* empty */
;
STMT_WHILE : WHILE EXP WHILEBODY
;
WHILEBODY : STMTS
| STMT
;
STMT_SWITCH : SWITCH EXP '{' SWITCHBODY '}'
;
SWITCHBODY : CASES
| CASES DEFAULTSTMT
;
CASES : CASE NUM ':' SWITCHEXP BREAKSTMT
| /* empty */
;
BREAKSTMT : BREAK ';' CASES
| CASES
;
DEFAULTSTMT : DEFAULT ':' SWITCHEXP DE
;
DE : BREAK ';'
| /* empty */
;
SWITCHEXP : STMTS
| STMT
;
IDS : ';'
| ',' ID IDS
;
TYPE : INT
| FLOAT
| VOID
;
EXP : EXP LT EXP
| EXP LE EXP
| EXP GT EXP
| EXP GE EXP
| EXP NE EXP
| EXP EQ EXP
| EXP '+' EXP
| EXP '-' EXP
| EXP '*' EXP
| EXP '/' EXP
| EXP LOR EXP
| EXP LAND EXP
| EXP BOR EXP
| EXP BXOR EXP
| EXP BAND EXP
| EXP INC
| EXP DEC
| '(' EXP ')'
| ID
| NUM
| DNUM
;
To run the compiler, navigate to the directory containing cc_project_gui.py and run the script with python3.
Note: Make sure that the prog and parser output files are in the same directory as the Python script.
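The note above can be checked before launching the GUI with a small helper like the one below. This is a sketch, not part of the project: missing_files is a hypothetical function, and the names "prog" and "parser" are taken from the note as written, so the exact file names in your checkout are an assumption you may need to adjust.

```python
from pathlib import Path


def missing_files(script_dir, required):
    """Return the names in required that are not files inside script_dir."""
    base = Path(script_dir)
    return [name for name in required if not (base / name).is_file()]


if __name__ == "__main__":
    # File names are assumptions based on the note; adjust as needed.
    missing = missing_files(".", ["prog", "parser"])
    if missing:
        print("Missing next to the script:", ", ".join(missing))
    else:
        print("All required files found.")
```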