Welcome to the repository containing course materials and code snippets developed for teaching the Compiler course at Iran University of Science and Technology (IUST). This repository includes various language grammars written in ANTLR v4 format. For each grammar, the source code of the generated Lexer and Parser is available in Python 3.8.x.
Please note that this repository is intended to be updated regularly. If you use the repository, forking it would be appreciated.
For any queries or concerns, feel free to reach out to me at m-zakeri[at]live.com. Alternatively, you can refer to the documentation for more details.
This section provides examples of the outputs that can be generated by the code snippets in this repository.
Figure 1 illustrates how a single-pass compiler can generate three-address code for assignment statements with the minimum number of temporary variables, whose names start with T:
Input stream:
a1 := (2+3*3) * 6 // Numerical input
a2 := a1 + b * c / (p-q-r) // Symbolic input
Compiler results:
9 = 3 * 3
11 = 2 + 9
66 = 11 * 6
Assignment value: a1 = 66
Assignment type: int
-----------
T1 = b * c // Machine-generated three-address code
T2 = p - q
T2 = T2 - r
T1 = T1 / T2
T1 = a1 + T1
Assignment value: a1 = 66
Assignment type: str
-----------
Parsing and code generation was done!
Fig 1: Examples of three-address code generated by ANTLR for the AssignmentStatement grammar.
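The temporary reuse visible in the symbolic example can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the repository's ANTLR-based implementation: the names `tokenize`, `ThreeAddressGenerator`, and `generate` are hypothetical, the parser is a hand-written recursive-descent one, and it does no constant folding, type checking, or error handling. The reuse rule is a simple greedy one: a result is written back into an operand that is already a temporary, and a new temporary is allocated only when neither operand is one.

```python
import re


def tokenize(text):
    # Identifiers, integer literals, and the operators/parentheses used above.
    return re.findall(r"[A-Za-z_]\w*|\d+|[-+*/()]", text)


class ThreeAddressGenerator:
    """Hypothetical single-pass generator; operands are (name, is_temp) pairs."""

    def __init__(self, tokens):
        self.tokens = tokens
        self.pos = 0
        self.code = []
        self.live_temps = 0  # temporaries T1..T<live_temps> are currently in use

    def peek(self):
        return self.tokens[self.pos] if self.pos < len(self.tokens) else None

    def advance(self):
        token = self.tokens[self.pos]
        self.pos += 1
        return token

    def emit(self, op, left, right):
        (l_name, l_temp), (r_name, r_temp) = left, right
        if l_temp and r_temp:
            # Both operands are temporaries: keep the left one, free the right one.
            target = l_name
            self.live_temps -= 1
        elif l_temp:
            target = l_name
        elif r_temp:
            target = r_name
        else:
            # Neither operand is a temporary, so a new one is needed.
            self.live_temps += 1
            target = f"T{self.live_temps}"
        self.code.append(f"{target} = {l_name} {op} {r_name}")
        return (target, True)

    def expr(self):
        left = self.term()
        while self.peek() in ("+", "-"):
            op = self.advance()
            left = self.emit(op, left, self.term())
        return left

    def term(self):
        left = self.factor()
        while self.peek() in ("*", "/"):
            op = self.advance()
            left = self.emit(op, left, self.factor())
        return left

    def factor(self):
        token = self.advance()
        if token == "(":
            inner = self.expr()
            self.advance()  # consume ")"
            return inner
        return (token, False)


def generate(expression):
    gen = ThreeAddressGenerator(tokenize(expression))
    gen.expr()
    return gen.code


for line in generate("a1 + b * c / (p-q-r)"):
    print(line)
```

For the right-hand side of the symbolic input above, this sketch prints the same five instructions shown in Figure 1 and needs only two temporaries. Because temporaries are allocated and released in stack order, a single counter is enough to track which ones are free.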
Figure 2 demonstrates how a single-pass compiler can generate an abstract syntax tree (AST) for assignment statements:
Fig 2: Examples of abstract syntax trees (AST) generated by ANTLR for AssignmentStatement grammar.
The above tree corresponds to the following expressions:
a1 := (2 + 12 * 3) / (6 - 19)
a2 := 2 + 3 * 4
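To make the shape of such trees concrete, the following Python sketch builds a nested-tuple AST for one assignment statement. It is only an illustration: the function name `parse_assignment` and the tuple layout `(operator, left, right)` are assumptions made here, and the repository builds its trees through ANTLR grammar actions rather than a hand-written parser. There is no error handling for malformed input.

```python
import re


def parse_assignment(text):
    """Return a nested-tuple AST for one 'name := expression' statement."""
    tokens = re.findall(r":=|[A-Za-z_]\w*|\d+|[-+*/()]", text)
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def advance():
        nonlocal pos
        pos += 1
        return tokens[pos - 1]

    def expr():
        # Lowest precedence: + and -, left-associative.
        node = term()
        while peek() in ("+", "-"):
            op = advance()
            node = (op, node, term())
        return node

    def term():
        # Higher precedence: * and /, left-associative.
        node = factor()
        while peek() in ("*", "/"):
            op = advance()
            node = (op, node, factor())
        return node

    def factor():
        token = advance()
        if token == "(":
            node = expr()
            advance()  # consume ")"
            return node
        return token  # identifier or integer literal, kept as a string

    target = advance()
    advance()  # consume ":="
    return (":=", target, expr())


print(parse_assignment("a2 := 2 + 3 * 4"))
print(parse_assignment("a1 := (2 + 12 * 3) / (6 - 19)"))
```

In the first tree the multiplication node sits below the addition node, which is how operator precedence shows up in an AST. In the second, the parentheses leave no node of their own; they only decide which subtrees become the operands of the division.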
This section describes the structure of the repository:
The grammars directory contains various grammar files:
gram1: ANTLR hello world grammar.
Expr1: Simple grammar for handling mathematical expressions without any attributes or actions.
Expr2: Simple attributed grammar for handling mathematical expressions with a code() attribute.
Expr3: Same as Expr2 grammar.
AssignmentStatement1.g4: Grammar to handle multiple assignment statements and mathematical expressions in languages like Pascal and C/C++.
AssignmentStatement2.g4: Same as AssignmentStatement1.g4 grammar, with attributes for holding rule code and rule type.
AssignmentStatement3.g4: Grammar to handle multiple assignment statements and mathematical expressions in languages like Pascal and C/C++. It provides semantic rules to perform type checking and semantic routines to generate intermediate representation.
AssignmentStatement4.g4: Similar to AssignmentStatement3.g4 grammar but designed to generate intermediate representation (three-address code) with the minimum number of "temp" variables.
CPP14_v2: ANTLR grammar for C++14 forked from the official ANTLR website. Some bugs have been fixed, and rule identifiers have been added to the grammar rules.
EMail.g4: Lexical grammar to validate email addresses.
EMail2.g4: Lexical grammar to validate email addresses, fixing bugs in EMail.g4.
The language_apps package contains Lexer and Parser codes for each grammar in the grammars directory, along with a main driver script to demonstrate the type checking and intermediate code generation based on semantic rules and semantic routines.
The terminal_batch_script directory contains several batch scripts that run ANTLR in the Windows terminal to generate target code in Java. These code snippets belong to my early experiences with ANTLR.
The Lectures section of this repository offers comprehensive resources for learning Compiler Design. Each lecture is designed to simplify complex concepts and explain them in an intuitive manner through practical examples. The aim is to make the subject matter accessible and engaging for students, regardless of their prior knowledge or experience with compiler design.
ANTLR slides:
For Reading Compiler Design Lectures: See HERE