CSE_310_Build_Subset_C_Compiler

This repository contains a step-by-step implementation of a compiler for a subset of the C language. It was implemented as part of the CSE 310 Compiler Sessional course.

Primary language: C++

Build_Subset_C_Compiler

Step 01_Symbol Table Manager

A symbol table is a data structure that a compiler maintains to store information about the entities it encounters, such as identifiers, objects, and function names. The information kept for each entity may include its type, value, and scope. As the first phase of building the compiler, we construct a symbol table that maintains a list of hash tables, where each hash table holds the symbols encountered in one scope of the source program.
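
The "list of hash tables, one per scope" idea can be sketched as follows. This is a minimal illustration and not the repository's actual code: the names `SymbolInfo`, `SymbolTable`, `enterScope`, `exitScope`, `insert`, and `lookup` are assumptions made for this example, and `std::unordered_map` stands in for whatever hash table the real implementation uses.

```cpp
#include <cassert>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical record for one symbol; the fields are illustrative only.
struct SymbolInfo {
    std::string name;
    std::string type;
};

// A stack of hash tables: the last element is the innermost (current) scope.
class SymbolTable {
public:
    SymbolTable() { enterScope(); }  // the first table is the global scope

    void enterScope() { scopes_.emplace_back(); }

    void exitScope() {
        assert(scopes_.size() > 1 && "cannot exit the global scope");
        scopes_.pop_back();
    }

    // Inserts into the current scope only.
    // Returns false if the name is already declared in that scope.
    bool insert(const std::string& name, const std::string& type) {
        return scopes_.back().emplace(name, SymbolInfo{name, type}).second;
    }

    // Searches from the innermost scope outward.
    // Returns nullptr if the name is not visible; the pointer stays valid
    // only until the scope that owns the symbol is exited.
    const SymbolInfo* lookup(const std::string& name) const {
        for (auto scope = scopes_.rbegin(); scope != scopes_.rend(); ++scope) {
            auto found = scope->find(name);
            if (found != scope->end()) return &found->second;
        }
        return nullptr;
    }

private:
    std::vector<std::unordered_map<std::string, SymbolInfo>> scopes_;
};
```

With this sketch, declaring `x` as `int` globally and again as `float` inside a nested scope makes `lookup("x")` return the `float` entry until `exitScope()` is called, after which the global `int` entry is visible again.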


Step 02_Lexical Analyzer

Lexical analysis is the process of scanning the source program as a sequence of characters and converting it into a sequence of tokens. A program that performs this task is called a lexical analyzer, a lexer, or a scanner. For example, if a portion of the source program contains int x=5; the scanner converts it into a token sequence like <INT><ID,x><ASSIGNOP,=><CONST_NUM,5><SEMICOLON>. We perform this task with flex (Fast Lexical Analyzer Generator), a popular tool for generating scanners.
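
To make the character-to-token conversion concrete, here is a small hand-written scanner in C++. It is only a stand-in for illustration: the project generates its scanner with flex instead of writing one by hand, the function name `tokenize` and the `<UNRECOGNIZED,...>` fallback are assumptions of this sketch, and the numeric constant token is spelled `CONST_NUM` here. It handles just enough to cover the `int x=5;` example.

```cpp
#include <cctype>
#include <cstddef>
#include <string>

// Converts a source fragment into a printable token sequence.
// Recognizes only: the keyword `int`, identifiers, unsigned integers, `=` and `;`.
std::string tokenize(const std::string& src) {
    std::string out;
    std::size_t i = 0;
    auto at = [&](std::size_t k) { return static_cast<unsigned char>(src[k]); };

    while (i < src.size()) {
        if (std::isspace(at(i))) {  // whitespace separates tokens
            ++i;
        } else if (std::isalpha(at(i)) || src[i] == '_') {
            // Longest run of letters, digits, and underscores.
            std::size_t start = i;
            while (i < src.size() && (std::isalnum(at(i)) || src[i] == '_')) ++i;
            std::string word = src.substr(start, i - start);
            if (word == "int") out += "<INT>";       // keyword
            else out += "<ID," + word + ">";         // identifier
        } else if (std::isdigit(at(i))) {
            // Longest run of digits.
            std::size_t start = i;
            while (i < src.size() && std::isdigit(at(i))) ++i;
            out += "<CONST_NUM," + src.substr(start, i - start) + ">";
        } else if (src[i] == '=') {
            out += "<ASSIGNOP,=>";
            ++i;
        } else if (src[i] == ';') {
            out += "<SEMICOLON>";
            ++i;
        } else {
            // Hypothetical fallback for characters this sketch does not know.
            out += "<UNRECOGNIZED," + std::string(1, src[i]) + ">";
            ++i;
        }
    }
    return out;
}
```

In a flex specification, each branch above would correspond to a pattern and action pair, and flex would generate the matching loop.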


Step 03_Syntax & Semantic Analyzer

In this step, we construct the last part of the front end of a compiler for a subset of the C language. That means we perform syntax analysis and semantic analysis using a grammar that includes function definitions. To do so, we build a parser with the help of Lex (Flex) and Yacc (Bison).
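
As a rough illustration of what semantic analysis adds on top of parsing, the sketch below checks two common rules: a name must not be declared twice, and a name must be declared before it is used. Everything here is an assumption made for the example: the `Event` type is a simplified stand-in for what the parser's grammar actions would report, the function name `checkSemantics` and the error wording are invented, and only a single scope is modeled. In the real project these checks would run inside the Bison grammar actions.

```cpp
#include <string>
#include <unordered_set>
#include <vector>

// Hypothetical, simplified parser event: a declaration or a use of a name.
struct Event {
    enum Kind { Declare, Use } kind;
    std::string name;
    int line;
};

// Returns one message per semantic error, in source order.
// Models a single scope only; nested scopes are out of this sketch's reach.
std::vector<std::string> checkSemantics(const std::vector<Event>& events) {
    std::unordered_set<std::string> declared;
    std::vector<std::string> errors;

    for (const Event& e : events) {
        const std::string where = "Line " + std::to_string(e.line) + ": ";
        if (e.kind == Event::Declare) {
            // insert() reports false in .second when the name already exists.
            if (!declared.insert(e.name).second) {
                errors.push_back(where + "multiple declaration of " + e.name);
            }
        } else if (declared.count(e.name) == 0) {
            errors.push_back(where + "undeclared variable " + e.name);
        }
    }
    return errors;
}
```

A syntactically valid program can still fail these checks, which is why semantic analysis runs alongside parsing and not as part of the grammar itself.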


Step 04_Intermediate Code Generation

In this step, we generate intermediate code for a source program that contains no errors. That is, if the syntax and semantic analysis of the previous step reports no errors for the source code, we generate intermediate code for it. We have picked 8086 assembly language as our intermediate representation.
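
The sketch below shows the general shape of emitting 8086 assembly for two simple statements, `x = 5;` and `a = b + c;`. It is an illustration under stated assumptions, not the repository's generator: the helper names `emitAssignConst` and `emitAdd` are invented, every variable is assumed to be a word-sized memory operand declared elsewhere in the data segment, and `AX` is used as the only scratch register.

```cpp
#include <string>

// Emits code for `dest = value;` where dest is a word-sized variable.
std::string emitAssignConst(const std::string& dest, int value) {
    std::string code;
    code += "MOV AX, " + std::to_string(value) + "\n";  // load the constant
    code += "MOV " + dest + ", AX\n";                   // store it in dest
    return code;
}

// Emits code for `dest = lhs + rhs;` where all three are word-sized variables.
std::string emitAdd(const std::string& dest,
                    const std::string& lhs,
                    const std::string& rhs) {
    std::string code;
    code += "MOV AX, " + lhs + "\n";   // AX = lhs
    code += "ADD AX, " + rhs + "\n";   // AX = lhs + rhs
    code += "MOV " + dest + ", AX\n";  // dest = AX
    return code;
}
```

Routing every result through `AX` keeps the sketch short. A fuller generator would typically introduce temporary variables for nested expressions so that intermediate results are not overwritten.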