Lexical analysis is the first phase of a compiler. Its main task is to read the input characters and produce as output a sequence of tokens that the parser uses for syntax analysis. Lexical analyzers are often specified in a pattern-action language such as Lex.
In this project, the lexical analyzer reads its input from the input.txt file and writes all the tokens found in the input file to the output file.
Tokens are sequences of characters that can be treated as a unit, i.e. a single logical entity.
Patterns are the sets of rules describing how tokens are formed from the input characters.
Lexemes are sequences of characters in the source program matched by the pattern for a token.
For example, in the Pascal statement `const pi = 3.1416;`, the substring `pi` is a lexeme for the token "identifier".
In addition to recognizing tokens, patterns, and lexemes, a lexical analyzer typically performs these tasks:
- Generating a sequence of tokens.
- Stripping out comments and whitespace.
- Making a copy of the source program with error messages marked in it.
The assumptions I have made while writing the lexical analyzer code in C++ are:
- Keywords:
int
cin
cout
- Special Symbols:
;
,
{
}
(
)
- Operators:
+
=
>>
<<
- Identifiers:
A single letter, or a letter followed by a sequence of letters or digits,
e.g. sum, A, B, C
- Pre-processor Directives:
include
- Library:
iostream