
The project is an implementation of a generic parser that takes a lexicon file, a grammar file and an input file and produces a concise abstract syntax tree. The desired structure of the tree is defined by the grammar file.

Primary language: Python. License: MIT.

To start using the project, download the zip archive: https://github.com/iensen/genparser/archive/master.zip


  • docs/main/ -- documentation
  • src/astgen/ -- implementation consisting of several Python modules
  • .gitignore -- the files ignored by this git repository
  • README.md -- this readme file


  • Python version 3.4 or higher


Unzip the archive and go to the folder src/astgen/


The command line syntax is:

python main.py path_to_lexicon_file path_to_grammar_file path_to_input_file [-s] [-b]

The lexicon file should contain declarations of lexeme types, one per line, as defined in section 2.1 of https://github.com/iensen/genparser/blob/master/docs/main/astgen.pdf?raw=true

The grammar file should contain grammar rules, one per line, as defined in section 2.2 of https://github.com/iensen/genparser/blob/master/docs/main/astgen.pdf?raw=true

The input file is an ASCII file as defined in section 3.3 of https://github.com/iensen/genparser/blob/master/docs/main/astgen.pdf?raw=true

The optional argument -s tells the parser not to skip spaces (by default, all lexemes of type 'spaces' are removed from the sequence before parsing).

The optional argument -b tells the lexer to add the built-in lexemes 'num', 'id' and 'spaces' to the lexicon file (by default, they are not added).
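
The default space-skipping behaviour that -s switches off can be pictured with a small sketch. This is not the project's code: the function name drop_spaces, its keep_spaces parameter and the 'plus' lexeme type are hypothetical, and lexemes are assumed to be (type, text) tuples, the shape that appears in the parser's printed output.

```python
# Illustrative sketch only, not taken from src/astgen/.
# Assumption: a lexeme is a (type, text) tuple such as ('num', '1').

def drop_spaces(lexemes, keep_spaces=False):
    """Return the lexeme sequence that would be handed to the parser.

    keep_spaces=True mirrors passing -s; the default mirrors omitting it.
    """
    if keep_spaces:
        return list(lexemes)
    # Default: every lexeme whose type is 'spaces' is removed before parsing.
    return [lex for lex in lexemes if lex[0] != 'spaces']


# 'plus' is a made-up lexeme type used only for this example.
tokens = [('num', '1'), ('spaces', ' '), ('plus', '+'),
          ('spaces', ' '), ('num', '2')]

print(drop_spaces(tokens))                    # without -s
print(drop_spaces(tokens, keep_spaces=True))  # with -s
```

With -s, a grammar has to account for 'spaces' lexemes itself, since they stay in the sequence.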


Examples can be found in the src/astgen/tests folder of the distribution.

The execution traces for two of them are given below:

:~/src/astgen$ python3 main.py tests/arith_expr/lexicon tests/arith_expr/grammar tests/arith_expr/input -b
['add', ('num', '1'), ['mult', ('num', '2'), ('num', '3')]]
:~/src/astgen$ python3 main.py tests/chess/lexicon tests/chess/grammar tests/chess/input
  ('move_id', '1.'),
   ('cell', 'e4')
   ('cell', 'e5')
  ('move_id', '2.'),
    ('figure', 'Q')
   ('cell', 'h5')
    ('figure', 'N')
   ('cell', 'c6')
  ('move_id', '3.'),
    ('figure', 'B')
   ('cell', 'c4')
    ('figure', 'N')
   ('cell', 'f6')
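
The tree printed for the arithmetic example is made of nested lists (a label followed by its children) and (type, text) tuples for lexemes. As a rough sketch of how such output might be consumed, the snippet below reads that printed line back into Python and evaluates it. The evaluate function is hypothetical, and reading 'add' and 'mult' as addition and multiplication is an assumption based on the label names, not something the project documents here.

```python
import ast


def evaluate(node):
    """Evaluate a tree shaped like the arithmetic example's output.

    Assumptions: a tuple is a (type, text) lexeme, a list is
    [label, child, child, ...], and the only labels are 'add' and 'mult'.
    """
    if isinstance(node, tuple):
        kind, text = node
        if kind == 'num':
            return int(text)
        raise ValueError('unexpected lexeme type: ' + kind)

    label, *children = node
    values = [evaluate(child) for child in children]
    if label == 'add':
        return sum(values)
    if label == 'mult':
        product = 1
        for value in values:
            product *= value
        return product
    raise ValueError('unexpected node label: ' + label)


# The line printed by the arithmetic example run, copied as a string.
printed = "['add', ('num', '1'), ['mult', ('num', '2'), ('num', '3')]]"

# literal_eval parses Python literals (lists, tuples, strings) safely.
tree = ast.literal_eval(printed)
print(evaluate(tree))  # 7, i.e. 1 + 2 * 3
```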