/CSCI425-Parser

Basic LL(1) and LR(0) parser implementations in Python, as well as a partial RegEx compiler: Project for CSCI425 (Compiler Design)

Primary Language: Python

Simple Parser

ZOBOS (Compilers Project 4)

Big Ideas

  1. Parse the token stream into a CST and check for syntax errors. The token stream is provided in .tok files
  2. During the parse, massage the tree into an AST
  3. Check for semantic errors in the AST

WORK TODOS

  • Parse a token stream with LR - Liam
    • Use LR knitting
    • Use SLR table
    • Use zlang.cfg
    • Check for syntax errors
    • print out
  • Develop SDTs to make an AST during the above parse - Konch and Andrew
    • EXPR will be simplified to
      • leaves are literals or variables
        • BEXPR
        • AEXPR
          • PRODUCT
          • SUM
            • FUNTYPE
              • Made test
            • GLOBTYPE
              • Made test
            • VALUE
              • Made test for value literals
              • Made test for lparen EXPR rparen
      • root and internal nodes are non-terminals:
        • BOOLS
        • PLUS
        • TIMES
        • UNARY
          • Made test for UNARY
        • CAST
          • Made test for CAST
        • FUNCALL
    • Control Structures will be simplified
      • IF
      • IFELSE
      • WHILE
      • STMTS
      • BRACESTMTS
    • Simple Tree representation of AST to disk
    • Handle higher-order collapsing
  • Semantic checks on the AST for variables, expressions, and functions - Chris
    • Emit will write symbol tables to disk
    • Warnings and errors will be emitted
  • ZOBOS will exit(0) if no errors, exit(1) on error
  • Symtable
    • Location
    • Identifier
    • Type
    • const Flag
    • Used or unused flag
    • Initialized or Uninitialized Flag
    • Function Semantics
      • When a function prototype is encountered, the symbol has its const flag set to false but is already marked initialized
      • When a function definition is encountered, the symbol has its const and initialized flags set to true
    • Emit
      • Global scope is scope 0
      • On Emit, print the following to the third command line argument, comma separated
        • On one line
        • Scope
        • Type
          • if it's a function, the type is followed by //, e.g. int//
        • id
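
The Emit format above can be sketched in Python as follows. This is a minimal illustration, not the project's actual code: the `Symbol` class, its field names, and the `format_symbol` and `emit` helpers are all assumptions, and the exact spacing around the commas is a guess based on "comma separated".

```python
from dataclasses import dataclass


@dataclass
class Symbol:
    """One symbol-table entry (hypothetical field names)."""
    scope: int                 # 0 is the global scope
    type: str                  # e.g. "int"
    ident: str                 # identifier name
    is_function: bool = False
    const: bool = False
    used: bool = False
    initialized: bool = False


def format_symbol(sym):
    """Render one entry on one line as scope,type,id."""
    # A function's type is followed by "//", so int becomes int//.
    type_field = sym.type + "//" if sym.is_function else sym.type
    return f"{sym.scope},{type_field},{sym.ident}"


def emit(symbols, out):
    """Write every symbol to an open text file, one entry per line."""
    for sym in symbols:
        out.write(format_symbol(sym) + "\n")
```

In this sketch `out` would be a file opened from the third command line argument, for example `open(sys.argv[3], "w")`.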

WRECK (Compilers Project 3)

ABOUT

This is a collaborative project for CSCI425: Compiler Design. TODO

USAGE

python wreck.py <grammar config> [token stream] [parse tree output file]

Grammar config defines a language in plain text (file extension: .cfg) EX:

S -> A C $
C -> c
   | lambda
A -> a B C d
   | B Q
   | lambda
B -> b B | d
Q -> q

All grammar terminals are assumed to be lowercase.

  • -> separates a rule's left-hand side from its productions
  • | is reserved for rule alternation
  • lambda specifies the empty string
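
As a rough illustration of the notation above, a .cfg file could be read into a dictionary like this. The sketch assumes symbols are separated by whitespace and that a continuation line starts with `|`; the function name `parse_grammar` is hypothetical and is not taken from wreck.py.

```python
from collections import defaultdict


def parse_grammar(text):
    """Map each nonterminal to a list of right-hand sides.

    Each right-hand side is a list of symbols; lambda becomes [].
    """
    rules = defaultdict(list)
    current = None  # nonterminal that owns the line being read
    for line in text.splitlines():
        tokens = line.split()
        if not tokens:
            continue  # skip blank lines
        if len(tokens) >= 2 and tokens[1] == "->":
            current = tokens[0]
            tokens = tokens[2:]
        elif tokens[0] == "|" and current is not None:
            tokens = tokens[1:]  # alternation continued from the previous line
        else:
            raise ValueError(f"cannot parse grammar line: {line!r}")
        rhs = []
        for tok in tokens:
            if tok == "|":
                rules[current].append(rhs)  # close the current alternative
                rhs = []
            elif tok != "lambda":
                rhs.append(tok)  # lambda contributes no symbols
        rules[current].append(rhs)
    return dict(rules)
```

Applied to the example grammar, `parse_grammar` maps `C` to `[["c"], []]` and `B` to `[["b", "B"], ["d"]]`.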

Language config files define a stream of language tokens (an intermediate format that avoids the need for a lexer framework; extension: .tok)

Tokens are separated by newlines; each line contains either TOKEN or TOKEN TOKENVALUE
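
A reader for this format might look like the sketch below. It assumes the token type and its value are separated by whitespace and that everything after the first gap is the value; `read_tokens` is a hypothetical name, not a function from this repository.

```python
def read_tokens(lines):
    """Yield (token, value) pairs; value is None for a bare TOKEN line."""
    for line in lines:
        line = line.strip()
        if not line:
            continue  # ignore blank lines
        # Split on the first run of whitespace only, so a value may contain spaces.
        parts = line.split(None, 1)
        token = parts[0]
        value = parts[1] if len(parts) > 1 else None
        yield token, value
```

Because it accepts any iterable of lines, the same function works on an open .tok file or on a list of strings in a test.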

CLI example

python3 parser.py config/language-slides/language.cfg config/language-slides/src.tok