/parsing

Parsing exploration.

Primary LanguageHaskell

2016, January

I've read a lot more about grammars and parsing since I stopped writing code for this project. Reading is all I'm allowing myself time for, but whatever. I'm updating this readme with a few more names of interesting techniques as a reminder to myself. Consider them keywords for your internet searches, though I'll also provide a few links of things I've read:

2014, August

parsing

Ok, so I took a Compiler Construction course, learnt how LL and LR parsing work, heard about other things like SGLR, PEG, Attribute grammars then read some more and also found RNGLR and GLL. I figured I'd take up implementing SGLR in Haskell as a little summer project (I'm aware this has been done before, but I just want the learning experience). I downloaded some papers on LR/SGLR parsing. Though old, one paper mentioned that generating a full parser instead of tables for parsing engines resulting in faster parsing times. I got distracted by that, thinking that optimisations put into generated parsers should be transferable to parsing engines and table generators. Unless the problem was cache-behaviour (CPU caches were still stuck in my head from a video I watched recently).

So now I don't know where I'm going with this repo. I want to try to make a cache-friendly LR parsing engine. But I don't know if that new or not. I also still want to study the details of SGLR a little better. The SRNGLR paper was also interesting. To actually find out if what I have in mind works --- to make an LR parse engine more cache-friendly --- will probably require that I implement it in some low-level language like C. But meh..

So I'm not sure where I'm going with this code. For now it's an LR(1) parsing engine and parse table generator. Don't have tests yet. The table generator is missing a piece. But the parsing engine seems to work, and I was able to remove explicit state numbers from the stack.

example files

There are two example files here:

  1. parseExample.hs takes a parse table example from the book "Modern compiler implementation in Java" by Andrew W. Appel. In particular it's the table for grammar 3.1, table 3.19. This example uses ADTs to describe the input tokens and the non-terminals. By deriving Enum instances, these can be changed into numbers for the table. The code isn't very interesting otherwise, but it's nice to see that the table works with the parsing engine.
  2. tableGenExample.hs takes a grammar --- this time from wikipedia --- and transforms it into a table. This time the code is just a small amount that show how to use the exposed API from SGLR.TableGen.Rule. The grammar is slightly adapted from the one on wikipedia, as it also specifies some constructor names (so you don't have to define a whole AST type yourself like in parseExample.hs).