Engelberg/instaparse

Retain comments in resulting grammar

kanaka opened this issue · 2 comments

I'm working on a project that creates test.check/chuck generators based on EBNF grammars. One really useful feature that I would like to implement is inline weight information:

foo-rule = bar-rule (* weight: 20 *)
         | baz-rule (* weight: 10 *)

When used as a parser, this grammar should treat the comments as whitespace (as it does now). However, I would like the grammar tree to still retain the comments in some form so that when I am deriving the test.check generators I can assign some default weights to those generators.
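To make the intent concrete, here is a rough, standalone sketch of the idea in Python. It is not instaparse code and does not reflect how instaparse stores grammars. The regex, the `extract_weights` and `pick_alternative` helpers, and the assumption that each weight comment directly follows a single rule name are all hypothetical, chosen only to illustrate pulling weights out of comments and using them to bias generation.

```python
import random
import re

# The example grammar from above, kept as plain text.
GRAMMAR = """\
foo-rule = bar-rule (* weight: 20 *)
         | baz-rule (* weight: 10 *)
"""

# Hypothetical pattern: a rule name immediately followed by an EBNF
# comment of the form (* weight: N *).
WEIGHTED_ALT = re.compile(r"([A-Za-z][\w-]*)\s*\(\*\s*weight:\s*(\d+)\s*\*\)")


def extract_weights(grammar_text):
    """Return {alternative-name: weight} for every annotated alternative."""
    return {name: int(weight) for name, weight in WEIGHTED_ALT.findall(grammar_text)}


def pick_alternative(weights, rng=random):
    """Choose one alternative with probability proportional to its weight."""
    names = list(weights)
    return rng.choices(names, weights=[weights[n] for n in names], k=1)[0]


weights = extract_weights(GRAMMAR)
print(weights)
print(pick_alternative(weights))
```

With the weights above, `bar-rule` would be chosen roughly twice as often as `baz-rule`. In the real implementation the weights would presumably come from the grammar tree rather than from a regex over the source text, and would feed a weighted generator such as test.check's `gen/frequency`.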

I'm willing to implement this; however, I could use some high-level guidance on the appropriate way to add this functionality. My guess is that I will need to modify cfg.clj:cfg to un-hide the comments and modify cfg.clj:build-rule to do something with :opt-whitespace (although I'm not entirely clear on what it should do with it).

I'm traveling, and haven't had much time to think about this. I'm pleased you're interested in contributing, but be aware that as more people have adopted instaparse for critical programs, I've become very conservative about adding changes unless I am completely certain they won't slow down existing use cases. So focus initially on developing something that is useful for you, and then we can analyze the possibility of a pull request more closely. If it's advice about the innards you need, I'll have more of an opportunity to discuss that in a week or two. I'm pinging @aengelberg here, since he's spent more time with test.check generators than I have.

I would like to 👍 this issue and the corresponding pull request by @kanaka. The combination of instaparse and instacheck (and the underlying test.chuck) is extremely useful for testing and automation. Thank you.