/nextgen_parsing

Do crazy experiments till the whole damn stack is A+, not D+

Primary Language: Odin

nextgen_parsing

see doc/nextgen_parsing.pdf

--- commit 9f065d6b0c17e2fabb60802d67039a8e5405c47a --- 2024-01-12

As it stands, I see before.phi and after.phi and phi.rwr among a bunch of other files.

before.phi looks reasonable.

after.phi looks suspicious. It looks like a mix of Go and Odin code. It should be Go code only.

I've forgotten Go syntax. The idea is that after.phi should look like before.phi with only one minor change: ch becomes a global int variable containing the value 5. This is C-like thinking. If it violates what is possible in Go, then we should discuss further. I think that ${ch} interpolates the global variable ch and inserts its value into the "hello world..." string. This should be drop-dead easy to write in Go, but I defer to your better knowledge of Go and how to write this.

You need both a .ohm file and a .rwr file. There must be a 1:1 correspondence between the rules in the .ohm file and the rules in the .rwr file.

The sample phi.ohm file that I wrote was based on parsing C syntax, plus a tiny hack to also parse def ch ... channels. You need to rewrite it, or hack on it, so that it parses Go syntax plus the same tiny hack for def ch ... channels.

The phi.rwr file should rewrite the parsed input into pure Go code. The parameters (the stuff in [ ... ]) should match up with the bits parsed by the phi.ohm grammar. The stuff to the right of = should rearrange the inputs and add characters (if necessary) or ignore syntactic noise (like the def symbol, passed in as _def to the .rwr spec). The parameters in [...] are just names for the partial matches that the grammar engine generates. Each named parameter holds a partial parse tree data structure.
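To make the "named parameters in, rearranged text out" idea concrete, here is a rough Go sketch of a single rewrite rule. It is not the .ohm/.rwr tooling: a regular expression with named groups stands in for the grammar rule, and the names defRule, rewriteDef, name and value are made up for illustration. Only _def comes from the description above.

```go
package main

import (
	"fmt"
	"regexp"
)

// defRule is a stand-in for one grammar rule (the real project uses an .ohm
// grammar, not a regexp). Each named group plays the role of one parameter
// that the rewrite side receives.
var defRule = regexp.MustCompile(`^(?P<_def>def)\s+(?P<name>[A-Za-z_]\w*)\s+(?P<value>\d+)$`)

// rewriteDef is the hypothetical rewrite side: it receives the named matches
// and rearranges them into Go source text.
func rewriteDef(line string) (string, bool) {
	m := defRule.FindStringSubmatch(line)
	if m == nil {
		return "", false // the rule did not match this input
	}
	params := map[string]string{}
	for i, groupName := range defRule.SubexpNames() {
		if groupName != "" {
			params[groupName] = m[i]
		}
	}
	// params["_def"] is syntactic noise: it is matched but never emitted.
	return fmt.Sprintf("var %s int = %s", params["name"], params["value"]), true
}

func main() {
	out, ok := rewriteDef("def ch 5")
	fmt.Println(out, ok)
}
```

The point of the sketch is the shape: one recognizer, one rewriter, and a shared list of parameter names between them.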

Like I said, I don't know Go syntax very well, but I'm guessing we might want:

before:

    def ch 5

    defn main [] print "Hello world channel=${ch}"

after:

    int ch = 5

    defn main [] print "Hello world channel=${ch}"

This example is deliberately boring and doesn't do much, to keep things simple.
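For comparison, here is a sketch of what the after side could look like as actual Go. It assumes that "global" means a package-level variable, and it uses fmt.Sprintf because Go has no "${ch}" string interpolation. The helper name greeting is made up so that the string can be checked separately from printing.

```go
package main

import "fmt"

// ch is a package-level variable, the closest Go equivalent of the
// C-style global `int ch = 5`.
var ch int = 5

// greeting formats the message. Go does not interpolate "${ch}" inside
// string literals, so the value is substituted with a %d verb instead.
func greeting() string {
	return fmt.Sprintf("Hello world channel=%d", ch)
}

func main() {
	fmt.Println(greeting())
}
```

If this is the agreed target, the rewrite for the def line only has to emit the var declaration. The print line needs its own rule to turn ${ch} into a format verb plus an argument.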

The job of phi.ohm is to recognize the before version. It builds an AST (actually a CST, if you want to be a stickler) and then passes the tree to phi.rwr. [The AST is the general "abstract" tree, i.e. everything that could be accepted by the grammar. The CST is reality, the "concrete" tree: a parse tree that exactly matches the input code and nothing more.]

The job of phi.rwr is to accept the parse tree generated by phi.ohm and to rewrite the code to look like the after code (assuming that I got it correct).
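One way to picture the tree that gets handed from the grammar to the rewriter is the Go sketch below. The Node type, its fields and the rule names are all hypothetical, and this is not how the grammar engine stores its tree internally. It only illustrates the "exactly matches the input and nothing more" property: concatenating the leaves gives back the original text.

```go
package main

import (
	"fmt"
	"strings"
)

// Node is a hypothetical CST node. Interior nodes carry a rule name and
// children; leaves carry the exact input text they matched.
type Node struct {
	Rule     string
	Text     string // set on leaves only
	Children []*Node
}

// Source rebuilds the matched input by joining the leaves left to right.
func (n *Node) Source() string {
	if len(n.Children) == 0 {
		return n.Text
	}
	var b strings.Builder
	for _, child := range n.Children {
		b.WriteString(child.Source())
	}
	return b.String()
}

// leaf builds a node with no children.
func leaf(rule, text string) *Node {
	return &Node{Rule: rule, Text: text}
}

// exampleTree is a hand-built tree for the input `def ch 5`.
func exampleTree() *Node {
	return &Node{Rule: "Def", Children: []*Node{
		leaf("_def", "def"), leaf("space", " "),
		leaf("name", "ch"), leaf("space", " "),
		leaf("value", "5"),
	}}
}

func main() {
	fmt.Println(exampleTree().Source())
}
```

A rewriter walks the same tree but picks which children to emit and in what order, which is how noise like the def keyword gets dropped.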