Simplify grammar
QuarticCat opened this issue · 2 comments
QuarticCat commented
Here are some possible ergonomic improvements.
- Set `active` as default so that users only need to mark `silent` rules. And for silent rules, we can use some special character or naming style to make them clean.
- Combine lexer definitions and tokens. (implemented in #56) To ensure all tokens are terminal, we only need to check that the reference graph is a DAG, and then inline all rules. To avoid generating extra lexers, we can delay the generation of lexers until after the generation of parsers, and inline & generate lexers on demand.
- Combine parser definitions and fixpoints. (implemented in #47) We may automatically infer fixpoints. A possible algorithm is to find cycles in the reference graph and then mark all vertices in cycles as fixpoints.
- Remove `~` (sequence operator). Instead of writing `e1 ~ e2`, we can simply write `e1 e2`.
- Ad-hoc lexical rule. For example, `"(" ~ sexprs ~ ")"`.
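To make the cycle-based idea concrete, here is a minimal sketch of fixpoint inference and the DAG check on the reference graph. It assumes rules are numbered `0..n` and that `graph[i]` lists the rules referenced by rule `i`. The names `infer_fixpoints`, `reaches_itself` and `is_dag` are hypothetical and not taken from the project. It uses one plain depth-first search per rule rather than a strongly-connected-components algorithm, which is simpler but slower on large grammars.

```rust
/// Marks every rule that lies on a cycle of the reference graph.
/// `graph[i]` holds the indices of the rules that rule `i` refers to
/// (assumed layout; all indices must be smaller than `graph.len()`).
fn infer_fixpoints(graph: &[Vec<usize>]) -> Vec<bool> {
    (0..graph.len())
        .map(|rule| reaches_itself(graph, rule))
        .collect()
}

/// Returns true if `start` can be reached again from `start`
/// by following at least one reference.
fn reaches_itself(graph: &[Vec<usize>], start: usize) -> bool {
    let mut seen = vec![false; graph.len()];
    // Begin with the direct references of `start`, not `start` itself,
    // so that only a genuine cycle brings us back to it.
    let mut stack: Vec<usize> = graph[start].clone();
    while let Some(node) = stack.pop() {
        if node == start {
            return true;
        }
        if !seen[node] {
            seen[node] = true;
            stack.extend(graph[node].iter().copied());
        }
    }
    false
}

/// The reference graph is a DAG exactly when no rule lies on a cycle,
/// which is the condition needed before inlining all lexer rules.
fn is_dag(graph: &[Vec<usize>]) -> bool {
    infer_fixpoints(graph).iter().all(|&on_cycle| !on_cycle)
}
```

For a grammar where rule 0 refers to rules 1 and 2, rule 1 refers back to rule 0, and rule 2 refers to nothing, rules 0 and 1 would be marked as fixpoints and `is_dag` would return `false`.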
SchrodingerZhu commented
We are thinking of extending our system such that not only trees, but arbitrary data types are supported as parser output as well.
However, this makes it difficult to apply TCO (tail-call optimisation). Thus, it is still not clear to me how the design should go.
SchrodingerZhu commented
It seems to me that we can separate rules into two parts (not counting `offset` and `src`):
- A negative rule that accepts a `&mut Consumer` and returns `Result<(), Error>`. (This can be tail-call optimised.) (The `&mut Consumer`, for example, can be a `&mut Vec<T>`.)
- A positive rule that accepts nothing and returns `Result<T, Error>`.
However, it is not clear what will happen when we need to expand actions. We also need to figure out a way to specify such rules properly.
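As a rough illustration of the split, the sketch below shows one positive and one negative rule for a toy "sequence of digits" grammar. Everything here is an assumption made for the example: the `Error` struct, the `Consumer` trait and its `consume` method, passing `offset` as `&mut usize`, and the rule names `digit` and `digits`. It does not address action expansion. Note also that Rust does not guarantee tail-call elimination, so the recursive call only shows the shape that makes the optimisation possible.

```rust
/// Hypothetical error type; it only records where parsing failed.
#[derive(Debug, PartialEq)]
struct Error {
    offset: usize,
}

/// Hypothetical sink for parsed values.
trait Consumer<T> {
    fn consume(&mut self, value: T);
}

/// A `Vec<T>` is one possible consumer: it simply collects the values.
impl<T> Consumer<T> for Vec<T> {
    fn consume(&mut self, value: T) {
        self.push(value);
    }
}

/// Positive rule: takes no consumer and returns the parsed value directly.
fn digit(src: &str, offset: &mut usize) -> Result<u32, Error> {
    match src.as_bytes().get(*offset) {
        Some(b) if b.is_ascii_digit() => {
            *offset += 1;
            Ok(u32::from(b - b'0'))
        }
        _ => Err(Error { offset: *offset }),
    }
}

/// Negative rule: pushes results into the consumer and returns `()`.
/// The recursive call is the last thing the function does, so no work
/// remains after it returns (the shape needed for a tail call).
fn digits<C: Consumer<u32>>(
    src: &str,
    offset: &mut usize,
    out: &mut C,
) -> Result<(), Error> {
    if *offset >= src.len() {
        return Ok(());
    }
    let value = digit(src, offset)?;
    out.consume(value);
    digits(src, offset, out)
}
```

Calling `digits` on `"123"` with an empty `Vec<u32>` as the consumer would leave the digits in the vector, while the function itself returns only `Ok(())`.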