Including source information as part of the parse tree

Question

lsp-ableton opened this issue 3 years ago · 1 comment

Currently, the parser state is only available when an error occurs. It would also be useful to include source information in the parse tree. That is, when using >> (rshift) to generate a node in the tree, it would be helpful to also receive the source positions that were parsed to produce that node.

If I'm understanding the code correctly, it shouldn't be too difficult to include this information during parsing. We could use the state before running a parser as the source-start value and the state after running it as the source-end value. I'd be happy to contribute this change, but I wonder whether it would be welcome.

Basically, this is all I'm talking about:

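        # Inner function of Parser.__rshift__(self, f); self and f come from there
        @Parser
        def _shift(tokens, s):
            (v, s2) = self.run(tokens, s)
            try:
                # Pass the positions before and after the match along with the value
                return f(v, (s.pos, s2.pos)), s2
            except Exception:
                # Fall back for callbacks that only accept the value.
                # Caveat: this also retries when f fails for an unrelated reason.
                return f(v), s2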

Maybe it should be implemented another way, but this at least produces the desired behaviour for me.
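
To make the idea easier to try in isolation, here is a minimal, self-contained sketch that does not use funcparserlib at all. State, Parser, tok, Node and with_span are hypothetical names invented for this example, not the library's API. It assumes the state tracks an index into the token list, and it uses a separate method instead of try/except so that errors raised inside the callback are not swallowed.

```python
from dataclasses import dataclass
from typing import Any, Callable, Sequence, Tuple


@dataclass(frozen=True)
class State:
    pos: int  # index of the next token to consume


class Parser:
    """Toy parser: wraps a function (tokens, state) -> (value, new_state)."""

    def __init__(self, run: Callable[[Sequence[Any], State], Tuple[Any, State]]):
        self.run = run

    def __add__(self, other: "Parser") -> "Parser":
        # Run two parsers in sequence and collect both values in a tuple.
        def _seq(tokens, s):
            v1, s2 = self.run(tokens, s)
            v2, s3 = other.run(tokens, s2)
            return (v1, v2), s3
        return Parser(_seq)

    def with_span(self, f: Callable[[Any, Tuple[int, int]], Any]) -> "Parser":
        # Like a mapping step, but f also receives (start, end) token indices.
        def _shift(tokens, s):
            v, s2 = self.run(tokens, s)
            return f(v, (s.pos, s2.pos)), s2
        return Parser(_shift)


def tok(expected: str) -> Parser:
    # Match a single token equal to `expected`.
    def _tok(tokens, s):
        if s.pos < len(tokens) and tokens[s.pos] == expected:
            return expected, State(s.pos + 1)
        raise ValueError(f"expected {expected!r} at token {s.pos}")
    return Parser(_tok)


@dataclass
class Node:
    value: Any
    span: Tuple[int, int]  # half-open range of token indices


# Parse "a" followed by "b", starting at token index 1.
pair = (tok("a") + tok("b")).with_span(Node)
node, end_state = pair.run(["x", "a", "b"], State(1))
print(node, end_state)
```

Note that the span here is a range of token indices, not character offsets. Mapping it back to file positions still requires tokens that know where they came from.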

Answer 1 · 2021-09-14T13:21:08.000Z

@lsp-ableton Actually, you can already track source information if you use funcparserlib.lexer.make_tokenizer(), which generates an iterable of Token objects. You can also define your own custom tokens that carry this kind of information. This doesn't affect the parser itself, since it happens at the lexer level.
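
A rough sketch of the custom-token route, using only the standard library: each token records where it starts and ends, so a node's span can be taken from its first and last tokens. This is a stand-in written for illustration, not how make_tokenizer() is implemented. Tok, SPECS, PATTERN and tokenize are hypothetical names, and the (line, column) convention used here is an assumption of the example.

```python
import re
from typing import Iterator, NamedTuple, Tuple


class Tok(NamedTuple):
    type: str
    value: str
    start: Tuple[int, int]  # (line, column), both 1-based
    end: Tuple[int, int]    # position just past the last character


SPECS = [
    ("NUMBER", r"\d+"),
    ("NAME", r"[A-Za-z_]\w*"),
    ("OP", r"[+*=]"),
    ("NEWLINE", r"\n"),
    ("SPACE", r"[ \t]+"),
]
PATTERN = re.compile("|".join(f"(?P<{name}>{rx})" for name, rx in SPECS))


def tokenize(text: str) -> Iterator[Tok]:
    line, col, pos = 1, 1, 0
    while pos < len(text):
        m = PATTERN.match(text, pos)
        if m is None:
            raise ValueError(f"unexpected character at {line}:{col}")
        value = m.group()
        start = (line, col)
        # Advance the line/column counters past the matched text.
        if m.lastgroup == "NEWLINE":
            line, col = line + 1, 1
        else:
            col += len(value)
        pos = m.end()
        # Whitespace only moves the counters; it produces no token.
        if m.lastgroup not in ("SPACE", "NEWLINE"):
            yield Tok(m.lastgroup, value, start, (line, col))


tokens = list(tokenize("x = 1\ny = x + 20"))

# The span of the second statement runs from its first to its last token.
span = (tokens[3].start, tokens[-1].end)
print(span)
```

With tokens like these, a function passed to >> can read the positions from the token values it receives, and the parser combinators themselves stay unchanged.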