Alternative parser: test Lark
decalage2 opened this issue · 0 comments
decalage2 commented
In practice, the current VBA parser implemented with pyparsing is very slow. In the past I ran some tests with ANTLR4 (issue #19), but its Python runtime is even slower than pyparsing.
Other issues with pyparsing:
- the grammar of the parser and the VBA emulation layer are mixed together because we're using the same classes for both. While it is convenient for a small grammar, with a complex parser such as ViperMonkey it makes maintenance difficult, and it is impossible to test different parsers without touching the emulation layer.
- the current grammar tries to parse the whole VBA code in one go, so a single exception aborts the whole parse. I started to develop a line-based parser but it's too much work.
- debugging the grammar is very difficult.
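
To illustrate the line-based idea mentioned above, here is a minimal sketch (not ViperMonkey code) of a driver that parses one logical line at a time and records failures instead of aborting. Everything in it is hypothetical: `parse_statement` is a toy stand-in that only accepts `name = expr`, and `ParsedLine`, `join_continuations` and `parse_lines` are names made up for this example. The only VBA rule it relies on is that a line ending with a space and an underscore continues on the next line.

```python
import re
from dataclasses import dataclass
from typing import Optional


@dataclass
class ParsedLine:
    lineno: int                   # line number where the logical line starts
    text: str                     # logical line, continuations merged
    node: Optional[tuple] = None  # parse result, if parsing succeeded
    error: Optional[str] = None   # error message, if parsing failed


# Toy statement grammar (assumption for this sketch): `name = expression`
ASSIGN = re.compile(r"^\s*([A-Za-z_]\w*)\s*=\s*(.+)$")


def parse_statement(text):
    """Hypothetical stand-in for a real statement parser."""
    match = ASSIGN.match(text)
    if match is None:
        raise ValueError(f"unsupported statement: {text!r}")
    return ("assign", match.group(1), match.group(2).strip())


def join_continuations(source):
    """Yield (first_lineno, logical_line), merging lines that end with ' _'."""
    buffer, start = [], None
    for lineno, raw in enumerate(source.splitlines(), 1):
        if start is None:
            start = lineno
        stripped = raw.rstrip()
        if stripped.endswith(" _"):
            buffer.append(stripped[:-1])  # drop the underscore, keep the space
            continue
        buffer.append(raw)
        yield start, "".join(buffer)
        buffer, start = [], None
    if buffer:  # source ended in the middle of a continuation
        yield start, "".join(buffer)


def parse_lines(source):
    """Parse each logical line independently; one bad line does not stop the rest."""
    results = []
    for lineno, text in join_continuations(source):
        if not text.strip():
            continue  # skip blank lines
        try:
            results.append(ParsedLine(lineno, text, node=parse_statement(text)))
        except ValueError as exc:
            results.append(ParsedLine(lineno, text, error=str(exc)))
    return results


if __name__ == "__main__":
    for item in parse_lines("x = 1\nThis is ??? garbage\ny = x + _\n    2\n"):
        print(item.lineno, item.node or item.error)
```

The hard part that this sketch skips is multi-line constructs (`If ... End If`, `Sub ... End Sub`), which is probably where most of the work in a real line-based parser would go.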
Lark is another parsing library for Python that looks faster than pyparsing, and could allow us to separate the parser from the VBA emulation engine: https://github.com/lark-parser/lark
And there are other options: https://tomassetti.me/parsing-in-python/