Add lexer
Closed · 1 comment
Dan-wanna-M commented
We could add a lexer stage that processes the raw bytes, so that the Earley recognizer deals with real tokens instead of individual bytes.
Pros:
- Faster, since a DFA is 5-10 times faster than the Earley recognizer.
- Makes a possible future parsing stage cleaner.
Cons:
- More complex, and it would confuse users further.
- The lexer may not be fully regular, in which case we would still have to fall back to some form of CFG.
- We may gain enough speed from eager regex caching alone.
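
For illustration, here is a minimal sketch of what such a lexer stage might look like: bytes go in, tokens come out, and only the tokens would be handed to the recognizer. Everything in it is an assumption made for the example: the token set, the `Token` type, and the `lex` function are hypothetical and not part of this project. Python's `re` is a backtracking engine rather than a DFA, so it only stands in for the DFA here, and its alternation picks the first matching alternative rather than the longest one (which is fine for this token set, since no two kinds can start with the same byte).

```python
import re
from typing import Iterator, NamedTuple


class Token(NamedTuple):
    kind: str    # name of the terminal, e.g. "NUMBER"
    text: bytes  # the bytes that were matched


# Hypothetical terminals; a real grammar would supply its own.
TOKEN_SPEC = [
    ("NUMBER", rb"[0-9]+"),
    ("IDENT", rb"[A-Za-z_][A-Za-z0-9_]*"),
    ("OP", rb"[-+*/()]"),
    ("WS", rb"[ \t\n]+"),
]

# One combined pattern with a named group per terminal.
MASTER = re.compile(
    b"|".join(b"(?P<%s>%s)" % (name.encode(), pat) for name, pat in TOKEN_SPEC)
)


def lex(data: bytes) -> Iterator[Token]:
    """Split raw bytes into tokens, skipping whitespace."""
    pos = 0
    while pos < len(data):
        m = MASTER.match(data, pos)
        if m is None:
            raise ValueError(f"unexpected byte at offset {pos}: {data[pos:pos + 1]!r}")
        if m.lastgroup != "WS":
            yield Token(m.lastgroup, m.group())
        pos = m.end()


if __name__ == "__main__":
    # The recognizer would then consume these tokens instead of 7 raw bytes.
    for token in lex(b"x1 + 42"):
        print(token)
```

The second con above shows up directly in a sketch like this: any terminal that cannot be written as a regular expression has no place in `TOKEN_SPEC` and would still need to be handled by the CFG.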
Dan-wanna-M commented
The eager regex cache is fast enough. In fact, mask_logits is currently the slowest part.