UniGrammarRuntime.py

Runtime for UniGrammar-generated wrappers around generated parsers. Generated parsers can be used without wrappers, but wrappers let you use them uniformly, swapping the implementation while keeping the interface.

This allows you to

- get rid of hard dependencies on specific libraries: any supported parser library for which a parser has been generated can be used instead;
- benchmark and compare the performance of various parsing libraries;
- use the most performant of the available libraries.

How to use

Generate or manually construct a parser bundle. A parser bundle is an object that stores and provides
- pregenerated parsers for different backends (can be generated separately using transpile)
- auxiliary information (can be generated using gen-aux):
  - mappings from production names to capture groups, for parser generators that don't support capturing;
  - mappings from production names to booleans telling whether the AST node is a collection, for parser generators that cannot distinguish an iterable from a node in the AST;
  - benchmark results
  - a wrapper, transforming the backend-specific AST into a backend-agnostic one

A parser bundle can be constructed from a directory on storage or compiled directly into an object in memory. In either case it can be used by a backend.

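To make the contents of a bundle more concrete, here is a minimal sketch of such a container. `ToyBundle` and all of its field names are hypothetical and chosen only for illustration; the real bundle class in UniGrammarRuntime may be organized quite differently.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, Optional


@dataclass
class ToyBundle:
    """Hypothetical stand-in for a parser bundle (not the library's class)."""

    # backend name -> pregenerated parser object for that backend
    parsers: Dict[str, Any] = field(default_factory=dict)
    # production name -> capture group info, for backends without capturing
    capture_groups: Dict[str, Any] = field(default_factory=dict)
    # production name -> True if the AST node for it is a collection
    is_collection: Dict[str, bool] = field(default_factory=dict)
    # backend name -> benchmark result (assumed here to be a time, lower is better)
    benchmarks: Dict[str, float] = field(default_factory=dict)
    # wrapper turning a backend-specific AST into a backend-agnostic one
    wrapper: Optional[Any] = None


bundle = ToyBundle()
bundle.parsers["backend_a"] = object()  # placeholder for a real parser object
bundle.is_collection["items"] = True
```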
Construct a backend. A backend here is an object
- storing underlying parser objects
- providing necessary functions to be used by a wrapper to transform backend-specific AST into backend-agnostic one.

There are 2 ways to construct a backend:
- You can import the backend manually: from UniGrammarRuntime.backends.<backend name> import <backend class name> and construct it: b = <backend class name>("<your grammar name>", <your bundle>).
- Or you can just call a method of the bundle, constructing the needed backend. Pass None to select the backend automatically based on benchmarking results.
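The automatic selection mentioned above can be pictured roughly as follows. This is only a sketch of the idea: `select_backend`, the backend names and the timing values are all made up for illustration, and the actual selection logic of the bundle may differ.

```python
from typing import Dict, Optional


def select_backend(benchmarks: Dict[str, float], requested: Optional[str] = None) -> str:
    """Return the requested backend name, or the fastest benchmarked one for None."""
    if requested is not None:
        if requested not in benchmarks:
            raise KeyError(f"no parser available for backend {requested!r}")
        return requested
    if not benchmarks:
        raise ValueError("no benchmark results to choose from")
    # Assumption: a benchmark result is a time in seconds, so lower is better.
    return min(benchmarks, key=benchmarks.get)


# Placeholder timings, not real measurements of any library.
timings = {"backend_a": 0.012, "backend_b": 0.004, "backend_c": 0.031}

chosen = select_backend(timings)  # None: pick the fastest backend
explicit = select_backend(timings, "backend_a")  # explicit choice is honored
```

Passing a name explicitly is useful when you want to benchmark or debug one particular backend regardless of the recorded results.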

Now you can do low-level stuff using backend methods:
- You can parse a string into the backend-native AST using the b.parse("<your string to parse>") method.
- You can preprocess the AST generated by parse and observe the result, using preprocessAST.
- You can check if preprocessed AST nodes represent a collection using isList and iterate over them using iterateList.
- You can transform terminal nodes into strs using getTextFromToken.
- You can merge subtrees into a single str using mergeShit.
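The sketch below shows how these methods might fit together. `ToyBackend` is a made-up stand-in that only reuses the method names listed above; its signatures, its AST shape and its toy "grammar" (comma-separated words) are assumptions, not the behavior of any real UniGrammarRuntime backend.

```python
from typing import Any, Iterator, List, Tuple

Token = Tuple[str, str]


class ToyBackend:
    """Hypothetical backend mirroring the method names from the text."""

    def parse(self, text: str) -> Tuple[str, List[Token]]:
        # "Backend-native" AST: ("items", [("word", "ab"), ("word", "cd"), ...])
        return ("items", [("word", chunk.strip()) for chunk in text.split(",")])

    def preprocessAST(self, ast: Tuple[str, List[Token]]) -> List[Token]:
        # Drop the production name and keep only the children.
        return ast[1]

    def isList(self, node: Any) -> bool:
        return isinstance(node, list)

    def iterateList(self, node: List[Any]) -> Iterator[Any]:
        return iter(node)

    def getTextFromToken(self, token: Token) -> str:
        return token[1]

    def mergeShit(self, node: Any) -> str:
        # Recursively concatenate the text of all terminals in a subtree.
        if self.isList(node):
            return "".join(self.mergeShit(child) for child in self.iterateList(node))
        return self.getTextFromToken(node)


b = ToyBackend()
native = b.parse("ab, cd, ef")
pre = b.preprocessAST(native)
words = [b.getTextFromToken(t) for t in b.iterateList(pre)] if b.isList(pre) else []
merged = b.mergeShit(pre)
```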

All of this can be useful if you
- don't want to use a generated wrapper
- are designing a new Template and need the generator to emit custom postprocessing, which you first have to craft manually
- are debugging
- are just playing around

Now we go a level higher. You can use a wrapper to get a prettified, backend-agnostic, postprocessed AST.
- Import the generated wrapper module.
  - Manually: import <wrapper module name>
  - Via a backend:
- The module contains several classes. The one you usually need is aliased to __MAIN_PARSER__.
  - Construct the wrapper, initializing it with the backend: w = <wrapper module name>.__MAIN_PARSER__(b)
- Parse what you need: ast = w("<your string to parse>")
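To illustrate why the wrapper makes backends interchangeable, here is a small self-contained sketch. Everything in it (`Items`, `TupleBackend`, `DictBackend`, `ItemsParser`) is hypothetical; a real generated wrapper module will contain different classes, and only the `__MAIN_PARSER__` alias and the `w = ...(b)` / `w("...")` call pattern are taken from the steps above.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Items:
    """Backend-agnostic AST node produced by the wrapper."""
    words: List[str]


class TupleBackend:
    """Toy backend whose native tokens are tuples."""

    def parse(self, text: str):
        return [("word", w.strip()) for w in text.split(",")]

    def getTextFromToken(self, token) -> str:
        return token[1]


class DictBackend:
    """Toy backend whose native tokens are dicts."""

    def parse(self, text: str):
        return [{"text": w.strip()} for w in text.split(",")]

    def getTextFromToken(self, token) -> str:
        return token["text"]


class ItemsParser:
    """Toy wrapper: relies only on backend methods, never on the native AST shape."""

    def __init__(self, backend):
        self.backend = backend

    def __call__(self, text: str) -> Items:
        native = self.backend.parse(text)
        return Items([self.backend.getTextFromToken(t) for t in native])


# Mirrors the alias a generated wrapper module is described as exposing.
__MAIN_PARSER__ = ItemsParser

ast_a = __MAIN_PARSER__(TupleBackend())("x, y")
ast_b = __MAIN_PARSER__(DictBackend())("x, y")
```

Both calls yield the same `Items` value even though the two toy backends produce differently shaped native ASTs, which is the point of keeping the interface while swapping the implementation.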
