/grammer-UniGrammarRuntime.py

Runtime for UniGrammar-generated wrappers for generated parsers.


UniGrammarRuntime.py Unlicensed work


Runtime for UniGrammar-generated wrappers for generated parsers. Generated parsers can be used without wrappers, but wrappers let you use them uniformly, swapping the implementation while keeping the interface.

This makes it possible to

  • get rid of hard dependencies on specific libraries: any supported parser library for which a parser has been generated can be used instead;
  • benchmark and compare performance of various parsing libraries;
  • use the most performant of the available libraries.

How to use

  • Generate a parser bundle or construct one manually. A parser bundle is an object that stores and provides

    • pregenerated parsers for different backends (can be generated standalone using transpile)
    • auxiliary information (can be generated using gen-aux):
      • mappings from production names to capture groups, for parser generators that do not support capturing;
      • mappings from production names to booleans telling whether the AST node is a collection, for parser generators that cannot tell an iterable from a single node in the AST;
      • benchmark results;
      • a wrapper transforming the backend-specific AST into a backend-agnostic one.

    A parser bundle can be constructed from a directory on storage or compiled directly into an object in memory. In either case it can be used by a backend.
  • Construct a backend. A backend here is an object

    • storing underlying parser objects
    • providing necessary functions to be used by a wrapper to transform backend-specific AST into backend-agnostic one.

There are 2 ways to construct a backend:

    • You can import the backend manually: from UniGrammarRuntime.backends.<backend name> import <backend class name> and construct it: b = <backend class name>("<your grammar name>", <your bundle>).
    • Or you can just call a method of the bundle that constructs the needed backend. Pass None to select the backend automatically based on benchmarking results.
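The relationship between a bundle, its benchmark results and backend selection can be sketched roughly as follows. This is a toy model, not the UniGrammarRuntime API: ToyBundle, ToyBackend, make_backend and the "slow"/"fast" backend names are all hypothetical, and the real bundle method may be named and shaped differently. The sketch only assumes that passing None picks the backend with the best benchmark result.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, Optional


@dataclass
class ToyBackend:
    """Hypothetical stand-in for a backend: a name, a grammar name and a parse callable."""
    name: str
    grammar_name: str
    parse: Callable[[str], object]


@dataclass
class ToyBundle:
    """Hypothetical stand-in for a parser bundle (not the real class)."""
    # backend name -> callable parsing a string into a backend-native tree
    parsers: Dict[str, Callable[[str], object]] = field(default_factory=dict)
    # backend name -> measured time in seconds (lower is better)
    benchmarks: Dict[str, float] = field(default_factory=dict)

    def make_backend(self, grammar_name: str, backend_name: Optional[str] = None) -> ToyBackend:
        if backend_name is None:
            # Consider only backends that have both a parser and a benchmark result.
            candidates = [n for n in self.parsers if n in self.benchmarks]
            if not candidates:
                raise LookupError("no benchmarked backend available")
            # Pick the backend with the smallest measured time.
            backend_name = min(candidates, key=self.benchmarks.__getitem__)
        return ToyBackend(backend_name, grammar_name, self.parsers[backend_name])


bundle = ToyBundle(
    parsers={"slow": lambda s: list(s), "fast": lambda s: s.split()},
    benchmarks={"slow": 0.9, "fast": 0.1},
)
explicit = bundle.make_backend("demo", "slow")  # backend chosen by name
automatic = bundle.make_backend("demo", None)   # backend chosen by benchmark
print(explicit.name, automatic.name)
```

Keeping the benchmark results inside the bundle is what lets the caller stay unaware of which parsing library ends up doing the work.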

  • Now you can do low-level stuff using backend methods:
    • You can parse a string into the backend-native AST using the b.parse("<your string to parse>") method.
    • You can preprocess the AST generated by parse and observe the result, using preprocessAST.
    • You can check if preprocessed AST nodes represent a collection using isList and iterate over them using iterateList.
    • You can transform terminal nodes into strs using getTextFromToken.
    • You can merge subtrees into a single str using mergeShit.

This can all be useful if you

    • don't want to use a generated wrapper;
    • are designing a new Template and need the generator to emit custom postprocessing, which you first have to craft manually;
    • are debugging;
    • are just playing around.
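To give a feel for how these low-level methods fit together, here is a self-contained toy backend for comma-separated words. Only the method names (parse, preprocessAST, isList, iterateList, getTextFromToken, mergeShit) are taken from the list above. The class CsvToyBackend, the signatures and the tree shape are assumptions made for illustration, and the real backends may behave differently.

```python
class CsvToyBackend:
    """Toy backend for comma-separated words; signatures are assumed, not the real API."""

    def parse(self, text):
        # Backend-native "AST": a tuple tagged with the production name.
        return ("words", [("word", w) for w in text.split(",")])

    def preprocessAST(self, ast):
        # Strip the production tag, leaving only the payload.
        return ast[1]

    def isList(self, node):
        # In this toy tree, collections are plain Python lists.
        return isinstance(node, list)

    def iterateList(self, node):
        return iter(node)

    def getTextFromToken(self, token):
        # Terminal nodes are (tag, text) tuples.
        return token[1]

    def mergeShit(self, node):
        # Concatenate the text of every terminal under the node.
        if self.isList(node):
            return "".join(self.mergeShit(child) for child in self.iterateList(node))
        return self.getTextFromToken(node)


b = CsvToyBackend()
raw = b.parse("a,b,c")        # backend-native tree
pre = b.preprocessAST(raw)    # preprocessed tree
words = [b.getTextFromToken(t) for t in b.iterateList(pre)] if b.isList(pre) else []
merged = b.mergeShit(pre)
print(words, merged)
```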

  • Now we go a level higher. You can use a wrapper to get a prettified, postprocessed, backend-agnostic AST.
    • Import the generated wrapper module.
      • Manually: import <wrapper module name>
      • Via a backend:
    • The module contains some classes. The class you usually need is aliased to __MAIN_PARSER__.
      • Construct the wrapper, initializing it with the backend: w = <wrapper module name>.__MAIN_PARSER__(b)
    • Parse what you need: ast = w("<your string to parse>")
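The wrapper workflow above can be imitated with a small self-contained sketch. Everything except the __MAIN_PARSER__ alias and the w = ...(b), ast = w(...) call pattern is hypothetical: TinyBackend, PairsWrapper, Pair and the toy_wrapper module object stand in for a real backend and a real generated wrapper module, whose classes and AST types will differ.

```python
import types
from dataclasses import dataclass
from typing import List


class TinyBackend:
    """Hypothetical backend; returns a backend-specific tree of tagged tuples."""

    def parse(self, text):
        return ("pairs", [("pair", kv.split("=")) for kv in text.split(";")])


@dataclass
class Pair:
    """Backend-agnostic AST node produced by the wrapper."""
    key: str
    value: str


class PairsWrapper:
    """Hypothetical wrapper turning the backend-specific tree into plain objects."""

    def __init__(self, backend):
        self.backend = backend

    def __call__(self, text) -> List[Pair]:
        _tag, items = self.backend.parse(text)
        return [Pair(k, v) for _item_tag, (k, v) in items]


# Stand-in for a generated wrapper module exposing the __MAIN_PARSER__ alias.
toy_wrapper = types.ModuleType("toy_wrapper")
toy_wrapper.__MAIN_PARSER__ = PairsWrapper

b = TinyBackend()
w = toy_wrapper.__MAIN_PARSER__(b)  # construct the wrapper around the backend
ast = w("a=1;b=2")                  # parse straight into the backend-agnostic AST
print(ast)
```

Because the wrapper only talks to the backend object it was given, swapping the backend should not change the code that consumes ast.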

Dependencies