Type inference your BNF grammar that uses semantic actions, eliminating static errors and porting them into different parser generator architectures.
The major part of this library is written in F#. However, it is compiled into a single-file JavaScript tbnf.js
using Fable.
P.S: Typed BNF used to be implemented with Fable.Python and runs under CPython/PyPy(>=3.8), and I'd like to try Fable.Python again when it gets more stable.
So far, we support 3 different architectures, which unveil the capability of Typed BNF's backend agnostic code generation.
architecture | backend(PGEN + PL) | lexer Impl | parser capability | ADT encoding |
---|---|---|---|---|
antlr | antlr4+csharp | antlr | ALL(*) | type -> interface constructor -> class |
menhir | menhir+OCaml | sedlex(UTF-8) | LR(1) | as-is |
lark | lark+Python | Fable.Sedlex | LALR(2) | type -> union type constructor -> dataclass |
antlr | antlr4+typescript | antlr | ALL(*) | type -> union type constructor -> class |
*pure bnf | pure bnf | antlr notation | CFG |
(PL = programming language; PGEN = parser generator; pure bnf means it is the pure BNF for readable syntax specification )
This project is not production-ready, though already handy for practical use.
For usage, see test-scripts/test-python*.sh
, test-scripts/test-ocaml*.sh
, test-scripts/test-typescript*.sh
and test-scripts/test-csharp*.sh
.
Download the single JavaScript file tbnf.js
from the release page.
usage: tbnf.js [-h] [-o OUTDIR] [-lang LANGUAGE]
[-be {python-lark,ocaml-menhir,csharp-antlr,typescript-antlr,purebnf}]
[-conf CONFIGPATH]
tbnfSourcePath
Argparse example
node tbnf.js xxx.tbnf
positional arguments:
tbnfSourcePath
optional arguments:
-h, --help show this help message and exit
-o OUTDIR, --outDir OUTDIR
-lang LANGUAGE, --language LANGUAGE
name of your own language
-be {python-lark,ocaml-menhir,csharp-antlr,purebnf}, --backend {python-lark,ocaml-menhir,csharp-antlr,purebnf}
-conf CONFIGPATH, --configPath CONFIGPATH
path to a config file
You might check out Typed BNF Documentations.
Such grammar is compiled into Python, OCaml and CSharp. See runtests directory and test-*.ps1
.
extern var parseInt : str -> int
extern var parseFlt : str -> float
extern var getStr : token -> str
extern var unesc : str -> str
extern var appendList : <a> (list<a>, a) -> list<a>
type Json
type JsonPair(name: str, value: Json)
case JInt : int -> Json
case JFlt : float -> Json
case JStr : str -> Json
case JNull : () -> Json
case JList : (elements: list<Json>) -> Json
case JDict : list<JsonPair> -> Json
case JBool : bool -> Json
ignore space
digit = [0-9] ;
start : json { $1 }
int = digit+ ;
float = digit* "." int ;
str = "\"" ( "\\" _ | ! "\"" )* "\"" ;
space = ("\t" | "\n" | "\r" | " ")+;
seplist(sep, elt) : elt { [$1] }
| seplist(sep, elt) sep elt
{ appendList($1, $3) }
jsonpair : <str> ":" json { JsonPair(unesc(getStr($1)), $3) }
/* CPP comments */
json : <int> { JInt(parseInt(getStr($1))) }
| <float> { JFlt(parseFlt(getStr($1))) }
| "null" { JNull() }
| <str> { JStr(unesc(getStr($1))) }
| "[" "]" { JList([]) }
| "{" "}" { JDict([]) }
| "true" { JBool(true) }
| "false" { JBool(false) }
| "[" seplist(",", json) "]" { JList($2) }
| "{" seplist(",", jsonpair) "}" { JDict($2) }
You can specify the renamer_config
parameter or use the default one(tbnf.config.js
in the output directory).
In tbnf.config.js
, you can define how typenames map from Typed BNF to the backend language.
For instance, this is what we did for CSharp-Antlr4 JSON example: link.
function rename_type(x)
{
if (x == "str")
return "string";
else if (x == 'Json')
return 'JsonValue';
else if (x == 'list')
return 'MyList';
else if (x == 'token')
return 'IToken';
else
return x;
}
module.exports = { rename_type };
Typed BNF has 7 built-in types: token
, tuple
, list
, int
, float
, str
and bool
.
Typed BNF ships with no built-in functions, which makes it suitable to write portable grammars without ruling out semantic actions.
P.S: Unlike other backends, the OCaml-Menhir backend requires some manual works and is tedious in this sense. It requires user to explicitly specify the module-qualified type of the start
rule, which can be solved by adding a config variable start_rule_qualified_type
in tbnf.config.js
. Besides, you must map the type token
to tbnf_token
.
This is the config for our example OCaml json parser:
start_rule_qualified_type = "Simple_json_construct.json"
...
Check out Backends.*.fs
This requires the original host implemented in Python. You might need to call pip install -e .
.
git clone https://github.com/thautwarm/Fable.Sedlex FableSedlex
npm install -g typescript antlr4ts-cli
cd tbnf-js && npm install && cd ..
bash ./build-js-package.sh
./build-metaparser.ps1
git clone https://github.com/thautwarm/Fable.Sedlex FableSedlex
./build-package.ps1