BNFC/bnfc

A tree-sitter backend

wenkokke opened this issue · 7 comments

If we create a tree-sitter backend, we could get basic editor support for all languages using BNFC with almost no work.

Tree-sitter grammars are the de facto way of implementing highlighting in Atom, and there are packages which use tree-sitter grammars to provide highlighting in VSCode, neovim, and emacs. There are also bindings to use a tree-sitter grammar from Java provided by JetBrains, which would help with integration into the JetBrains editor ecosystem.

There are bindings for tree-sitter in various languages, including Haskell, JavaScript (both Node.js and Wasm), OCaml, Python, Ruby, and Rust.

Compiling a BNF grammar to tree-sitter should be fairly straightforward, and the only major hurdle I foresee would be to implement support for layouts, which would require some custom C code. For an example of how to implement this, one could look at the grammars for Agda, Haskell, Python, or any other language with layout rules.

This would solve #193, by virtue of the fact that there are tree-sitter bindings for Rust.

Perhaps @banacorn could offer their advice, since they wrote the parser for Agda? It seems that their scanner.cc could be used as-is for top-level layout rules, with only minor adjustments needed to support layout start and stop keywords.

If I understand this correctly, BNFC would have to produce either a grammar.js file to be processed by the tree-sitter CLI, or directly a .json file.
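To make the grammar.js route concrete, here is a minimal sketch of how plain BNF productions could be printed in tree-sitter's grammar DSL (`grammar`, `seq`, `choice`, `$.rule`). It is written in Python only for brevity and is not how BNFC or #471 does it. The input representation (`GRAMMAR`, the `("t", ...)`/`("n", ...)` tags) and all function names are hypothetical, and it ignores list categories, precedence levels, token definitions, and layout.

```python
import json

# Hypothetical mini-representation of a BNF grammar (not BNFC's internal one):
# each nonterminal maps to a list of alternatives, each alternative is a list
# of symbols. ("t", "+") is a terminal, ("n", "Exp") is a nonterminal.
GRAMMAR = {
    "Exp": [[("n", "Exp"), ("t", "+"), ("n", "Term")], [("n", "Term")]],
    "Term": [[("t", "x")]],
}


def rule_name(nonterminal):
    # Assumption: lower-casing is enough to get a usable rule name.
    # Names that differ only by case would collide.
    return nonterminal.lower()


def symbol_to_js(symbol):
    kind, value = symbol
    if kind == "t":
        # A JSON string literal is also a valid JavaScript string literal.
        return json.dumps(value)
    return "$." + rule_name(value)


def alternative_to_js(symbols):
    if not symbols:
        # tree-sitter rejects rules that match the empty string, so empty
        # productions would need a separate rewrite (not attempted here).
        raise ValueError("empty production is not supported in this sketch")
    parts = [symbol_to_js(s) for s in symbols]
    return parts[0] if len(parts) == 1 else "seq(" + ", ".join(parts) + ")"


def rule_to_js(alternatives):
    alts = [alternative_to_js(a) for a in alternatives]
    return alts[0] if len(alts) == 1 else "choice(" + ", ".join(alts) + ")"


def to_grammar_js(name, grammar):
    # The first rule listed becomes the start rule in tree-sitter,
    # so the insertion order of the dict matters.
    lines = [
        "module.exports = grammar({",
        f"  name: {json.dumps(name)},",
        "  rules: {",
    ]
    for nonterminal, alternatives in grammar.items():
        lines.append(f"    {rule_name(nonterminal)}: $ => {rule_to_js(alternatives)},")
    lines += ["  }", "});"]
    return "\n".join(lines)


if __name__ == "__main__":
    print(to_grammar_js("demo", GRAMMAR))
```

The printed text would then be saved as grammar.js and handed to the tree-sitter CLI. Emitting the .json form directly would skip the Node.js step but means tracking tree-sitter's JSON schema instead of its DSL.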

I wonder whether this could be factored into BNFC -> (E)BNF -> tree-sitter. There is:

I wonder how well the Haskell bindings are maintained:

Btw, there is apparently a tree-sitter grammar for LBNF: https://github.com/MortenSchou/tree-sitter-lbnf. Since BNFC is bootstrapped, it could then generate this grammar itself.

I'm not sure if there's any advice I can offer or how much help I can be 👀
It'd be nice not to have to translate those tree-sitter grammars by hand anyway!

I wonder whether this could be factored into BNFC -> (E)BNF -> tree-sitter

Wouldn’t this lead to problems supporting layouts?

Yes, this wouldn't support layout, so maybe it is not worth looking into it, unless BNFC/-layout -> (E)BNF has its own interest.

Hi, I have implemented a preliminary tree-sitter backend in #471. If any of the correspondents on this thread are interested, please help me test it out, thank you!