Generate parsers from tree-sitter grammars extended to support Semgrep patterns

ocaml-tree-sitter-semgrep

Generate OCaml parsers based on tree-sitter grammars, for semgrep.

Related ocaml-tree-sitter repositories:

  • ocaml-tree-sitter-core: provides the code generator that takes a tree-sitter grammar and produces an OCaml library from it.
  • ocaml-tree-sitter-languages: community repository that has scripts for building and publishing OCaml libraries for parsing a variety of programming languages.
  • ocaml-tree-sitter-semgrep: this repo; same as ocaml-tree-sitter-languages but extends each language with constructs specific to semgrep patterns.

Contributing

Development setup

  1. Make sure you have at least 6 GiB of free memory. More will be needed for some of the grammars.
  2. Install the following tools:
    • git
    • GNU make
    • pkg-config: used to locate tree-sitter's runtime library once it is installed
    • Node.js: JavaScript interpreter used to translate a grammar to JSON
    • cargo: Rust package manager and build tool, used to build tree-sitter
    • opam: OCaml package manager
  3. Run opam init, then opam switch create 4.12.0, to install a recent version of OCaml.
  4. Install OCaml development tools for your favorite editor: typically opam install merlin plus a plugin for your editor.
  5. Install pre-commit with pip3 install pre-commit and run pre-commit install to set up the pre-commit hook. This will re-indent code in a consistent fashion each time you call git commit.
  6. Check out the extra instructions for macOS.
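
Before running the make targets, it can help to confirm that the tools from step 2 are on your PATH. The following is a minimal sketch, not part of this repository: check_tools is a hypothetical helper, and the command names are assumptions (for example, Node.js may be installed as nodejs, and make must be GNU make, which some systems ship as gmake).

```shell
#!/bin/sh
# Report which of the given commands are missing from PATH.
# Prints one "missing: NAME" line per absent command and
# returns non-zero if at least one is absent.
check_tools() {
    missing=0
    for tool in "$@"; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            echo "missing: $tool"
            missing=1
        fi
    done
    return "$missing"
}

# Command names below are assumptions; adjust them for your system.
check_tools git make pkg-config node cargo opam \
    || echo "Install the missing tools before running make setup."
```

This only checks that the commands exist. It does not check versions or the amount of free memory.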

See the Makefile for the available targets. Get started with:

make update
make setup

Then build and install the OCaml code generator (core):

make && make install

Testing a language

To build and test support for a language, for example kotlin, run:

$ cd lang
$ ./test-lang kotlin

For details, see How to upgrade the grammar for a language.
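
If you work on several languages at once, you may want to run the test script for each of them and see which ones fail. The sketch below is a hypothetical wrapper, not something shipped in this repository. It assumes that ./test-lang exits with a non-zero status on failure, which is not confirmed here.

```shell
#!/bin/sh
# Run a test command once per language and list the languages that failed.
# Usage: test_langs COMMAND LANG...
test_langs() {
    cmd=$1
    shift
    failed=""
    for lang in "$@"; do
        echo "=== $lang ==="
        # Assumption: the command signals failure with a non-zero exit status.
        if ! "$cmd" "$lang"; then
            failed="$failed $lang"
        fi
    done
    if [ -n "$failed" ]; then
        echo "failed:$failed"
        return 1
    fi
    echo "all passed"
}

# Example, run from the lang directory (add other language names as needed):
# test_langs ./test-lang kotlin
```

The loop keeps going after a failure so that a single broken grammar does not hide the results for the others.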

Adding a new language

See How to add support for a new language.

Documentation

We have limited documentation which is mostly targeted at early contributors. It's growing organically based on demand, so don't hesitate to file an issue explaining what you're trying to do.

License

ocaml-tree-sitter is free software with contributors from multiple organizations. The project is driven by r2c.

  • OCaml code developed specifically for this project is distributed under the terms of the GNU GPL v3.
  • The OCaml bindings to tree-sitter's C API were created by Bryan Phelps as part of the reason-tree-sitter project.
  • The tree-sitter grammars for major programming languages are external projects. Each comes with its own license.