
Tokenization and parsing of Kotlin code using the ANTLR Kotlin grammar

Primary language: Kotlin. License: Apache License 2.0.

kotlin-grammar-tools


Description

This library lets you tokenize and parse Kotlin code in your program using the Kotlin grammar.

Simple example:

fun main() {
    val tokens = tokenizeKotlinCode("val x = foo() + 10;")
    val parseTree = parseKotlinCode(tokens)
    // or just `val parseTree = parseKotlinCode("val x = foo() + 10;")`

    println(parseTree)
}

The resulting tokens or parse tree can be used for various kinds of Kotlin code analysis.

Note that the parse tree may not exactly match the PSI (the parse tree generated by the Kotlin compiler). For user convenience, the compiler reports some errors at a later stage rather than at the parser level. The grammar accounts for such cases, so it may reject code that the Kotlin compiler's parser would accept.

Kotlin grammar

The grammar is located in the Kotlin specification repository.

Status

The library is developed only for the internal purposes of the Kotlin team, and it is not guaranteed to be kept up to date.

Using

To use the library, perform the following steps.

  1. (Prerequisite) Get the Kotlin specification repository and run its :grammar:publishToMavenLocal Gradle task. This builds and installs the kotlin-grammar-parser dependency.
  2. Run the publishToMavenLocal Gradle task of this repository. This builds and installs the kotlin-grammar-tools library.
  3. Add mavenLocal to the repositories in your project, and then add the dependency on this library. For example (Gradle): implementation("org.jetbrains.kotlin.spec.grammar.tools:kotlin-grammar-tools:0.1").

As an alternative to steps 1 and 2, you can download the JARs from Releases or TeamCity (both the kotlin-grammar-parser and kotlin-grammar-tools artifacts are available on the TeamCity Kotlin grammar build page).
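
For step 3, the project configuration might look roughly like the sketch below, written in the Gradle Kotlin DSL. It assumes a Kotlin/JVM (or Java) plugin is already applied to the project. The mavenCentral() entry is also an assumption, added in case transitive dependencies are not present in the local Maven repository. Only the dependency coordinates come from this README.

```kotlin
// build.gradle.kts (sketch): assumes a Kotlin/JVM or Java plugin is already applied,
// so that the `implementation` configuration exists.

repositories {
    // Local Maven repository, where publishToMavenLocal installs the artifacts (steps 1 and 2).
    mavenLocal()
    // Assumption: a remote repository may still be needed for transitive dependencies.
    mavenCentral()
}

dependencies {
    // Coordinates exactly as given in step 3.
    implementation("org.jetbrains.kotlin.spec.grammar.tools:kotlin-grammar-tools:0.1")
}
```

If you downloaded the JARs instead of publishing them locally, the repositories block above does not apply as written; the JARs would have to be added to the classpath in whatever way your build supports.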

Exceptions

The lexer and the parser throw exceptions when they are given code that is invalid (in lexer or parser terms): KotlinLexerException and KotlinParserException, respectively.

Example of handling these exceptions:

fun foo(): ParseTree? {
    val tokens = try {
        tokenizeKotlinCode("val x = foo() + 10;")
    } catch (e: KotlinLexerException) {
        println("Tokenizing the code failed")
        return null
    }
    val parseTree = try {
        parseKotlinCode(tokens)
    } catch (e: KotlinParserException) {
        println("Parsing the code failed")
        return null
    }

    return parseTree
}