This parser is part of the compiler for the language "Context". You can follow the progress in the weekly German devlog on YouTube.
This is a linear LL(k) parser with error recovery that transforms a text into an abstract syntax tree (AST) based on the following rules:
Name | Short Description |
---|---|
Literal | Matches a previously set string. |
Section | Matches anything between two previously specified characters. |
Pattern | Matches a specified regex pattern. |
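For illustration, here is a minimal sketch of how these terminal rules might be constructed. Only `Pattern`'s constructor appears in the expression example further down; the `Literal` and `Section` signatures shown here are assumptions:

```java
// Hypothetical sketch of the terminal rules. The Literal and Section
// constructors are assumed; Pattern is taken from the example below.
Rule keyword = new Literal("return");    // matches exactly the string "return"
Rule string  = new Section('"', '"');    // matches everything between two quotes
Rule number  = new Pattern("-?[0-9]+");  // matches an optionally signed integer
```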
Name | Short Description |
---|---|
Ordered | Matches the passed rules in their specified order. |
Unordered | Matches all passed rules in any order. |
Alteration | Matches any one of the passed rules. |
Multiple | Matches the passed rule as many times as possible. |
Optional | Tries to match the passed rule or matches an empty string. |
Lazy | Matches the passed rule lazily to enable circular dependencies. |
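The Lazy rule is the combinator whose purpose is least obvious from the table: it lets a grammar refer to a rule that is only defined later, or to itself. The following is only a hypothetical sketch; the `Lazy` and `Literal` constructors, as well as passing rules (instead of strings) to `Alteration`, are assumptions and not part of the example further down:

```java
// Hypothetical sketch: parenthesized sub-expressions make exp circular.
// Lazy is assumed to take a supplier that resolves the rule at match time.
Rule[] exp = new Rule[1];  // forward slot for the circular reference

Rule nr    = new Pattern("-?[0-9]+");
Rule paren = new Ordered(new Literal("("), new Lazy(() -> exp[0]), new Literal(")"));
Rule atom  = new Alteration(false, nr, paren);  // assumed to also accept rules
Rule op    = new Alteration(false, "+", "-", "*", "/");

exp[0] = new Ordered(atom, new Multiple(true, op, atom));
```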
This is the EBNF for a simple mathematical expression:
```ebnf
nr  := '-'? ('0' | ... | '9')+
op  := '+' | '-' | '*' | '/'
exp := nr (op nr)*
```
And this is the implementation with this library (the boolean argument specifies whether the rule is optional):
```java
Rule nr  = new Pattern("-?[0-9]+");
Rule op  = new Alteration(false, "+", "-", "*", "/");
Rule exp = new Ordered(nr, new Multiple(true, op, nr));
```
Now calling `exp.tokenize("10 + -20 * 147")` will yield the following tree:
"TokenArray": {
"LiteralToken": "10",
"TokenList": [
"TokenArray": {
"LiteralToken": "+",
"LiteralToken": "-20"
},
"TokenArray": {
"LiteralToken": "*",
"LiteralToken": "147"
}
]
}
Meanwhile, calling `exp.tokenize("10 xxx * 147")` will result in:
"TokenArray": {
"LiteralToken": "10",
"TokenList": [
"ErrorToken": "xxx", <-- Error
"TokenArray": {
"LiteralToken": "*",
"LiteralToken": "147"
}
]
}
This project requires:
- Java JDK 18
- The helper library