This parser is part of the compiler for the language "Context". You can follow the progress in the weekly German devlog on YouTube.
This is a linear LL(k) parser with error recovery that transforms a text into an abstract syntax tree (AST) based on the following rules:
Name | Short Description |
---|---|
Literal | Matches a previously set string. |
Section | Matches anything between two previously specified characters. |
Pattern | Matches a specified regex pattern. |
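For illustration, here is a minimal sketch of how these terminal rules might be constructed. Only `Pattern`'s constructor appears in the expression example further down; the `Literal` and `Section` signatures shown here are assumptions:

```java
// Hypothetical sketch of the terminal rules. The Literal and Section
// constructors are assumed; Pattern is taken from the example below.
Rule keyword = new Literal("return");    // matches exactly the string "return"
Rule string  = new Section('"', '"');    // matches everything between two quotes
Rule number  = new Pattern("-?[0-9]+");  // matches an optionally signed integer
```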
Name | Short Description |
---|---|
Ordered | Matches the passed rules in their specified order. |
Unordered | Matches all passed rules in any order. |
Alteration | Matches any one of the passed rules. |
Multiple | Matches the passed rule as many times as possible. |
Optional | Tries to match the passed rule or matches an empty string. |
Lazy | Matches the passed rule lazily to enable circular dependencies. |
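The Lazy rule is the combinator whose purpose is least obvious from the table: it lets a grammar refer to a rule that is only defined later, or to itself. The following is only a hypothetical sketch; the `Lazy` and `Literal` constructors, as well as passing rules (instead of strings) to `Alteration`, are assumptions and not part of the example further down:

```java
// Hypothetical sketch: parenthesized sub-expressions make exp circular.
// Lazy is assumed to take a supplier that resolves the rule at match time.
Rule[] exp = new Rule[1];  // forward slot for the circular reference

Rule nr    = new Pattern("-?[0-9]+");
Rule paren = new Ordered(new Literal("("), new Lazy(() -> exp[0]), new Literal(")"));
Rule atom  = new Alteration(false, nr, paren);  // assumed to also accept rules
Rule op    = new Alteration(false, "+", "-", "*", "/");

exp[0] = new Ordered(atom, new Multiple(true, op, atom));
```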
This is the EBNF for a simple mathematical expression:
```ebnf
nr  := '-'? ('0' | ... | '9')+
op  := '+' | '-' | '*' | '/'
exp := nr (op nr)*
```
And this is the implementation with this library (the boolean argument specifies whether the rule is optional):
```java
Rule nr  = new Pattern("-?[0-9]+");
Rule op  = new Alteration(false, "+", "-", "*", "/");
Rule exp = new Ordered(nr, new Multiple(true, op, nr));
```
Now calling `exp.tokenize("10 + -20 * 147")` will yield the following tree:
"TokenArray": {
"LiteralToken": "10",
"TokenList": [
"TokenArray": {
"LiteralToken": "+",
"LiteralToken": "-20"
},
"TokenArray": {
"LiteralToken": "*",
"LiteralToken": "147"
}
]
}
Meanwhile, calling `exp.tokenize("10 xxx * 147")` will result in:
"TokenArray": {
"LiteralToken": "10",
"TokenList": [
"ErrorToken": "xxx", <-- Error
"TokenArray": {
"LiteralToken": "*",
"LiteralToken": "147"
}
]
}
This project requires:
- Java JDK 18
- The helper library