regexp-to-ast
Reads the text of a JavaScript regular expression literal and outputs an Abstract Syntax Tree (AST).
Installation
- npm
npm install regexp-to-ast
- Browser
<script src="https://unpkg.com/regexp-to-ast/lib/parser.js"></script>
API
The API is defined as a TypeScript definition file.
Usage
- Parsing to an AST:
const RegExpParser = require("regexp-to-ast").RegExpParser
const regexpParser = new RegExpParser()

// from a regexp text
const astOutput = regexpParser.pattern("/a|b|c/g")

// text from regexp instance.
const input2 = /a|b/.toString()
// The same parser instance can be reused
const anotherAstOutput = regexpParser.pattern(input2)
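The parser takes the literal as text, including the surrounding slashes and any flags. As a minimal sketch of producing that text from an existing RegExp instance, the snippet below relies only on the standard `RegExp.prototype.toString`. The helper name `toParserInput` is hypothetical and is not part of regexp-to-ast.

```javascript
// Hypothetical helper (not part of regexp-to-ast): turn a RegExp instance
// into its literal text form, slashes and flags included.
function toParserInput(regexp) {
  if (!(regexp instanceof RegExp)) {
    throw new TypeError("expected a RegExp instance")
  }
  return regexp.toString()
}

// A literal and a constructor call both yield text in literal form.
const fromLiteral = toParserInput(/a|b/)
const fromConstructor = toParserInput(new RegExp("a|b|c", "g"))

console.log(fromLiteral) // /a|b/
console.log(fromConstructor) // /a|b|c/g
```

Either string could then be handed to `regexpParser.pattern(...)` in the same way as the hard-coded text above.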
- Visiting the AST:
// parse to an AST as before.
const { RegExpParser, BaseRegExpVisitor } = require("regexp-to-ast")
const regexpParser = new RegExpParser()
const regExpAst = regexpParser.pattern("/a|b|c/g")

// Override the visitor methods to add your logic.
class MyRegExpVisitor extends BaseRegExpVisitor {
    visitPattern(node) {}
    visitFlags(node) {}
    visitDisjunction(node) {}
    visitAlternative(node) {}

    // Assertion
    visitStartAnchor(node) {}
    visitEndAnchor(node) {}
    visitWordBoundary(node) {}
    visitNonWordBoundary(node) {}
    visitLookahead(node) {}
    visitNegativeLookahead(node) {}

    // atoms
    visitCharacter(node) {}
    visitSet(node) {}
    visitGroup(node) {}
    visitGroupBackReference(node) {}
    visitQuantifier(node) {}
}

const myVisitor = new MyRegExpVisitor()
myVisitor.visit(regExpAst)
// extract visit results from the visitor state.
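To illustrate what "extract visit results from the visitor state" can look like, here is a rough sketch that collects every plain character it encounters. So that it runs without the package installed, it uses a stand-in base class (`StandInVisitor`) and a hand-written AST. Both are assumptions made for illustration: the node shapes (`type`, `value`, a character code on `Character` nodes) and the dispatch-then-walk-children behavior are simplified guesses and should be checked against the TypeScript definition file. With the real library, the collector would extend `BaseRegExpVisitor` instead.

```javascript
// Assumption: a node is any object that carries a `type` field.
function isNode(value) {
  return value !== null && typeof value === "object" && value.type !== undefined
}

// Stand-in for BaseRegExpVisitor (hypothetical, simplified): call the
// visit<Type> method if one exists, then walk any child nodes.
class StandInVisitor {
  visit(node) {
    const method = this["visit" + node.type]
    if (typeof method === "function") {
      method.call(this, node)
    }
    for (const key of Object.keys(node)) {
      const child = node[key]
      if (Array.isArray(child)) {
        child.filter(isNode).forEach((item) => this.visit(item))
      } else if (isNode(child)) {
        this.visit(child)
      }
    }
  }
}

// The visitor keeps its results as instance state.
class CharCollector extends StandInVisitor {
  constructor() {
    super()
    this.chars = []
  }

  visitCharacter(node) {
    // Assumption: `value` holds the character code.
    this.chars.push(String.fromCharCode(node.value))
  }
}

// Hand-written approximation of an AST for /a|b/ (assumed shape).
const handWrittenAst = {
  type: "Pattern",
  value: {
    type: "Disjunction",
    value: [
      { type: "Alternative", value: [{ type: "Character", value: 97 }] },
      { type: "Alternative", value: [{ type: "Character", value: 98 }] }
    ]
  }
}

const collector = new CharCollector()
collector.visit(handWrittenAst)
console.log(collector.chars) // [ 'a', 'b' ]
```

Keeping results on the visitor instance means a fresh visitor should be created per traversal, or its state reset, so results from different patterns do not mix.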
Compatibility
This library is written in ES5 style and is compatible with all major browsers and modern Node.js versions.
TODO / Limitations
- Use a polyfill for String.prototype.at to support Unicode characters outside the BMP.
- Descriptive error messages.
- Position information in error messages.
- Support Unicode flag escapes.
- Ensure edge cases described in "The madness of parsing real world JavaScript regexps" are supported.
- Support deprecated octal escapes.
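The first item in the list above concerns characters outside the Basic Multilingual Plane (BMP). As background, the sketch below shows why they need special handling in JavaScript: such a character occupies two UTF-16 code units, so reading a string one `charCodeAt` index at a time yields surrogate halves rather than the character itself. This uses only standard ES2015 string methods and is not how the library is implemented. The helper name `codePoints` is hypothetical.

```javascript
// U+1F600 lies outside the BMP, so it is stored as a surrogate pair.
const smiley = "\u{1F600}"

console.log(smiley.length) // 2
console.log(smiley.charCodeAt(0).toString(16)) // d83d (high surrogate)
console.log(smiley.charCodeAt(1).toString(16)) // de00 (low surrogate)
console.log(smiley.codePointAt(0).toString(16)) // 1f600 (full code point)

// Hypothetical helper: list the code points of a string.
// Array.from iterates a string by code point, not by code unit.
function codePoints(text) {
  return Array.from(text, (ch) => ch.codePointAt(0))
}

console.log(codePoints("a" + smiley)) // [ 97, 128512 ]
```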