
Very fast, standards-compliant, self-hosted ECMAScript parser with high focus on both performance and stability

Primary language: TypeScript. License: ISC.

Cherow


A very fast, standards-compliant, self-hosted ECMAScript parser with high focus on both performance and stability.

It strictly follows the ECMAScript® 2017 Language Specification and should parse code according to that specification.

Features

  • Full support for ECMAScript® 2017 (ECMA-262 8th Edition)
  • ECMAScript Next (Stage 3 proposals)
  • JSX, a syntax extension for React
  • Skips hashbang comment nodes by default
  • Optimized for handheld devices
  • Optional tracking of syntax node location (index-based and line-column)
  • Parameterized plugin system
  • 8600 unit tests

ESNext features

Stage 3 proposals are supported. They need to be enabled with the next option.

Options

Option         Description
comments       Collect comments. Accepts either an array or a function
directives     Allow use of the ESTree directive node
globalReturn   Enable return in global scope
impliedStrict  Enable global strict mode in sloppy mode
jsx            Enable JSX parsing
locations      Attach line/column location information to each node
ranges         Attach range information to each node
next           Allow experimental ECMAScript features (stage 3 proposals)
plugins        Add an array of plugins
raw            Attach a raw property to literal nodes (Esprima and Acorn feature)
sourceType     Specify which type of script you're parsing ("script" or "module")
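
As a sketch of how these options could be assembled in practice, the helper below builds an options object and rejects any name that is not in the table above. `buildOptions` and `KNOWN_OPTIONS` are hypothetical names used for illustration, not part of Cherow's API, and the sketch checks option names only, not their values.

```javascript
// Option names taken from the table above.
const KNOWN_OPTIONS = [
    'comments', 'directives', 'globalReturn', 'impliedStrict', 'jsx',
    'locations', 'ranges', 'next', 'plugins', 'raw', 'sourceType'
];

// Hypothetical helper (not part of Cherow): copies the given options and
// throws on any name that is not documented, which catches typos early.
function buildOptions(overrides) {
    const options = {};
    for (const name of Object.keys(overrides)) {
        if (!KNOWN_OPTIONS.includes(name)) {
            throw new Error('Unknown parser option: ' + name);
        }
        options[name] = overrides[name];
    }
    return options;
}

// The result can then be passed as the second argument, for example:
// cherow.parseScript('const fooBar = 123;', buildOptions({ ranges: true }));
const options = buildOptions({ ranges: true, raw: true, next: true });
```

A misspelled name such as `range` would throw here instead of being passed along unnoticed.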

API

A JavaScript program can be either a script or a module. Cherow accepts both and performs syntactic analysis of each through parseScript and parseModule respectively.

// Parsing script
cherow.parseScript('const fooBar = 123;');

// Parsing module code
cherow.parseModule('const fooBar = 123;');

Parsing with options

// Parsing script
cherow.parseScript('const fooBar = 123;', { ranges: true, raw: true, next: true});

Comments and comment collection

Single-line, multi-line and HTML comments are supported by Cherow, and the parser can be instructed to collect comments by setting the comments option to either an array or a function.

The type of each comment is either Line for a single-line comment (//) or Block for a MultiLineComment (/* */).

Note that if location tracking isn't enabled, an empty object is returned for the location, and if the ranges option isn't set, undefined is returned for the offsets.

A function will be called with the following parameters:

  • name - Either Line or Block
  • comment - The content of the comment
  • start - Character offset of the start of the comment.
  • end - Character offset of the end of the comment.
  • loc - Column and line offset of the comment

Study the following examples to better understand how to collect comments:

// Function
cherow.parseScript('// foo',
    {
        comments: function(name, comment, start, end, loc) {}
    }
);

// Array
const commentArray = [];

cherow.parseScript('// foo',
    {
        comments: commentArray
    }
);
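
The function form can be used to build a custom record for each comment. The sketch below shows one way to do that. `createCommentCollector` is a hypothetical helper, and the parameter order is assumed to match the list above. The final call invokes the callback by hand, with illustrative argument values, to show the shape of the result without running the parser.

```javascript
// Hypothetical helper (not part of Cherow): returns a callback suitable for
// the `comments` option plus the array that the callback fills.
function createCommentCollector() {
    const collected = [];
    // Parameter order assumed from the parameter list above.
    function onComment(name, comment, start, end, loc) {
        collected.push({ type: name, value: comment, start, end, loc });
    }
    return { onComment, collected };
}

const collector = createCommentCollector();

// With the parser this would be wired up as:
// cherow.parseScript('// foo', { comments: collector.onComment, ranges: true });

// Simulated call standing in for the parser. The argument values are
// illustrative only and are not taken from real parser output.
collector.onComment('Line', ' foo', 0, 6, {});
```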

Plugins

Cherow is designed to support parameterized plugins which, within reasonable bounds, redefine the way the parser works. A parameterized plugin gives you far more benefits than a traditional one, and lets you extend the parser with code from 3rd party libraries or simply create a walker function.

Note that the plugins option takes only an array of plugins: [plugin1(args...), plugin2(args...), plugin3(args...)]

After the parser object has been created, the initialization functions for the chosen plugins are called with the parser object as their argument.

function plugin() {
    return (parser) => {
        // your plugin code
    }
}
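
To make the calling convention concrete, the sketch below imitates it with a stand-in object. `applyPlugins`, `tagPlugin` and `fakeParser` are hypothetical names that only illustrate the order of calls described above. This is not Cherow's internal code.

```javascript
// A parameterized plugin: the outer call receives the arguments, the
// returned function receives the parser object.
function tagPlugin(tag) {
    return (parser) => {
        parser.tags.push(tag);
    };
}

// Hypothetical stand-in for the step performed after the parser object has
// been created: each initialization function is called with the parser.
function applyPlugins(parser, plugins) {
    for (const init of plugins) {
        init(parser);
    }
    return parser;
}

// Stand-in object; a real parser instance has many more members.
const fakeParser = applyPlugins({ tags: [] }, [tagPlugin('a'), tagPlugin('b')]);
```

Because each plugin is called before it is placed in the array, its arguments are captured in a closure and remain available when the initialization function later runs.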

Create a plugin

Here is a simple example plugin which creates a new literal node with a pre-defined value of 123.

// Create a new plugin
function plugin(value) {
    return (parser) => {
        parser.parseLiteral = function(context) {

            // Get the start position (line, column)
            const pos = this.getLocations();

            // Call for the next token in the stream
            this.nextToken(context);

            return this.finishNode(pos, {
                type: 'Literal',
                value // The value will be 123, the argument passed to plugin()
            });
        }
    }
}

// Parse with the new plugin enabled
cherow.parseScript('1', {
    plugins: [
        plugin(123)
    ]
});

You can find and try the plugin example in the cherow-dummy-plugin repo.
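
The same pattern can be used to observe nodes instead of replacing them. The sketch below wraps `finishNode`, the method used in the example above, to record the type of every finished node. It assumes that `finishNode(pos, node)` returns the finished node and that the parser calls it for each node it creates. Both are assumptions drawn from the example rather than documented behaviour. `recordTypes` and `stubParser` are hypothetical names.

```javascript
// Hypothetical plugin: records the `type` of each node passed to finishNode.
function recordTypes(seen) {
    return (parser) => {
        const original = parser.finishNode;
        parser.finishNode = function(pos, node) {
            seen.push(node.type);
            // Delegate to the original method so the result is unchanged.
            return original.call(this, pos, node);
        };
    };
}

// Stub with the one method the plugin touches, so the sketch can be
// exercised without the parser. With Cherow it would be passed as
// plugins: [recordTypes(seen)].
const seen = [];
const stubParser = { finishNode(pos, node) { return node; } };
recordTypes(seen)(stubParser);
const result = stubParser.finishNode({}, { type: 'Literal', value: 123 });
```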

Rationale

Existing parsers have many issues with them:

Acorn is the most commonly used tool out there because of its support for recent ES standards, but it's slow and often too permissive in what it accepts. It's also a bit bloated.

Esprima is faster than Acorn, but only recently added async function support, and it misses some edge cases.

Babylon is highly coupled to Babel, and is comparatively very slow and buggy, failing to correctly handle even stable ECMAScript standard features.

None of these parsers would stand a chance against the official Test262 suite, and most fail a substantial number of its tests. Also, more and more JS tools require parsing support, and slower parsers result in slower tools. ESLint already spends a significant portion of its time parsing, often upwards of 1/4 of its time.

Bug reporting

If you catch a bug, don't hesitate to report it in the issue tracker. From the moment I respond to you, it will take a maximum of 30 minutes before the bug is fixed. Note that I will try to respond to you within one hour. Sometimes it can take a bit longer, as I'm not always online. If I find out it will take more than 30 minutes to solve your issue, you will be notified.

I know how irritating it can be if you are writing code and encounter bugs in your dependencies, and it is even more frustrating if you need to wait days or weeks for a fix.

Contribution

If you feel something could've been done better, please do feel free to file a pull request with the changes.

Read our guidelines here