tunnckoCore/parse-function

hints and ideas for improving

tunnckoCore opened this issue · 5 comments

ref #5 #4 and #2

We should detect that there's a new line and record that fact; then we can detect whether ch comes after a comment opening such as // or /*... Hmm.

Okay. Another approach would be to use snapdragon, but we really need benchmark tests first, before taking any steps forward.

//cc @eush77 @cmtt

I think we need some expression parsing anyway to handle ES6 default parameters:

function (opts = { foo: { done: (x) => console.log({ value: x }) } }) {
  // ...
}

However, we don't actually need a full-fledged ES parser (like acorn) for that, and basic preparsing would allow us to skip default parameters and capture argument names very quickly.

I'm not familiar with snapdragon, but simple recursive descent would definitely work. For that we need to implement several simple tokenizers for identifiers, strings, numbers, punctuation, and comments.
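To make the tokenizer idea concrete, here is a minimal sketch, not the module's actual code. The names `TOKEN_TYPES` and `tokenize` are hypothetical, and the token patterns are deliberately simplified assumptions (no template literals, no regex literals, no exponent or hex numbers):

```javascript
// Hypothetical sketch: one sticky regex per token kind, tried in order.
// Comments must come before punctuation so "/" is not consumed on its own.
const TOKEN_TYPES = [
  ['whitespace', /\s+/y],
  ['comment', /\/\/[^\n]*|\/\*[\s\S]*?\*\//y],
  ['identifier', /[A-Za-z_$][\w$]*/y],
  ['number', /\d+(?:\.\d+)?/y],
  ['string', /"(?:[^"\\]|\\.)*"|'(?:[^'\\]|\\.)*'/y],
  ['punctuation', /=>|[^\s\w]/y],
];

function tokenize(src) {
  const tokens = [];
  let pos = 0;
  while (pos < src.length) {
    let matched = false;
    for (const [type, re] of TOKEN_TYPES) {
      re.lastIndex = pos;            // sticky flag: match must start exactly at pos
      const m = re.exec(src);
      if (m) {
        // Whitespace is consumed but not emitted as a token.
        if (type !== 'whitespace') tokens.push({ type, value: m[0] });
        pos = re.lastIndex;
        matched = true;
        break;
      }
    }
    // Defensive guard so the loop can never stall on unexpected input.
    if (!matched) throw new Error('Unexpected character at position ' + pos);
  }
  return tokens;
}
```

Each token kind stays a small, independent pattern, so the parser on top only has to look at token types and never at raw characters.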

Then I imagine the whole process to be something like this:

  1. Start tokenizing, see if the subject is an arrow function, generator, or plain ES5 function.
  2. Tokenize function name (an identifier).
  3. Parse argument list:
    1. Tokenize identifier (and save its value).
    2. Tokenize , or =.
    3. If = (the argument has a default value), skip the following expression (just keep track of nested brackets really, we don't need to actually parse anything).
  4. Save the rest of the string as the function body and stop.
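The steps above can be sketched roughly as follows. This is an illustration under stated assumptions, not the module's implementation: `sketchParse` is a hypothetical name, it scans characters directly instead of using separate tokenizers, and it only handles the plain `function (...) { ... }` form with simple identifier parameters (no destructuring, no comments or regex literals inside the argument list, no arrow-function subjects).

```javascript
// Hypothetical sketch: collect parameter names and skip default-value
// expressions by tracking bracket depth only, without building an AST.
function sketchParse(src) {
  // Step 2: function name, or 'anonymous' if there is none.
  const nameMatch = /^\s*function\s*\*?\s*([\w$]*)/.exec(src);
  const name = (nameMatch && nameMatch[1]) || 'anonymous';

  const args = [];
  let current = '';     // identifier currently being collected
  let skipping = false; // true while inside a default-value expression
  let depth = 0;        // nesting depth of (), [] and {} inside a default value
  let quote = null;     // active string delimiter while inside a string
  let i = src.indexOf('(') + 1;

  // Step 3: walk the argument list.
  for (; i < src.length; i++) {
    const ch = src[i];
    if (quote) {
      if (ch === '\\') i++;                 // skip the escaped character
      else if (ch === quote) quote = null;  // end of string
      continue;
    }
    if (skipping && (ch === '"' || ch === "'" || ch === '`')) { quote = ch; continue; }
    if ('([{'.includes(ch)) { depth++; continue; }
    if (')]}'.includes(ch)) {
      if (depth === 0) break;               // closing paren of the argument list
      depth--;
      continue;
    }
    if (depth > 0) continue;                // inside a nested default value
    if (ch === ',') {
      if (current) args.push(current);
      current = '';
      skipping = false;
      continue;
    }
    if (ch === '=') { skipping = true; continue; }
    if (!skipping && /[\w$]/.test(ch)) current += ch;
  }
  if (current) args.push(current);

  // Step 4: everything between the first "{" after the argument list
  // and the last "}" is the body.
  const body = src.slice(src.indexOf('{', i) + 1, src.lastIndexOf('}'));
  return { name, args, body };
}
```

Because default values are skipped purely by bracket depth, the sketch accepts anything with balanced brackets there, which is exactly the kind of loose grammar that keeps this cheap.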

This is some work, but not that hard, since we don't need to construct an AST. We can allow a much looser grammar than ECMAScript defines:

function (opts = { foo foo:: { bar / } }) {
  // ...
}

Hmm. Okay, I ran some benchmarks and the regex approach is 40-60x faster, lol. So we will go back to the regex approach. I already have a regex ready that handles regular and arrow functions.

@tunnckoCore What I'm saying is that regexes won't ever support the ES6 default parameters feature (see #8).

There are many modules that do regex-based parsing (js-args-names is one example), but they are all broken for ES6. We can do better and be the first module that works correctly with ES6.

> parseFunction('function (opts = { foo: { done: (x) => console.log({ value: x }) } }, cb) { /* ... */ }')
{ name: 'anonymous',
  body: ' foo: { done: (x) => console.log({ value: x }) } }, cb) { /* ... */ ',
  args: [ 'opts' ],
  params: 'opts' }
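To illustrate why this class of failure happens, here is a deliberately naive pattern. It is a hypothetical example, not the regex this module or js-args-names actually uses. A character class cannot count nesting, so the capture stops at the first closing parenthesis, which belongs to the default value:

```javascript
// Hypothetical naive regex: capture everything up to the first ")".
const NAIVE_ARGS = /\(([^)]*)\)/;

const src =
  'function (opts = { foo: { done: (x) => console.log({ value: x }) } }, cb) { /* ... */ }';

// The capture ends inside the default value, at the ")" of "(x)".
const captured = NAIVE_ARGS.exec(src)[1];

// Splitting on commas therefore yields one mangled entry and loses `cb`.
const naiveArgs = captured.split(',').map((s) => s.trim());

console.log(captured);
console.log(naiveArgs.length);
```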

@eush77 you're right. PRs are always welcome :)

Again, thanks a lot! Hope this helps! Review the changelog.