lezer-parser/javascript

Parenthesized assignment expressions fail to parse after a certain AST size

Closed this issue · 2 comments

Side note: I apologize for filing all these bugs. I'm currently exploring replacing Acorn with Lezer in DevTools for pretty-printing, and I keep running into bugs when parsing minified code. Feel free to close as "Won't fix".

The following snippet fails to parse:

(a=b(1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31,32,33,34,35,36,37,38,39,40,41, 42))

Note that it's sensitive to the number of arguments: removing the 42 results in a successful parse, while adding more arguments still fails.
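For anyone who wants to find the threshold, here is a rough sketch that generates the same shape of input with a varying number of arguments and reports the first count whose tree contains an error node. The helper names (makeSnippet, hasError, firstFailingCount) are made up for illustration. The parser is passed in as an argument and is assumed to be the one exported by @lezer/javascript. Unlike the snippet above, the generated input contains no whitespace, so the exact threshold may differ.

```javascript
// Build the parenthesized assignment from the report with `n` numeric arguments.
// Hypothetical helper, not part of any Lezer package.
function makeSnippet(n) {
  const args = Array.from({ length: n }, (_, i) => i + 1).join(",");
  return `(a=b(${args}))`;
}

// Return true if the syntax tree for `source` contains an error node.
// `parser` is assumed to behave like the LR parser from @lezer/javascript:
// parse() returns a tree whose iterate() visits every node.
function hasError(parser, source) {
  let found = false;
  parser.parse(source).iterate({
    enter(node) {
      if (node.type.isError) found = true;
    },
  });
  return found;
}

// Smallest argument count in [1, max] that produces an error node,
// or null if every input in that range parses cleanly.
function firstFailingCount(parser, max) {
  for (let n = 1; n <= max; n++) {
    if (hasError(parser, makeSnippet(n))) return n;
  }
  return null;
}
```

Assuming the package is installed, usage would look like: import {parser} from "@lezer/javascript", then firstFailingCount(parser, 100).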

The argument list isn't necessarily the cause, though. The following snippet also fails to parse:

(a = function(b,c,d) {
  const e = 1 + 1 + 1;
  const f = 2 + 2 + 2;
  const g = 3 + 3 + 3;
  const h = 4 + 4 + 4;
  const i = 5 + 5 + 5;
  const j = 6 + 6 + 6;
  const k = 7 + 7 + 7;
  const l = 8 + 8 + 8;
}(1,2,3))

Again, it's sensitive to the number of AST nodes inside the function. Deleting one or more lines makes the parse succeed; adding more const m = ... lines keeps it failing.

I enabled debug logging, and it seems that in the failing case it gets stuck trying to reduce with a ParamList (instead of an ArgList) for the CallExpression, but that might just be a red herring.

After a given number of tokens in which multiple parses run alongside each other, @lezer/lr will drop one of the parallel parses even if both fully match the input so far (to avoid situations where huge stretches of input get parsed in multiple ways due to allowed ambiguity in the grammar). In this case the two candidates might be an arrow function parameter list and a parenthesized expression, both of which could potentially go on, in a syntactically valid way, for megabytes (and even branch out into more inner ambiguities). I don't really see a way to avoid this issue with our current architecture.
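To make the ambiguity concrete, here is a small sketch in plain JavaScript. It does not use Lezer at all. It only shows that a shortened form of the reported input is a complete program on its own and is also the start of an arrow function whose parameter has a default value. The helper name parsesAsJavaScript is invented here, and it relies on new Function compiling its body without running it. Reading the reported input as a default-valued parameter is my interpretation of the comment above, which would also fit the ParamList seen in the debug log.

```javascript
// Two complete, valid programs that start with the same characters.
const asParenthesized = "(a = b(1, 2))";
const asArrowParams = "(a = b(1, 2)) => a";

// Syntax check only: new Function compiles the body but never calls it,
// so the undefined names `a` and `b` are not a problem.
function parsesAsJavaScript(source) {
  try {
    new Function(source);
    return true;
  } catch (error) {
    if (error instanceof SyntaxError) return false;
    throw error;
  }
}

// Up to the closing parenthesis, a parser cannot tell the two apart.
const sharedPrefix = asArrowParams.startsWith(asParenthesized);

console.log(parsesAsJavaScript(asParenthesized)); // true
console.log(parsesAsJavaScript(asArrowParams));   // true
console.log(sharedPrefix);                        // true
```

Only the token after the closing parenthesis (=> or something else) decides which reading was right, so if the parenthesized-expression branch has already been dropped by then, the parse has nothing left to fall back on.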

Ack, then let's close this bug if this is a known limitation.