parser hang when typing integer value in function body
wingyplus opened this issue ยท 10 comments
Reproduced by test corpus:
================
Function clause bug
================
-module(amod).
fibonacci(1) ->
1
---
(source_file ())
The result from tree-sitter test:
$ yarn tree-sitter test -f 'Function clause bug'
comments:
expr_function_call:
term_string:
macros:
term_char:
expr_list:
advent_of_code_2020:
term_tuple:
otp_lists:
term_atom:
module_attribute:
expr_case:
expr_exceptions:
expr_bit_string:
term_float:
term_integer:
expr_map:
expr_send:
types:
term_binary_string:
expr_receive:
expr_variable:
term_fun:
term_list:
term_record:
expr_if:
function_clause_bug:
<< hang
Thanks! I'll add that test case and look into it ๐๐ผ
Btw is that neovim on your screenshot?
Sweet!
What would the expected behavior be here? I presume you'd want a "MISSING" error.
Sweet!
What would the expected behavior be here? I presume you'd want a "MISSING" error.
It would be like you presume or tokenized correctly (in the test case above, it should be a number, I guess). And it should be able to continue typing not hang up.
I found that the parser not hang up after comment $.expr_operator
out in L313.
yarn run v1.22.10
$ /Users/thanabodee/src/github.com/wingyplus/tree-sitter-erlang/node_modules/.bin/tree-sitter parse amod.erl -d
(source_file [0, 0] - [4, 0]
(module_name [0, 0] - [0, 14]
(atom [0, 8] - [0, 12]
value: (unquoted_atom [0, 8] - [0, 12])))
(ERROR [2, 0] - [3, 3]
(atom [2, 0] - [2, 9]
value: (unquoted_atom [2, 0] - [2, 9]))
(pattern [2, 10] - [2, 11]
(term [2, 10] - [2, 11]
(integer [2, 10] - [2, 11])))
(ERROR [3, 2] - [3, 3])))
amod.erl 0 ms (ERROR [2, 0] - [3, 3])
info Visit https://yarnpkg.com/en/docs/cli/run for documentation about this command.
NOTE: the amod.erl
is the example from my screenshot.
Hm, the way we're using the choice
operator may be too expensive / run too many branches?
Could you try this approach where we'd be flattening out the branching (right now its in 2 levels for us)?
so instead of
expr_operator_binary: ($) =>
choice(
prec.left(
PREC.BINARY_OP,
seq(
field("lhs", $.expression),
field("operator", oneOf(OP2_LEFT_ASSOC)),
field("rhs", $.expression)
)
),
prec.right(
PREC.BINARY_OP,
seq(
field("lhs", $.expression),
field("operator", oneOf(OP2_RIGHT_ASSOC)),
field("rhs", $.expression)
)
)
),
we'd have something like:
expr_operator_binary: ($) => choice(
...[
...OP2_LEFT_ASSOC.map( op => [ "left", op ])
...OP2_RIGHT_ASSOC.map( op => [ "right", op ])
].map( [dir, op] =>
prec[dir](
PREC.BINARY_OP,
seq(
field("lhs", $.expression),
field("operator", op),
field("rhs", $.expression)
)
)
),
It's still not working:
diff --git a/grammar.js b/grammar.js
index b4205ba..5c1d984 100644
--- a/grammar.js
+++ b/grammar.js
@@ -405,20 +405,17 @@ module.exports = grammar({
expr_operator_binary: ($) =>
choice(
- prec.left(
- PREC.BINARY_OP,
- seq(
- field("lhs", $.expression),
- field("operator", oneOf(OP2_LEFT_ASSOC)),
- field("rhs", $.expression)
- )
- ),
- prec.right(
- PREC.BINARY_OP,
- seq(
- field("lhs", $.expression),
- field("operator", oneOf(OP2_RIGHT_ASSOC)),
- field("rhs", $.expression)
+ ...[
+ ...OP2_LEFT_ASSOC.map((op) => [prec.left, op]),
+ ...OP2_RIGHT_ASSOC.map((op) => [prec.right, op]),
+ ].map(
+ ([_prec, op]) => _prec(
+ PREC.BINARY_OP,
+ seq(
+ field("lhs", $.expression),
+ field("operator", op),
+ field("rhs", $.expression)
+ )
)
)
),
$ yarn tree-sitter generate && yarn tree-sitter parse amod.erl -d
yarn run v1.22.10
$ /Users/thanabodee/src/github.com/wingyplus/tree-sitter-erlang/node_modules/.bin/tree-sitter generate
โจ Done in 0.77s.
yarn run v1.22.10
$ /Users/thanabodee/src/github.com/wingyplus/tree-sitter-erlang/node_modules/.bin/tree-sitter parse amod.erl -d
new_parse
process version:0, version_count:1, state:1, row:1, col:0
lex_internal state:47, row:1, column:0
consume character:'-'
lexed_lookahead sym:-, size:1
shift state:4
process version:0, version_count:1, state:4, row:1, col:1
lex_internal state:47, row:1, column:1
consume character:'m'
consume character:'o'
consume character:'d'
consume character:'u'
consume character:'l'
consume character:'e'
consume character:'m'
consume character:'o'
consume character:'d'
consume character:'u'
consume character:'l'
consume character:'e'
lexed_lookahead sym:module, size:6
shift state:830
process version:0, version_count:1, state:830, row:1, col:7
lex_internal state:0, row:1, column:7
consume character:'('
lexed_lookahead sym:(, size:1
shift state:539
process version:0, version_count:1, state:539, row:1, col:8
lex_internal state:0, row:1, column:8
consume character:'a'
consume character:'m'
consume character:'o'
consume character:'d'
consume character:'a'
lexed_lookahead sym:unquoted_atom, size:4
shift state:11
process version:0, version_count:1, state:11, row:1, col:12
lex_internal state:45, row:1, column:12
consume character:')'
lexed_lookahead sym:), size:1
reduce sym:atom, child_count:1
shift state:790
process version:0, version_count:1, state:790, row:1, col:13
lex_internal state:47, row:1, column:13
consume character:'.'
lexed_lookahead sym:., size:1
shift state:397
process version:0, version_count:1, state:397, row:1, col:14
lex_internal state:47, row:1, column:14
skip character:10
skip character:10
consume character:'f'
consume character:'i'
consume character:'b'
consume character:'o'
consume character:'n'
consume character:'a'
consume character:'c'
consume character:'c'
consume character:'i'
consume character:'f'
lexed_lookahead sym:unquoted_atom, size:11
reduce sym:module_name, child_count:6
shift state:11
process version:0, version_count:1, state:11, row:3, col:9
lex_internal state:45, row:3, column:9
consume character:'('
lexed_lookahead sym:(, size:1
reduce sym:atom, child_count:1
shift state:394
process version:0, version_count:1, state:394, row:3, col:10
lex_internal state:47, row:3, column:10
consume character:'1'
lexed_lookahead sym:integer_token3, size:1
shift state:12
process version:0, version_count:1, state:12, row:3, col:11
lex_internal state:45, row:3, column:11
consume character:')'
lexed_lookahead sym:), size:1
reduce sym:integer, child_count:1
reduce sym:term, child_count:1
reduce sym:pattern, child_count:1
shift state:807
process version:0, version_count:1, state:807, row:3, col:12
lex_internal state:0, row:3, column:12
skip character:' '
consume character:'-'
consume character:'>'
lexed_lookahead sym:->, size:3
shift state:84
process version:0, version_count:1, state:84, row:3, col:15
lex_internal state:47, row:3, column:15
skip character:10
skip character:' '
skip character:' '
consume character:'1'
lexed_lookahead sym:integer_token3, size:4
shift state:215
process version:0, version_count:1, state:215, row:4, col:3
lex_internal state:2, row:4, column:3
skip character:10
I've narrowed it down to that rule being the one that takes an infinity to parse. If you keep the unary operator expression one things should work fine.
So I'll be reworking that one soon :) thanks for finding this!
@wingyplus I've done 2 rewrites of the thing and I can't seem to get it to work well.
It seems that supporting both free-standing expressions and a proper module, clashes.
E.g, the Erlang expression language is embedded into the Erlang module language.
So we may need to provide 2 separate grammars.