joelspadin/tree-sitter-devicetree

Unable to parse comments which don't contain whitespace

nickcoutsos opened this issue · 2 comments

Is this behaviour correct? I was surprised because of how common it is for people to embed ASCII layout diagrams in their keymaps as comments but looking back I can't find any that have no whitespace anywhere after the //.

I can see how it might get confused by the parser since devicetree identifiers can support slashes, but I tried including such a comment in a ZMK keymap and it seemed to build without complaint.

Here are some node REPL tests for various combinations of comments with/without whitespace and how they are recognized:

Success

> parser.parse('//foo bar').rootNode.toString()
'(document (comment))'
> parser.parse('//foo ').rootNode.toString()
'(document (comment))'
> parser.parse('// foo ').rootNode.toString()
'(document (comment))'

Error

> parser.parse('//foo').rootNode.toString()
'(document (ERROR (identifier)))'
> parser.parse('//foo\n').rootNode.toString()
'(document (ERROR (identifier)))'

P.S. Thanks for creating this! I'm still trying to come to grips with the tree-sitter API but having this grammar has been incredibly helpful in my project!

I was looking at things again, and I see that the pattern for a node path seems like it would overshadow even a simpler comment regular expression.

Would it make sense to just move up the rule for comments? Making the change locally I see the tests pass (even changing an existing comment to something like //foo) but I don't know if the existing tests make use of node path.

Looking at the devicetree specification the only mention of comments is

C style (/* ... */) and C++ style (//) comments are supported.

I've been meaning to go back and rewrite the syntax to match https://github.com/dgibson/dtc/blob/main/dtc-lexer.l instead of just reverse engineering it from documentation and file samples. Your change looks good as a quick fix for now though. Thanks!