tree-sitter/tree-sitter-c

bug: Parsing error on the function definition

Raghava-Ch opened this issue · 3 comments

Did you check existing issues?

  • I have read all the tree-sitter docs if it relates to using the parser
  • I have searched the existing issues of tree-sitter-c

Tree-Sitter CLI Version, if relevant (output of tree-sitter --version)

No response

Describe the bug

Parsing a function definition that is prefixed with macros such as STATIC, INLINE, BOOL, or MODULE_NAME (all of which are defined elsewhere) produces a missing-semicolon error. An example is given below.

However, I have put together a workaround, shown below, that can be used until this issue is fixed.

Function definition grammar

function_definition: ($) => seq(
  optional($.ms_call_modifier),
  repeat1($._declaration_specifiers),
  field("declarator", $._declarator),
  field("body", $.compound_statement),
),

Conflicts entries:

[$.function_definition, $.declaration, $._old_style_function_definition],
[$.function_definition, $._old_style_function_definition],
[$.declaration, $.function_definition],

Steps To Reproduce/Bad Parse Tree

Paste the code below into the tree-sitter playground and observe the parse error in the syntax tree.

Expected Behavior/Parse Tree

The code should parse without errors, as in the example below.

STATIC INLINE int do_stuff(int arg1) {
  return 5;
}

(translation_unit
  (function_definition
    type: (type_identifier)
    type: (type_identifier)
    type: (primitive_type)
    declarator: (function_declarator
      declarator: (identifier)
      parameters: (parameter_list
        (parameter_declaration
          type: (primitive_type)
          declarator: (identifier))))
    body: (compound_statement
      (return_statement
        (number_literal)))))

Repro

STATIC INLINE int do_stuff() {
  return 5;
}

amaanq commented

Custom macros generally won't have great parsing output, since tree-sitter isn't context-aware, sorry.
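
To illustrate the point about context, here is a minimal sketch of how such macros might be defined. The definitions of STATIC and INLINE are assumptions (the issue does not show them); the empty MODULE_NAME follows the pattern described in this thread. A compiler only sees the function after the preprocessor has expanded the macros, whereas tree-sitter parses the unexpanded source, where STATIC and INLINE are just identifiers in front of `int`.

```c
/* Assumed definitions: the issue does not show how STATIC and INLINE are defined. */
#define STATIC static
#define INLINE inline

/* An empty macro expands to nothing. */
#define MODULE_NAME

/* After preprocessing, the compiler sees:
 *   static inline int do_stuff(int arg1) { ... }
 * tree-sitter never runs the preprocessor, so it sees the line as written. */
MODULE_NAME STATIC INLINE int do_stuff(int arg1) {
  (void)arg1; /* unused in this sketch */
  return 5;
}
```

Calling `do_stuff(1)` from any function in the same file returns 5, which shows the definition is ordinary C once the macros are expanded.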

Hi amaanq,
Sorry for writing again.
I have observed this pattern of function definitions in most embedded systems codebases.
Companies usually prepend a module name to function definitions.
Example:

#define MODULE_NAME
#define LOCAL static

MODULE_NAME LOCAL int foo() {
}

I understand that tree-sitter doesn't know the context, but I expected it to at least parse without errors.

The workaround I provided at least parses without errors, and it didn't break any existing test cases.

I kindly ask you to reconsider.

amaanq commented

We can't cherry-pick conventions used in specific sectors. That would just open the gates for anyone in any environment or sector to pitch for their own popular macros. If I were you, I would fork this, edit in the qualifiers/modifiers you need, and use that. I've done this before for specific C parsing needs, such as IDA Pro's decompiler output of C code.