comby-tools/comby

C language is including ';' in expression syntax

charles-gray opened this issue · 2 comments

Describe the bug
When I use a hole with expression-like syntax (:[foo:e]) in C code, the match includes the ; (semicolon) character. ; isn't a valid expression token in C.

Reproducing

This link showcases some examples of it matching the ; and some ways to break it.

bit.ly/3UNAuki

Expected behavior
I expect to be able to match an expression without the trailing semicolon.

Additional context
The same is true for the comma (,) token, though that can be part of an expression depending on context. I'm not sure I expect comby to be smart enough to tell the difference, so I'm not sure it belongs in this bug report.

Hi @charles-gray! The explanation is that this hole really is expression-like, not strict C-expression matching. Comby is not smart enough to tell the difference. The examples that "break" the match contain what comby considers non-expression tokens: comments, and spaces at the top level (i.e., not inside (...)).

Note that in many languages (and I think C is included here), a syntactic statement ending in a semicolon is considered an expression. So under strict C-expression matching of your examples, I would expect the trailing ; to always be matched (rather than never, if I am following what you would expect).

As a workaround, you can strip the ;, or fine-tune how it is matched, with a regular expression matcher, since this is probably a lexical concern most of the time.
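For illustration, here is a minimal sketch of the "stripping" half of that workaround, done as a post-processing step in Python rather than inside comby itself. The helper name strip_trailing_semicolon is hypothetical and not part of comby; it assumes you already have the text captured by the :[foo:e] hole as a string.

```python
import re

# Hypothetical helper (not a comby API): removes one trailing ';' and any
# whitespace around it from text captured by an expression-like hole.
_TRAILING_SEMICOLON = re.compile(r"\s*;\s*\Z")


def strip_trailing_semicolon(captured: str) -> str:
    """Return `captured` without a final ';' (and surrounding whitespace)."""
    return _TRAILING_SEMICOLON.sub("", captured, count=1)


if __name__ == "__main__":
    # Semicolons that are not at the very end are left untouched.
    for text in ["a + b;", "f(x) ;  ", "a + b", "for(;;)"]:
        print(repr(text), "->", repr(strip_trailing_semicolon(text)))
```

This is purely lexical: it only looks at the end of the captured string and makes no attempt to decide whether the ; belongs to the expression in the C grammar sense.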

Feel free to close if this answers your question :-)

Thanks for the prompt response!

I've always assumed the ; was part of a statement, not an expression. Going by the first Google result for a C grammar that I can read, the use case I'm looking at falls under an "expression-statement", so I guess we're both right.

My question, then: I see that comby supports custom language definitions. I'd love to tweak the C definition to see if I can bend it to my current use case (I've encountered this semicolon problem before!). The C definition in the comby source seems to be hard-coded in ML. Is there a way to spit out the definition for C in JSON, or is there a reference example somewhere I can adapt?