skvadrik/re2c

Are quotes mandatory around literals?

Closed this issue · 4 comments

Most regex parsers allow you to write a regex like this (maybe without surrounding quotes depending on the language):
re = "literal[A-Z]+"
You don't have to tell the regex parser that literal is a string literal.
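To make the comparison concrete, here is a minimal sketch using Python's standard `re` module as a stand-in for "most regex parsers". The helper name `matches` is only illustrative and is not part of any tool discussed here.

```python
import re

# In a conventional regex engine, ordinary characters match themselves,
# so "literal" needs no quoting inside the pattern.
pattern = re.compile(r"literal[A-Z]+")

def matches(text: str) -> bool:
    """Return True if the whole string matches the pattern."""
    return pattern.fullmatch(text) is not None

if __name__ == "__main__":
    print(matches("literalABC"))  # literal prefix plus uppercase letters
    print(matches("literal"))     # no uppercase letters after the prefix
```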

From what I have seen in the docs, re2c expects the same regex to be written as:
re = "literal" [A-Z]+
This is confusing, because people writing regexes have to keep this in mind.

Is there a simpler way? Or a tool similar to re2c that might do what I expect?
Thanks in advance.

Hi, re2c has an option --flex-syntax that allows unquoted string literals.

A similar tool that you may want to consider is Flex.

It seems that --flex-syntax treats the whole regex as a string literal. Am I missing something?

No, --flex-syntax allows character classes and the usual regex operators (alternative, star, plus, etc.). If you have an example grammar that is not working you can post it here and I'll try to help.

As a general note, it is traditional that lexical analyzer generators use non-standard regex notation because of the inclusion of macros, the needs of lexers, etc.
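One way to picture why macros push lexer generators toward quoted literals: if a bare name refers to a named definition, the tool needs some other marker for text that should match itself. The sketch below is a toy translator written for illustration only. It is not how re2c is implemented, and the names `TOKEN`, `to_python_regex`, and the `upper` definition are all hypothetical. It assumes a simplified notation with double-quoted literals (no escapes), bracketed character classes, bare macro names, and insignificant whitespace.

```python
import re

# Toy tokenizer for a simplified lexer-style notation (an assumption,
# not the actual re2c grammar).
TOKEN = re.compile(
    r'"(?P<lit>[^"]*)"'          # quoted literal: matches itself
    r'|(?P<cls>\[[^\]]*\])'      # character class: passed through
    r'|(?P<name>[A-Za-z_]\w*)'   # bare name: reference to a definition
    r'|(?P<ws>\s+)'              # whitespace between items is ignored
    r'|(?P<op>.)'                # operators such as + * | ( )
)

def to_python_regex(spec: str, definitions: dict) -> str:
    """Translate the toy notation into a Python regex string."""
    out = []
    for m in TOKEN.finditer(spec):
        kind = m.lastgroup
        if kind == "lit":
            out.append(re.escape(m.group("lit")))
        elif kind in ("cls", "op"):
            out.append(m.group())
        elif kind == "name":
            # Without quotes, a word is looked up as a macro, not matched as text.
            out.append("(?:" + definitions[m.group("name")] + ")")
        # Whitespace tokens are dropped.
    return "".join(out)

if __name__ == "__main__":
    definitions = {"upper": "[A-Z]"}  # hypothetical named definition
    print(to_python_regex('"literal" upper+', definitions))
```

In this toy notation, writing `literal` without quotes would be looked up as a definition name and fail, which is the ambiguity that quoting resolves.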