javallone/regexper-static

Octal character literal is mistaken for backreference


The expression /\101/ matches the character with the ASCII code octal 101 (decimal 65), which is A.

Here's a test: https://jsfiddle.net/ekaeb61p/

However, Regexper wrongly interprets the above pattern as meaning a backreference to group 1 (the \1), followed by a literal "01": https://regexper.com/#%2F%5C101%2F

This and #38 interact in some really interesting ways.

First of all, it looks like octal escapes are deprecated in ES5, so once I have these parsed correctly I'll add a warning about using them as well. That doesn't make this any less of a bug, though.

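It looks like octal sequences can be masked by backreferences. For example, /a\15/ will match "a\r" (octal 15 is decimal 13, a carriage return), but /(((((((((((((((a)))))))))))))))\15/ matches "aa", because once the pattern has enough capture groups, \15 is read as a backreference to group 15 instead. My guess is that this ambiguity is part of the reason octal escapes were deprecated.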
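The masking described above can be checked directly in a JavaScript engine. This is a minimal sketch, assuming a non-Unicode pattern (no `u` flag) in an engine that supports legacy octal escapes; the variable names are only for illustration.

```javascript
// Patterns are built with the RegExp constructor so the backslashes are explicit.

// No capture groups: \101 is read as octal 101 (decimal 65), i.e. "A".
const octalA = new RegExp("\\101");

// No capture groups: \15 is read as octal 15 (decimal 13), a carriage return.
const octalCR = new RegExp("a\\15");

// Fifteen nested capture groups: \15 is now a backreference to group 15,
// the innermost group, which captured "a".
const backref15 = new RegExp("(".repeat(15) + "a" + ")".repeat(15) + "\\15");

console.log(octalA.test("A"));       // octal escape matches "A"
console.log(octalCR.test("a\r"));    // octal escape matches the carriage return
console.log(octalCR.test("aa"));     // no group 15 exists, so not a backreference
console.log(backref15.test("aa"));   // backreference repeats the captured "a"
console.log(backref15.test("a\r"));  // the octal reading no longer applies
```

The same escape text, `\15`, changes meaning purely because of how many groups the rest of the pattern defines.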

Right now, the parser doesn't take the number of available capture groups into account when deciding whether an escape is a backreference or an octal escape, and I don't think I can add that logic in the current setup. I'm going to start a rewrite soon, and part of the plan is to drop Canopy as a parser generator and write the parser by hand instead, so I should be able to add the conditions needed to parse this correctly.
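As a rough illustration of the condition a hand-written parser would need, here is a sketch of the decision. It is not Regexper's code: `classifyNumericEscape` and its return shape are hypothetical, and the rules reflect my reading of the legacy (non-Unicode) behavior, where the full run of decimal digits is tried as a group number first and the octal reading is only a fallback.

```javascript
// Hypothetical helper: classify the digits that follow a backslash.
// `source` is the pattern text right after the backslash and is assumed to
// start with a decimal digit; `groupCount` is the total number of capture
// groups in the whole pattern.
function classifyNumericEscape(source, groupCount) {
  // 1. Take every decimal digit; it is a backreference only if that group exists.
  const decimal = /^[1-9][0-9]*/.exec(source);
  if (decimal && Number(decimal[0]) <= groupCount) {
    return { type: "backreference", group: Number(decimal[0]), length: decimal[0].length };
  }

  // 2. Otherwise fall back to a legacy octal escape: up to three octal digits,
  //    with a value no greater than octal 377.
  const octal = /^(?:[0-3][0-7]{0,2}|[4-7][0-7]?)/.exec(source);
  if (octal) {
    return { type: "octal", code: parseInt(octal[0], 8), length: octal[0].length };
  }

  // 3. \8 or \9 with no such group: treat the digit as a literal character.
  return { type: "literal", char: source[0], length: 1 };
}

console.log(classifyNumericEscape("101", 0)); // octal, code 65
console.log(classifyNumericEscape("101", 1)); // still octal: group 101 does not exist
console.log(classifyNumericEscape("15", 15)); // backreference to group 15
```

Because step 1 needs the final group count, the parser has to know how many groups the whole pattern contains before it can classify any numeric escape, for instance by counting them in a first pass.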