Match misplaced by 2
YarikMR opened this issue · 2 comments
YarikMR commented
When ’ (RIGHT SINGLE QUOTATION MARK, U+2019) appears in the string being searched, the match span is displaced by 2. For example:
import re2
string = "it’s your car"
regex = re2.compile('you')
match = regex.search(string)
pattern_match = string[match.start():match.end()]
print(pattern_match)
'ur c'
sfilipco commented
Hi @YarikMR. I am sorry, but at this time we do not plan to add explicit support for Unicode.
The workaround that I suggest is:
string.encode()[match.start():match.end()].decode()
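For anyone who needs character offsets rather than just the matched text, here is a minimal sketch of converting the span. It assumes the offsets returned by the wrapper are positions in the UTF-8 encoded bytes, which is what the workaround above suggests but is not confirmed here. To stay runnable without the wrapper installed, it uses the standard library `re` on bytes as a stand-in for `re2`, and `byte_span_to_char_span` is a hypothetical helper name, not part of any library.

```python
import re  # stdlib stand-in; assumed to mimic the byte offsets re2 reports


def byte_span_to_char_span(text, start, end, encoding="utf-8"):
    """Convert a (start, end) span over encoded bytes to a span over characters.

    Hypothetical helper: assumes start and end fall on character boundaries.
    """
    data = text.encode(encoding)
    # Characters before the match = decoded length of the byte prefix.
    char_start = len(data[:start].decode(encoding))
    # Characters in the match = decoded length of the matched bytes.
    char_end = char_start + len(data[start:end].decode(encoding))
    return char_start, char_end


text = "it’s your car"
data = text.encode("utf-8")

# Searching the bytes yields byte offsets, like the span in the report.
match = re.compile(b"you").search(data)
byte_start, byte_end = match.span()

# ’ is 3 bytes in UTF-8 but 1 character, so byte offsets run 2 ahead.
shift = len("’".encode("utf-8")) - len("’")

# Slicing the str with byte offsets lands 2 characters too far to the right.
wrong = text[byte_start:byte_end]

# Slicing the bytes and decoding (the suggested workaround) is correct.
workaround = data[byte_start:byte_end].decode("utf-8")

# Converting the span first lets you keep indexing the original str.
char_start, char_end = byte_span_to_char_span(text, byte_start, byte_end)
converted = text[char_start:char_end]

print(shift, repr(wrong), repr(workaround), repr(converted))
```

The conversion re-encodes the string on every call, which is fine for occasional matches; for many matches on one large string, encoding once and reusing the bytes would be cheaper.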
sfilipco commented
If you want to send a pull request, feel free to reopen the issue. Thank you!