haskell-hvr/regex-pcre

Incorrect matching of Unicode strings

I think that strings containing non-ASCII Unicode characters are matched incorrectly. For example, using regex-pcre in ghci gives an unexpected result:

> "Č123456" =~ "4" :: (String, String, String)
("\268\&1234","5","6")

A similar example with only ASCII characters produces the expected result:

> "C123456" =~ "4" :: (String, String, String)
("C123","4","56")

Tested with regex-pcre 0.95.0.0 and GHC 8.6.5.

I should add that the libpcre installed on my system is working fine.

This bug is probably related to incorrect matching of strings that contain Unicode.
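
One possible explanation, not verified against the regex-pcre source: the one-position shift in the first example is what you would get if the match offset were counted in UTF-8 bytes (Č takes two bytes) and then used as a character index into the String. The sketch below reproduces both results with base functions only and does not call regex-pcre. The names `utf8Width`, `charOffset`, `byteOffset`, `splitAtMatch` and `splitUsing` are hypothetical helpers made up for this illustration, not part of any library.

```haskell
import Data.Char (ord)
import Data.List (isPrefixOf)

-- Number of bytes a character occupies when encoded as UTF-8.
utf8Width :: Char -> Int
utf8Width c
  | n < 0x80    = 1
  | n < 0x800   = 2
  | n < 0x10000 = 3
  | otherwise   = 4
  where
    n = ord c

-- Character index of the first occurrence of needle in haystack.
charOffset :: String -> String -> Maybe Int
charOffset needle = go 0
  where
    go i rest
      | needle `isPrefixOf` rest = Just i
      | otherwise = case rest of
          []       -> Nothing
          (_ : xs) -> go (i + 1) xs

-- Offset of the same occurrence, counted in UTF-8 bytes instead of characters.
byteOffset :: String -> String -> Maybe Int
byteOffset needle haystack = do
  i <- charOffset needle haystack
  pure (sum (map utf8Width (take i haystack)))

-- Split a String into (before, match, after), treating the offset and
-- length as character counts.
splitAtMatch :: Int -> Int -> String -> (String, String, String)
splitAtMatch off len s = (before, matched, after)
  where
    (before, rest)   = splitAt off s
    (matched, after) = splitAt len rest

-- Find the needle with the given offset function, then split by characters.
-- Passing byteOffset here models the suspected mix-up (an assumption).
splitUsing
  :: (String -> String -> Maybe Int)
  -> String
  -> String
  -> Maybe (String, String, String)
splitUsing offsetOf needle haystack = do
  off <- offsetOf needle haystack
  pure (splitAtMatch off (length needle) haystack)
```

In ghci, `splitUsing byteOffset "4" "Č123456"` evaluates to `Just ("\268\&1234","5","6")`, the same triple as in the report, while `splitUsing charOffset "4" "Č123456"` gives `Just ("\268\&123","4","56")`. For the ASCII input "C123456" both offset functions agree, which would explain why the second example looks correct.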