zkemail/zk-regex

Handle international characters correctly

Divide-By-0 opened this issue · 0 comments

We can capture multi-byte characters directly in the regex definition like this, but not in the static regex string.

In a regex definition, e.g. (\u00c3|\u00c4|\u00c5)+ translates to a multi_or of 195, 196, 197 in circom, which is correct.

But we can't do this in a static string, e.g. \u00c3\u00c4\u00c5, so we have to update the circom code by hand. It would be good to enable this in the script at least.
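
As a rough sketch of what the script-side handling could look like: the snippet below expands \uXXXX escapes in a static string into the byte values that would need to be matched one after another. The function name static_string_to_bytes is hypothetical and not part of zk-regex. It assumes each escape stands for a single raw byte (0 to 255), matching the 195, 196, 197 mapping described above, and that any unescaped text is encoded as UTF-8.

```python
import re

# Hypothetical helper, not part of zk-regex: matches a backslash-u escape
# followed by exactly four hex digits, e.g. \u00c3.
_ESCAPE = re.compile(r"\\u([0-9a-fA-F]{4})")


def static_string_to_bytes(static: str) -> list:
    """Expand \\uXXXX escapes in a static string into a list of byte values.

    Assumption: each escape denotes one raw byte (0..255), as in the
    regex-definition case where \\u00c3 becomes 195. Unescaped text is
    encoded as UTF-8.
    """
    out = []
    pos = 0
    for match in _ESCAPE.finditer(static):
        # Literal text before the escape, as UTF-8 bytes.
        out.extend(static[pos:match.start()].encode("utf-8"))
        value = int(match.group(1), 16)
        if value > 0xFF:
            raise ValueError(
                "escape %s does not fit in one byte" % match.group(0)
            )
        out.append(value)
        pos = match.end()
    # Any literal text after the last escape.
    out.extend(static[pos:].encode("utf-8"))
    return out


if __name__ == "__main__":
    print(static_string_to_bytes(r"\u00c3\u00c4\u00c5"))
```

The resulting list could then drive the per-byte equality checks in the generated circom instead of being edited in by hand. How that list is wired into the generator is left open here, since it depends on the script's internals.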