aff-regex
doublex opened this issue · 6 comments
This AFF file (Czech) contains an invalid regex:
https://github.com/wooorm/dictionaries/blob/main/dictionaries/cs/index.aff#L2119
As a result, this line fails with re.error: unterminated character set at position 36:
https://github.com/zverok/spylls/blob/master/spylls/hunspell/data/aff.py#L266
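For reference, this kind of failure can be reproduced in isolation. The snippet below is a minimal sketch, not code from spylls: the helper name try_compile and the pattern are made up for illustration, and the pattern is not the actual condition from the Czech file.

```python
import re


def try_compile(pattern):
    """Return (compiled_regex, None) on success or (None, message) on failure."""
    try:
        return re.compile(pattern), None
    except re.error as exc:
        # re.error carries a human-readable message with the failing position.
        return None, str(exc)


# Hypothetical malformed condition: the opening "[" is never closed.
compiled, error = try_compile("[^aeiou")
print(error)

# A well-formed condition compiles without an error message.
ok_regex, ok_error = try_compile("[^aeiou]y")
```

Catching the exception like this would let a loader report which affix line is malformed instead of aborting with a bare traceback.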
What are you suggesting here? What's the desired behavior for definitely-wrong dictionary files?
You are right - the problem is the affix file.
But there may still be an issue in spylls: this affix file looks correct, yet it fails too:
https://github.com/wooorm/dictionaries/blob/main/dictionaries/uk/index.aff#L1464
@doublex Ugh, this is more complicated. It seems I've never encountered dictionaries with () in conditions before, even when running smoke tests on all the dictionaries that were available when spylls was finalized (I'm not even sure Hunspell supports this syntax). I'll try to take a closer look in the next few days.
They are a rare (strange?) case. Maybe simply remove ()?
Surprisingly enough, this case, while indeed rare, made me rethink why it is a problem in the first place... and simplify the code so that it no longer is one :)
See f92f74b: there are significant simplifications in spylls/hunspell/data/aff.py, dropping the hacky regexp construction.
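To illustrate why () in a condition can trip up a regex-based matcher, here is a rough sketch of one way to translate a condition so that parentheses are matched literally. It is not the code from f92f74b. It assumes that only "." and bracketed sets ("[...]" and "[^...]") are special in a condition and that everything else, including parentheses, is a literal character. The function name condition_to_regex and the sample condition are hypothetical.

```python
import re


def condition_to_regex(condition, suffix=True):
    """Translate a Hunspell-style condition into a compiled regex (sketch).

    Assumption: only "." and "[...]" / "[^...]" are special; every other
    character, including "(" and ")", is treated as a literal.
    """
    parts = []
    i = 0
    while i < len(condition):
        ch = condition[i]
        if ch == "[":
            end = condition.find("]", i + 1)
            if end == -1:
                raise ValueError(f"unterminated character set in {condition!r}")
            body = condition[i + 1:end]
            negated = body.startswith("^")
            if negated:
                body = body[1:]
            if not body:
                raise ValueError(f"empty character set in {condition!r}")
            # Escape the members so "re" does not reinterpret any of them.
            parts.append("[" + ("^" if negated else "") + re.escape(body) + "]")
            i = end + 1
        elif ch == ".":
            parts.append(".")
            i += 1
        else:
            # Literal character: "(" becomes "\(" instead of opening a group.
            parts.append(re.escape(ch))
            i += 1
    pattern = "".join(parts)
    # Suffix conditions are anchored at the end, prefix ones at the start.
    return re.compile((pattern + "$") if suffix else ("^" + pattern))


# Hypothetical condition containing literal parentheses.
rx = condition_to_regex("[^o](x)")
```

With this translation, rx matches a stem ending in "(x)" that is preceded by any character other than "o", whereas passing "[^o](x)" straight to re.compile would treat "(x)" as a capture group and match a plain "x".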
Released as 0.1.7, works with uk_UA as expected.
Thanks a lot for all your efforts!