sparsetech/translit-scala

Derive rules from text corpus

tindzk opened this issue · 1 comment

The apostrophe rules for Ukrainian are currently based on a small sample of words. We should use a large corpus such as Wikipedia to derive more comprehensive rules that cover more exceptions.

The same should be done for Russian.
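
As a rough illustration of what deriving rules from a corpus could look like, the sketch below counts which letters appear directly before and after an apostrophe in a list of words. It is a minimal sketch and not this project's implementation: the function name `apostrophe_contexts`, the set of apostrophe characters, and the five sample words are all assumptions made for the example. A real run would read tokens from a corpus dump such as Wikipedia.

```python
from collections import Counter

# Assumption: treat these three characters as apostrophes
# (ASCII ', right single quote, modifier letter apostrophe).
APOSTROPHES = "'\u2019\u02bc"


def apostrophe_contexts(words):
    """Count (preceding letter, following letter) pairs around apostrophes."""
    counts = Counter()
    for word in words:
        w = word.lower()
        for i, ch in enumerate(w):
            # Only consider apostrophes with a character on both sides.
            if ch in APOSTROPHES and 0 < i < len(w) - 1:
                before, after = w[i - 1], w[i + 1]
                if before.isalpha() and after.isalpha():
                    counts[(before, after)] += 1
    return counts


# Hypothetical sample; a real corpus would supply many more words.
sample = ["м'ясо", "п\u2019ять", "об'єкт", "сім'я", "м'яч"]
contexts = apostrophe_contexts(sample)

# Print the contexts from most to least frequent.
for (before, after), n in contexts.most_common():
    print(before, after, n)
```

Frequent contexts would be candidates for general rules, while rare ones could be reviewed as possible exceptions.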

The rules were improved on a large sample of words as part of #12 and #13.