-RUS tag vs -RUS-BACK and -RUS-FRONT
mansayk opened this issue · 4 comments
Hi!
I fixed two twol rules because some forms were not handled correctly, for example часть, частьны, частька.
79a5b1a
Currently we mark with the -RUS tag only those loanwords that accept affixes with back vowels. That means we cannot use the -RUS tag for loanwords that accept affixes with front vowels. What is the best solution here? Maybe we need to replace the -RUS tag with two different ones: -RUS-BACK and -RUS-FRONT?
Russian words that take front-vowel endings all have front vowels in their final syllables or end in palatalised consonants, right? Which means they trigger normal Tatar front-vowel endings [more or less] just like native Tatar words? If this is the case, then just use the normal word classes, like N1. This "default" precludes the need for a separate lexicon. Are there any examples that won't work this way? We can probably account for any specific exceptions with further clarifications to the twol rules.
Also, are you sure those changes to the twol rules didn't break other things?
I made a test based on words collected from the corpus:
https://github.com/apertium/apertium-tat/tree/master/tests-tatcorpus
It can help catch regressions.
Hmm, I don't think a new testing framework is what we need here. We already have several testing frameworks in place. I can explain the yaml-based one; @IlnarSelimcan can probably explain the others.
Actually, it is not for testing but for tracking the effect of code changes on a big dictionary. You don't need to use it, but I will use it to find new words and to see changes in words that are not in the yaml files yet.
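To illustrate the idea of tracking changes across a big word list, here is a minimal Python sketch. It is not the actual script in tests-tatcorpus. It assumes, hypothetically, that the analyser output for the corpus words has been saved before and after a change as text with one "surface form, tab, analysis" pair per line. The function names and the sample analyses are made up for illustration.

```python
def parse_analyses(text):
    """Map each surface form to the set of analyses listed for it.

    Assumed (hypothetical) format: one "surface<TAB>analysis" pair per line.
    """
    result = {}
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        surface, _, analysis = line.partition("\t")
        result.setdefault(surface, set()).add(analysis)
    return result


def diff_analyses(before, after):
    """Return {surface: (old_analyses, new_analyses)} for forms that changed."""
    changes = {}
    for surface in sorted(set(before) | set(after)):
        old = before.get(surface, set())
        new = after.get(surface, set())
        if old != new:
            changes[surface] = (sorted(old), sorted(new))
    return changes


# Made-up sample data; the tags shown are illustrative only.
old_run = parse_analyses("часть\tчасть<n><nom>\nчастьны\t*частьны\n")
new_run = parse_analyses("часть\tчасть<n><nom>\nчастьны\tчасть<n><acc>\n")

for surface, (old, new) in diff_analyses(old_run, new_run).items():
    print(surface, old, "->", new)
```

Only the words whose analyses differ between the two runs are reported, so the output stays small even when the word list is large.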