funderburkjim/elispsanskrit

compare to Huet, nouns

Opened this issue · 4 comments

This issue summarizes differences and similarities noticed in the comparison of noun declensions; the methodology is described in the readme for huetcompare/nouns.

stem-gender comparison

Refer to huet_mwstems.

9822 stems are found in SL_nouns.xml, which lists inflected forms.
After taking into account homonyms and duplicate stems, there are 9728
stems remaining, which are shown also in huet_mwstems.

An attempt is made to match these stems to the stems found in two sources
of the elispsanskrit/pysanskrit system, namely a
list of noun stems from MW and a list of adjective stems from MW.

Matches on the stem are found in 8404 (86%) of the 9728 cases of huet_mwtest, while 1324 cases remain unmatched.

For the matching stems, no attempt has yet been made to compare the gender information.
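As a rough illustration of the stem matching described above, here is a minimal sketch. The function names and the toy stem lists are invented for this example and are not taken from the huetcompare code; only the counts 8404, 1324 and 9728 come from the text.

```python
def match_stems(huet_stems, mw_noun_stems, mw_adj_stems):
    """Split huet_stems into (matched, unmatched) lists.

    A stem counts as matched if it occurs in either MW list.
    Gender is deliberately ignored, as in the comparison above.
    """
    known = set(mw_noun_stems) | set(mw_adj_stems)
    matched = [s for s in huet_stems if s in known]
    unmatched = [s for s in huet_stems if s not in known]
    return matched, unmatched


def percent(part, whole):
    """Percentage of part in whole, rounded to the nearest integer."""
    return round(100.0 * part / whole)


# Toy data, invented for illustration (SLP1 transliteration).
huet = ["aNkana", "deva", "nadI", "xyz"]
matched, unmatched = match_stems(huet, ["deva", "nadI"], ["aNkana"])
print(len(matched), len(unmatched))

# Counts reported above: 8404 matched + 1324 unmatched = 9728 stems.
print(percent(8404, 9728))
```

With the reported counts, the matched share works out to the 86% quoted above.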

declension table comparison

Refer to compare_noun_tables.

From the 9728 stems of huet_mwstems, there are 18387 stem-gender variations. For each of these 18387 cases, the huet_noun_tables.txt file shows the declension table extracted from the SL_nouns.xml file.

A corresponding pysan_noun_tables.txt file shows declension tables derived by a computation using the pysanskrit declension code.

Then, the compare_noun_tables file compares the corresponding declension tables in the two files.

*The Huet inflected forms of SL_nouns.xml contain no vocative case inflected forms.*

Ignoring this systematic difference, about 87% (15987 cases out of 18387 cases) of the declension tables are otherwise identical in the two systems.
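The comparison that skips the vocative can be sketched as follows. This is only an assumption about how one might do it: the table layout (a dict keyed by `(case, number)`), the case abbreviations, and the function name `tables_agree` are hypothetical and do not reflect the actual format of huet_noun_tables.txt or pysan_noun_tables.txt.

```python
# Cases that take part in the comparison; the vocative is left out
# because SL_nouns.xml has no vocative forms.
CASES_COMPARED = ("nom", "acc", "ins", "dat", "abl", "gen", "loc")
NUMBERS = ("sg", "du", "pl")


def tables_agree(huet_table, pysan_table, cases=CASES_COMPARED):
    """Return True if both tables have the same form in every compared cell.

    Tables are dicts mapping (case, number) to an inflected form.
    A cell missing from both tables is treated as equal.
    """
    for case in cases:
        for number in NUMBERS:
            key = (case, number)
            if huet_table.get(key) != pysan_table.get(key):
                return False
    return True


# Toy partial tables for deva (masculine), SLP1 transliteration.
huet_deva = {("nom", "sg"): "devaH", ("acc", "sg"): "devam"}
pysan_deva = {("nom", "sg"): "devaH", ("acc", "sg"): "devam",
              ("voc", "sg"): "deva"}  # extra vocative is ignored
print(tables_agree(huet_deva, pysan_deva))

# Share of identical tables reported above.
print(round(100.0 * 15987 / 18387))
```

Restricting the loop to the seven non-vocative cases keeps the systematic difference from counting as a mismatch in every table.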

7% of the cases differ in the feminine gender declension. At least some of these cases are due to a different feminine stem, since SL_nouns.xml does not explicitly provide one. For instance, the stem aNkana has inflected forms in SL_nouns.xml for both neuter and feminine genders. The pysanskrit computation of the feminine declension assumes, in the absence of further information, that the feminine stem is the most usual kind for a noun ending in a, namely aNkanA. However, SL_nouns.xml shows the nominative singular form aNkanI for the feminine declension of aNkana, which Huet declines like nadI. The Pysanskrit computation did not make use of this
observation, so the pysan feminine declension of aNkana is entirely different from that in huet.
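The feminine-stem issue can be made concrete with a small sketch. Both functions are hypothetical and are not the pysanskrit code: the first mimics the default assumption described above (a stem in a takes a feminine in A), and the second shows one possible refinement that consults an observed nominative singular, such as the one found in SL_nouns.xml.

```python
def default_feminine_stem(stem):
    """Default guess: a stem ending in short 'a' forms its feminine in long 'A'.

    SLP1 is case-sensitive, so 'a' and 'A' are different vowels.
    Stems with other endings are returned unchanged.
    """
    if stem.endswith("a"):
        return stem[:-1] + "A"
    return stem


def feminine_stem(stem, observed_nom_sg=None):
    """Prefer a feminine in 'I' when the observed nominative singular shows one.

    For nadI-type feminines the nominative singular equals the stem,
    so the observed form can be used as the stem directly.
    """
    if (observed_nom_sg is not None
            and stem.endswith("a")
            and observed_nom_sg == stem[:-1] + "I"):
        return observed_nom_sg
    return default_feminine_stem(stem)


print(default_feminine_stem("aNkana"))
print(feminine_stem("aNkana", observed_nom_sg="aNkanI"))
```

Whether such a lookup would resolve most of the 7% of differing feminine tables is not established here; it only illustrates the aNkana case.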

Based on this and a couple of other examples, I suspect that most of the 7% feminine declension differences depend on a determination of the appropriate feminine stem. In the aNkana example, I see no immediate justification for the I form of the feminine, but would not be surprised, given Huet's thoroughness, if there were some good justification.

In 100 (0.6%) of the cases, the Pysan computation failed. Many of these were adjectives ending in Iyas. I suspect that this deficiency is due to some peculiarity of the pysanskrit software, rather than a total inability of the software to generate declensions for such forms.

I see no immediate general observations in the cases of differences in the masculine and neuter declensions.

Comment on vocatives.

The vocative inflected forms are seen, along with some other forms, in SL_final.xml.

> The Pysanskrit computation did not make use of this
> observation, so the pysan feminine declension of aNkana is entirely different from that in huet.

So Pysanskrit is more primitive in this regard.

> peculiarity of the pysanskrit software

Yeah, would not take too much PC-power.

> The vocative inflected forms are seen, along with some other forms, in SL_final.xml.

Hard to understand the logic, but still.