Incompatible `freqlemlivres` and `freqlivres`
alephpi opened this issue · 2 comments
alephpi commented
Hi, thanks for your awesome work!
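However, when I use Lexique383.tsv, I observe the following:
[image: image]
<https://user-images.githubusercontent.com/61275421/239735119-a34060ee-ce53-4b9c-b774-982ee2046715.png>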
From the manual I understand that `freqlemlivres` should be the frequency of the word's lemma and `freqlivres` should be the frequency of the word itself, right?
But as we see in the table, the lemmas of `danse` (35155), `danser` (35158) and `danseur` (35172) are the words themselves, while these two fields are not equal. Why?
chrplr commented
Hello,
"danse - Noun", "danse - Verb" and "danseur - Noun" are different lemmas
according to the parser we used.
92.57 is the sum of the frequencies of all the inflected forms of the verb "danser".
25.68 is the sum of the frequencies of all the inflected forms of the noun
"danseur" (danseur singular + plural + feminine singular + feminine plural).
35.27 is the sum of the frequencies of the inflected forms of "danse" (danse singular +
danse plural).
(And as far as I can see, `freqlemlivres` is indeed the sum of the
relevant `freqlivres` values.)
I am not sure what you expected.
--
Christophe Pallier (http://www.pallier.org)
INSERM Cognitive Neuroimaging Lab (http://www.unicog.org)
alephpi commented
Yeah, you're right. I thought the `freqlemlivres` and `freqlivres` of `danse` should be equal, but it turns out that `freqlemlivres` is actually the sum of the `freqlivres` of all the words whose lemma is `danse`.
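For anyone who wants to check this relationship themselves, here is a minimal sketch. It assumes that a lemma is identified by the pair (`lemme`, `cgram`), which is my reading of the "danse - Noun" / "danse - Verb" distinction above, and that the file uses the `lemme`, `cgram` and `freqlivres` column names. The helper names (`sum_freq_by_lemma`, `load_rows`) are hypothetical, and the toy frequencies are made up for illustration; they are not the real Lexique values.

```python
import csv
from collections import defaultdict


def sum_freq_by_lemma(rows, freq_col="freqlivres"):
    """Sum a per-word frequency column over all rows sharing (lemme, cgram)."""
    totals = defaultdict(float)
    for row in rows:
        # Assumption: the lemma is identified by its form plus its category.
        totals[(row["lemme"], row["cgram"])] += float(row[freq_col])
    return dict(totals)


def load_rows(path="Lexique383.tsv"):
    """Read the tab-separated lexicon into a list of dicts (one per word form)."""
    with open(path, encoding="utf-8", newline="") as f:
        return list(csv.DictReader(f, delimiter="\t"))


# Toy rows with made-up frequencies (NOT the real Lexique values).
toy_rows = [
    {"ortho": "danse", "lemme": "danse", "cgram": "NOM", "freqlivres": "30.0"},
    {"ortho": "danses", "lemme": "danse", "cgram": "NOM", "freqlivres": "5.0"},
    {"ortho": "danse", "lemme": "danser", "cgram": "VER", "freqlivres": "12.0"},
    {"ortho": "danser", "lemme": "danser", "cgram": "VER", "freqlivres": "40.0"},
]

toy_totals = sum_freq_by_lemma(toy_rows)
# The noun "danse" and the verb "danser" get separate totals, even though
# the form "danse" appears under both.
print(toy_totals)
```

On the real file, `sum_freq_by_lemma(load_rows())` could be compared against the `freqlemlivres` column row by row. Small differences may show up because the published values are rounded.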