eval.py reports higher than 100 aligned accuracy on enhanced dependencies
AngledLuffa opened this issue · 2 comments
ELAS and EULAS scores are higher than 100:
python3 eval.py UD_English-EWT/en_ewt-ud-train.conllu UD_English-EWT/en_ewt-ud-train.conllu -v
Metric | Precision | Recall | F1 Score | AligndAcc
-----------+-----------+-----------+-----------+-----------
Tokens | 100.00 | 100.00 | 100.00 |
Sentences | 100.00 | 100.00 | 100.00 |
Words | 100.00 | 100.00 | 100.00 |
UPOS | 100.00 | 100.00 | 100.00 | 100.00
XPOS | 100.00 | 100.00 | 100.00 | 100.00
UFeats | 100.00 | 100.00 | 100.00 | 100.00
AllTags | 100.00 | 100.00 | 100.00 | 100.00
Lemmas | 100.00 | 100.00 | 100.00 | 100.00
UAS | 100.00 | 100.00 | 100.00 | 100.00
LAS | 100.00 | 100.00 | 100.00 | 100.00
ELAS | 100.00 | 100.00 | 100.00 | 105.02 <---
EULAS | 100.00 | 100.00 | 100.00 | 105.02 <---
CLAS | 100.00 | 100.00 | 100.00 | 100.00
MLAS | 100.00 | 100.00 | 100.00 | 100.00
BLEX | 100.00 | 100.00 | 100.00 | 100.00
If I had to guess without actually looking at the code, maybe it's getting extra credit for lines where there is more than one enhanced dependency to count?
Also, this happens if I do git checkout 799292f54c699fd2ccf90b0b890a0533ccf35fd4
in order to go earlier than my recent changes, so definitely not my fault :P
My intuition is 100% correct:
count of aligned lines, ignoring multiplicity: line 506 in 77500d7
possibility of multiple +1 for a single line: line 513 in 77500d7
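The mismatch above can be sketched as follows. This is a simplified illustration, not the actual eval.py code; the function name, data layout, and example dependencies are all made up. The point is that the denominator counts each aligned word once, while the numerator can be incremented once per matching enhanced dependency, so a word with several enhanced heads pushes the ratio past 100:

```python
# Illustrative sketch of the suspected bug (names are hypothetical,
# not the real eval.py identifiers). "aligned" counts each aligned
# word once, but "correct" can be bumped once per enhanced dependency,
# so correct > aligned is possible and AligndAcc exceeds 100.

def aligned_accuracy(words):
    aligned = 0
    correct = 0
    for gold_deps, system_deps in words:
        aligned += 1  # one count per aligned word, ignoring multiplicity
        for dep in gold_deps:
            if dep in system_deps:
                correct += 1  # may fire more than once for a single word
    return 100.0 * correct / aligned if aligned else 0.0

# A word with two enhanced heads (e.g. propagated over a conjunction),
# both recovered by the system, plus an ordinary single-head word:
words = [
    ({(2, "nsubj"), (4, "nsubj")}, {(2, "nsubj"), (4, "nsubj")}),
    ({(0, "root")}, {(0, "root")}),
]
print(aligned_accuracy(words))  # 3 correct / 2 aligned -> 150.0
```

Counting the denominator per enhanced dependency instead of per word (or leaving the column empty, as suggested below) would keep the ratio bounded by 100.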
I'd fix it, but I don't know what we should make "aligned accuracy" represent in this case, if anything. Perhaps an empty column is the most appropriate?
Thanks for reporting. I agree that aligned accuracy does not make sense here. Fixed.