Evaluation script gives error about mismatch of line numbers
Closed this issue · 1 comment
erip commented
I'm running the evaluation script on my en-de system according to the steps in the LREC2020
directory. This test set has 35315 examples, which is how many I've translated:
$ wc -l my_out.de.tok src_segmented.txt <(zcat en-de.test.txt.gz)
35315 my_out.de.tok
35315 src_segmented.txt
35315 /dev/fd/63
105945 total
However, the eval script tells me that my line counts mismatch:
$ cat eval.sh
#!/usr/bin/env bash
python3 evaluate.py \
--ref-testsuite en-de.test.txt.gz \
--sense-file senses.en-de.txt \
--dist-file distances.en-de.txt \
--src-segmented src_segmented.txt \
--tgt-segmented my_out.de.tok \
--tgt-lemmatized my_out.de.conllu
$
$ bash eval.sh
Number of sentences does not match
Reference file: 35315
Segmented source file: 43481
Lemmatized system output: 43481
Segmented system output: 43481
When I print the counts just before this message, I see defaultdict(<class 'int'>, {'total': 43481, 'missing_ref': 8166}), but there don't seem to be any missing references:
$ zcat en-de.test.txt.gz | cut -f5 | grep "^$" | wc -l
0
Is there something obviously wrong here?
erip commented
OK, I found the issue. My CoNLL file was being sentence-split on some punctuation (namely ;), so replacing \n\n with \n to account for the over-splitting seems to have gotten things into better shape!
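For the record, the fix amounts to something like this (a minimal sketch; the toy string below stands in for the real my_out.de.conllu contents, and I'm assuming the spurious boundaries are the blank lines CoNLL-U uses between sentences):

```python
# One logical sentence that the lemmatizer wrongly split at ";":
# in CoNLL-U, a blank line (i.e. "\n\n") marks a sentence boundary.
oversplit = "1\tDas\t_\n2\t;\t_\n\n1\tstimmt\t_\n"

# Collapse every boundary so each reference line maps to one sentence again.
merged = oversplit.replace("\n\n", "\n")
print(merged)
```

On the real file you'd read my_out.de.conllu, apply the same replace, and write it back before rerunning eval.sh.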