hipe-eval/HIPE-scorer

There are no tags in the system response for the column...

creat89 opened this issue · 3 comments

Hello,

Currently I'm having an issue doing some internal evaluations when my system only predicts one type of NER label. The scorer stops when I do not provide labels for all the annotation columns. In my opinion, if the user does not provide labels for a specific column, the scorer should return zero for that column rather than stopping.

Hello @creat89,

Thanks for your report. However, I cannot reproduce the error. What do you mean by "stopping"? Do you get an error? The following test seems to work as expected:

I replaced the NE-FINE-LIT column with X using:

awk -vOFS='\t'  '{$4 = "X"; print}' data/release/v1.2/de/HIPE-data-v1.2-test-de.tsv > issue_8.tsv

Then I ran the scorer with:

python ../CLEF-HIPE-2020-scorer/clef_evaluation.py --ref data/release/v1.2/de/HIPE-data-v1.2-test-de.tsv --pred issue_8.tsv --task nerc_fine --outdir data/system-evaluations --log issue_8.log

The scorer correctly complains about the missing column in this case:

The provided annotation columns ['NE-FINE-LIT'] are not available in both the gold standard and the system response 'issue_8.tsv'.

However, it runs through when you just provide an empty column with all the required fieldnames (e.g. 'NE-FINE-LIT').

Please provide more information if you think something is wrong.

For instance, I have the following file:

TOKEN	NE-COARSE-LIT	NE-COARSE-METO	NE-FINE-LIT	NE-FINE-METO	NE-FINE-COMP	NE-NESTED	NEL-LIT	NEL-METO	MISC
# language = de
# newspaper = NZZ
# date = 1798-01-17
# document_id = NZZ-1798-01-17-a-p0002
# segment_iiif_link = _
Rußland	B-loc	O	O	O	O	O	_	_	_
.	O	O	O	O	O	O	_	_	_
Petersburg	B-loc	O	O	O	O	O	_	_	_

Here my system only predicted labels for the column NE-COARSE-LIT but not for NE-COARSE-METO. In other words, for NE-COARSE-METO I just printed O. If I run the script:

python clef_evaluation.py --ref /home/HIPE-data-v1.0-dev-de.tsv --pred /home/Predictions_dev.tsv --skip_check --task nerc_coarse

The script tells me:

There are no tags in the system response file '/home/adrian/Programs/NER_BERT_News/Server_Models3/NER_models_europeana_german_fixed_nofixtags_weightedlosslogAlpha/Impresso_de_v1,0_8_5e-05//Predictions_test.tsv' for the column: ['NE-COARSE-METO']

It then does not produce any output. However, in this case, it should report a score of zero for NE-COARSE-METO rather than stopping the script.
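To make the situation concrete, here is a minimal sketch of how one could check which annotation columns of a prediction file contain no entity tags at all. It is not the scorer's actual code: the function name `columns_without_tags` and the rule that `O`, `_` and empty cells count as "no tag" are assumptions made for illustration.

```python
import csv


def columns_without_tags(tsv_text, columns):
    """Return the columns whose cells are all 'O', '_' or empty (hypothetical helper)."""
    # Skip blank lines and metadata lines such as '# language = de'.
    # Assumption: metadata lines start with '# ' (hash followed by a space).
    lines = [
        line
        for line in tsv_text.splitlines()
        if line.strip() and not line.startswith("# ")
    ]
    reader = csv.DictReader(lines, delimiter="\t", quoting=csv.QUOTE_NONE)

    has_tag = {col: False for col in columns}
    for row in reader:
        for col in columns:
            if row.get(col) not in ("O", "_", "", None):
                has_tag[col] = True
    return [col for col in columns if not has_tag[col]]


sample = "\n".join(
    [
        "TOKEN\tNE-COARSE-LIT\tNE-COARSE-METO",
        "# language = de",
        "Rußland\tB-loc\tO",
        ".\tO\tO",
        "Petersburg\tB-loc\tO",
    ]
)

print(columns_without_tags(sample, ["NE-COARSE-LIT", "NE-COARSE-METO"]))
```

On the sample above, only NE-COARSE-METO is reported, which matches the column named in the message.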

"However, it runs through when you just provide an empty column with all the required fieldnames (e.g. 'NE-FINE-LIT')."

So does this mean that, instead of having O in the second column, I just need to leave it empty?

Thanks for clarifying. The behavior changed with b8bdbf8: the scorer now only logs the missing tags and no longer quits the evaluation.
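The log-and-continue behavior can be sketched roughly as follows. This is only an illustration under stated assumptions, not the code from b8bdbf8: the helper names `prf` and `score_columns` and the per-column `(tp, fp, fn)` counts are hypothetical.

```python
import logging


def prf(tp, fp, fn):
    """Precision, recall and F1, returning 0.0 wherever a denominator is zero."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall:
        f1 = 2 * precision * recall / (precision + recall)
    else:
        f1 = 0.0
    return precision, recall, f1


def score_columns(counts):
    """Score every column; warn about columns without predictions instead of aborting.

    `counts` maps a column name to hypothetical (tp, fp, fn) counts.
    """
    results = {}
    for column, (tp, fp, fn) in counts.items():
        if tp + fp == 0:
            # The system predicted no entity for this column: log and keep going.
            logging.warning("No tags in the system response for column %s", column)
        results[column] = prf(tp, fp, fn)
    return results


# Hypothetical counts: the system predicted nothing for NE-COARSE-METO.
counts = {"NE-COARSE-LIT": (8, 2, 2), "NE-COARSE-METO": (0, 0, 3)}
scores = score_columns(counts)
print(scores["NE-COARSE-METO"])
```

With this approach a column without predictions gets zero precision, recall and F1, while the remaining columns are still evaluated.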