Evaluation question

Sorry for opening another issue :)

Line 66 in dbc70a2

output_spans.add(Span(start, end, output[i][2:]))

For this evaluation part, what I found earlier is that your evaluation uses some useful tricks.

For this example:

I like Beijing University
O O B-City E-University

The result will be (Beijing University, University), so it will be counted as correct, am I right?

Thank you for your reply!

You are right. I think this is a pretty common issue in NER evaluation.
But if we use a CRF, it is unlikely that something like "B-type 1" followed by "I-type 2" will happen, especially when you add constraints (https://github.com/allanj/pytorch_neural_crf/blob/master/src/model/module/linear_crf_inferencer.py#L30-L59) to the CRF model.
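
To illustrate the idea of such a constraint, here is a minimal sketch. It is not the code at the link above: the helper `is_allowed`, the label list, and the penalty value `-10000.0` are all assumptions made for illustration. Under an IOBES scheme, a tag that opens or continues an entity may only be followed by an `I-` or `E-` tag of the same type, and disallowed transitions get a very low score so decoding avoids them.

```python
from typing import List


def is_allowed(prev: str, curr: str) -> bool:
    """Return True if tag `curr` may directly follow tag `prev` under IOBES (hypothetical helper)."""
    prev_prefix, _, prev_type = prev.partition("-")
    curr_prefix, _, curr_type = curr.partition("-")
    if prev_prefix in ("B", "I"):
        # An open entity must be continued or closed with the same type.
        return curr_prefix in ("I", "E") and curr_type == prev_type
    # After O, E, or S no entity is open, so only O, B, or S may follow.
    return curr_prefix in ("O", "B", "S")


# Assumed label set for the example in this thread.
labels: List[str] = [
    "O",
    "B-City", "I-City", "E-City", "S-City",
    "B-University", "I-University", "E-University", "S-University",
]

NEG = -10000.0  # assumed penalty; any sufficiently large negative score works

# 0.0 stands in for "leave the learned score unchanged"; NEG marks a forbidden transition.
transition = [[0.0 if is_allowed(p, c) else NEG for c in labels] for p in labels]

b_city = labels.index("B-City")
print(transition[b_city][labels.index("E-University")])  # -10000.0
print(transition[b_city][labels.index("E-City")])        # 0.0
```

With a mask like this, a sequence such as `B-City E-University` would score so poorly that Viterbi decoding would almost never produce it.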

But if that happens, every evaluation strategy will have some problems, because we don't know if "City" or "University" is the correct label.
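
For reference, the behaviour discussed here can be reproduced with a small sketch. It assumes the evaluation records the start index at a `B-` tag and takes the entity type from the `E-` tag, which is what the quoted line suggests. The function `extract_spans` and the plain tuples used in place of `Span` are hypothetical stand-ins, not the repository's code.

```python
from typing import List, Set, Tuple


def extract_spans(tags: List[str]) -> Set[Tuple[int, int, str]]:
    """Collect (start, end, type) spans from IOBES tags, typing each span by its closing tag."""
    spans: Set[Tuple[int, int, str]] = set()
    start = -1  # assumed initial value; updated at every B- tag
    for i, tag in enumerate(tags):
        if tag.startswith("B-"):
            start = i
        if tag.startswith("E-"):
            # The type comes from the E- tag only; the B- tag's type is never checked.
            spans.add((start, i, tag[2:]))
        if tag.startswith("S-"):
            spans.add((i, i, tag[2:]))
    return spans


# "I like Beijing University"
pred = extract_spans(["O", "O", "B-City", "E-University"])
gold = extract_spans(["O", "O", "B-University", "E-University"])

print(pred)          # {(2, 3, 'University')}
print(pred == gold)  # True
```

In this sketch the mismatched prediction yields the same span as a gold annotation of `B-University E-University`, so it would be counted as a correct match.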