[Feature request]: Make the scorer directly accessible via a Python API
sven-nm opened this issue · 0 comments
It would be great if the scorer could be called via a Python API. At the time of writing, the scorer can only be fed HIPE-compliant TSV files. This is a limitation for two reasons:
- It makes it complicated to evaluate on the fly (e.g. at the end of each epoch).
- It makes it necessary to rebuild words out of each model's tokens, which can be subtokens.
This second point can be very problematic, depending on your labelling strategy. Before sub-tokenizing, an input example may look like:
O B-PERS I-PERS I-PERS
The Australian Prime minister
A model like BERT could tokenize and label this example like so:
O B-PERS B-PERS I-PERS I-PERS
The Austral ##ian Prime minister
However, at inference time, the model may predict something like:
O B-PERS I-PERS I-PERS I-PERS
The Austral ##ian Prime minister
To evaluate this prediction, you must first rebuild the words to match the ground-truth TSV. However, since `Austral` and `##ian` have two different labels, it is not clear which one should be chosen.
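To make the ambiguity concrete, here is a minimal sketch of rebuilding words from WordPiece-style sub-tokens. The function `rebuild_words` and its `strategy` parameter are hypothetical names used for illustration only, not part of the scorer; the sketch assumes continuation sub-tokens carry a `##` prefix.

```python
def rebuild_words(subtokens, labels, strategy="first"):
    """Merge '##'-prefixed sub-tokens back into words (hypothetical helper).

    strategy='first' keeps the label of a word's first sub-token,
    strategy='last' keeps the label of its last sub-token.
    """
    words, word_labels = [], []
    for token, label in zip(subtokens, labels):
        if token.startswith("##") and words:
            # Continuation piece: glue it onto the previous word.
            words[-1] += token[2:]
            if strategy == "last":
                word_labels[-1] = label
        else:
            # A new word starts here.
            words.append(token)
            word_labels.append(label)
    return words, word_labels


subtokens = ["The", "Austral", "##ian", "Prime", "minister"]
predicted = ["O", "B-PERS", "I-PERS", "I-PERS", "I-PERS"]

print(rebuild_words(subtokens, predicted, "first"))
# (['The', 'Australian', 'Prime', 'minister'], ['O', 'B-PERS', 'I-PERS', 'I-PERS'])
print(rebuild_words(subtokens, predicted, "last"))
# (['The', 'Australian', 'Prime', 'minister'], ['O', 'I-PERS', 'I-PERS', 'I-PERS'])
```

The two strategies give `Australian` different labels (`B-PERS` vs. `I-PERS`), so the choice affects what ends up in the TSV passed to the scorer.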
If it were possible to feed the scorer two simple `list` objects (predictions and ground truth, in a seqeval-like fashion), things would be easier.
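As a rough illustration of what such a list-based interface could look like, here is a self-contained sketch. The names `extract_entities` and `score` are hypothetical and do not exist in the scorer; the sketch only computes strict, exact-span entity-level precision, recall, and F1 over IOB labels, and does not reproduce the scorer's own evaluation settings.

```python
def extract_entities(labels):
    """Return a set of (type, start, end) spans from one IOB sequence (end exclusive)."""
    entities, start, etype = set(), None, None
    for i, label in enumerate(list(labels) + ["O"]):  # sentinel closes a trailing span
        prefix, _, tag = label.partition("-")
        # Close the open span on O, on B-, or on an I- of a different type.
        if start is not None and (prefix != "I" or tag != etype):
            entities.add((etype, start, i))
            start, etype = None, None
        # Open a new span on B-, or on a stray I- when no span is open.
        if prefix in ("B", "I") and start is None:
            start, etype = i, tag
    return entities


def score(y_true, y_pred):
    """Strict entity-level scores from two lists of label sequences (hypothetical API)."""
    tp = fp = fn = 0
    for true_seq, pred_seq in zip(y_true, y_pred):
        gold, pred = extract_entities(true_seq), extract_entities(pred_seq)
        tp += len(gold & pred)
        fp += len(pred - gold)
        fn += len(gold - pred)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {"precision": precision, "recall": recall, "f1": f1}


y_true = [["O", "B-PERS", "I-PERS", "I-PERS"]]
y_pred = [["O", "B-PERS", "I-PERS", "I-PERS"]]
print(score(y_true, y_pred))
```

A call like this could be made at the end of each epoch without writing any intermediate TSV file.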
Though the aforementioned problem could be circumvented by labelling only the first sub-token, it would still be great to evaluate predictions on the fly, and even to have the API directly accessible via external frameworks such as HuggingFace.
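For reference, the first-sub-token workaround mentioned above could be sketched as follows. `label_first_subtoken` and the `IGNORE` placeholder are hypothetical names; in practice the placeholder would be whatever value your training loop is configured to skip when computing the loss.

```python
IGNORE = "X"  # hypothetical placeholder for sub-tokens excluded from loss and evaluation


def label_first_subtoken(word_labels, subtokens):
    """Give each word's label to its first sub-token and IGNORE to '##' continuations."""
    labels, word_idx = [], -1
    for token in subtokens:
        if token.startswith("##"):
            labels.append(IGNORE)
        else:
            word_idx += 1
            labels.append(word_labels[word_idx])
    return labels


word_labels = ["O", "B-PERS", "I-PERS", "I-PERS"]
subtokens = ["The", "Austral", "##ian", "Prime", "minister"]

print(label_first_subtoken(word_labels, subtokens))
# ['O', 'B-PERS', 'X', 'I-PERS', 'I-PERS']
```

With this scheme, predictions on `IGNORE` positions are simply dropped, so each word maps back to exactly one label.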