BC5CDR Performance - Representation
safranchik commented
Hello!
I am currently working on an NER research project that uses AllenNLP as a backend, and one of the datasets we're using to evaluate our model is BC5CDR. We've previously been using ELMo embeddings, and we wish to switch to SciBERT.
However, after browsing data/ner/bc5cdr, I realized that the data does not differentiate between chemicals and diseases: the fourth column, which holds the entity label, does not indicate which of the two an entity is. Could you confirm whether the BC5CDR test F1 of 88.94 reported here and in the SciBERT paper was obtained by treating chemicals and diseases as a single entity type?
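For reference, here is a minimal sketch of one way to check which entity types appear in the label column. It assumes whitespace-separated, CoNLL-style lines with a BIO label in the fourth column; the function name `count_entity_types` and the sample lines are made up for illustration and are not taken from the repository.

```python
from collections import Counter
from typing import Iterable


def count_entity_types(lines: Iterable[str], label_column: int = 3) -> Counter:
    """Count labeled tokens per entity type in CoNLL-style lines."""
    counts: Counter = Counter()
    for line in lines:
        fields = line.split()
        # Skip blank sentence separators, document markers, and short lines.
        if len(fields) <= label_column or fields[0] == "-DOCSTART-":
            continue
        label = fields[label_column]
        # Split off the BIO prefix: "B-Chemical" -> "Chemical"; a bare "B" has no type.
        prefix, _, entity_type = label.partition("-")
        if prefix != "O":
            counts[entity_type or "<untyped>"] += 1
    return counts


# Hypothetical sample lines (assumed layout: token, POS, chunk, label).
sample = [
    "-DOCSTART- -X- O O",
    "",
    "Naloxone NN O B",
    "reverses NN O O",
    "hypertension NN O B",
]
print(count_entity_types(sample))
```

If the counter only ever reports a single key, the file does not distinguish chemicals from diseases. To run it on a real file, pass an open file handle instead of `sample`.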
Thank you!