ncbi-nlp/bluebert

Input data format for Named Entity Recognition

CCYChongyanChen opened this issue · 1 comment

Hi, thank you for sharing the code!
I am trying to run the Named Entity Recognition task, but I can't find "train.tsv" or "devel.tsv" in the BC5CDR dataset. Instead, the train/devel/test data are in ".txt" format. If I simply rename the ".txt" files to ".tsv" and run, I get KeyError: 'clonidine.'

Could you tell me exactly what input format the NER task needs? It would be great if you could share the preprocessing code that takes a title and abstract as input.
Thank you in advance.