Input data format for Named Entity Recognition
CCYChongyanChen opened this issue · 1 comments
CCYChongyanChen commented
Hi, thank you for sharing the code!
I am trying to run the Named Entity Recognition task, but I can't find "train.tsv" or "devel.tsv" in the BC5CDR dataset. Instead, the train/devel/test data are in ".txt" format. If I simply rename the ".txt" files to ".tsv" and run, it fails with KeyError: 'clonidine.'
Could you tell me exactly what input format the NER task needs? It would be great if you could share the preprocessing code that produces it from the title and abstract.
Thank you in advance.
yfpeng commented
You need to use the BERT version of the data, available at https://github.com/ncbi-nlp/BLUE_Benchmark/releases/download/0.1/bert_data.zip
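For anyone landing here, a minimal Python sketch for fetching that archive and checking which ".tsv" files it contains before pointing the NER script at them. The URL is the one from the reply above; the local filename `bert_data.zip` and the helper names `list_tsv_members` and `preview_member` are made up for illustration, and the layout of the archive is not assumed, only inspected.

```python
import urllib.request
import zipfile

# URL taken from the reply above.
BERT_DATA_URL = (
    "https://github.com/ncbi-nlp/BLUE_Benchmark/releases/download/0.1/bert_data.zip"
)


def list_tsv_members(zip_path):
    """Return the sorted names of all .tsv files inside the archive."""
    with zipfile.ZipFile(zip_path) as archive:
        return sorted(n for n in archive.namelist() if n.endswith(".tsv"))


def preview_member(zip_path, member, n_lines=5):
    """Return the first n_lines of one archive member as text lines."""
    lines = []
    with zipfile.ZipFile(zip_path) as archive:
        with archive.open(member) as handle:
            for raw in handle:
                if len(lines) >= n_lines:
                    break
                lines.append(raw.decode("utf-8", errors="replace").rstrip("\r\n"))
    return lines


if __name__ == "__main__":
    local_path = "bert_data.zip"  # hypothetical local filename
    urllib.request.urlretrieve(BERT_DATA_URL, local_path)
    for name in list_tsv_members(local_path):
        print(name)
        # Show a few raw lines so the column layout can be checked by eye.
        for line in preview_member(local_path, name, n_lines=3):
            print("    " + line)
```

Looking at the first few lines of each file should show the format the NER task expects, which is what the renamed ".txt" files apparently do not match.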