svjan5/medtype

The count of docs and sents in PubMedDS

GanjinZero opened this issue · 1 comment

I am using PubMedDS as a training corpus for my project.
I noticed that the document and sentence counts differ between arXiv v1 and v2/v3.
Did you add new documents to PubMedDS?

Yes, we made a few improvements to the dataset generation code for PubMedDS. Please use the latest version of the dataset.