ncbi-nlp/bluebert

How did you pre-train the NCBI abstract data exactly?

zhouyunyun11 opened this issue · 5 comments

In your manuscript, you described it like this:
"We initialized BERT with pre-trained BERT provided by (Devlin et al., 2019). We then continue to pre-train the model, using the listed corpora".

Did you use the BERT code to completely re-train on the NCBI abstract corpora? Or did you use the initial BERT model and WordPiece strategy, as in the BioBERT method?

We used the initial BERT model and the WordPiece strategy.

We used the Google default vocab.txt
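For readers wondering what "continue to pre-train from the initial BERT model with the default vocab" looks like in practice: the sketch below is only a rough illustration of that idea, not the authors' actual pipeline (they used Google's original TensorFlow BERT code). It uses the Hugging Face `transformers`/`datasets` libraries, keeps the stock `bert-base-uncased` checkpoint and its WordPiece vocab, and runs masked-language-model training only (the original BERT recipe also includes next-sentence prediction). The file name `ncbi_abstracts.txt` and all hyperparameters are placeholders.

```python
# Hedged sketch (not the authors' pipeline): continue MLM pre-training
# from the original BERT checkpoint, reusing Google's default WordPiece vocab.
from datasets import load_dataset
from transformers import (
    BertTokenizerFast,
    BertForMaskedLM,
    DataCollatorForLanguageModeling,
    Trainer,
    TrainingArguments,
)

# The default vocab/tokenizer ships with the base checkpoint; no new vocab is built.
tokenizer = BertTokenizerFast.from_pretrained("bert-base-uncased")
model = BertForMaskedLM.from_pretrained("bert-base-uncased")  # initialize from Google's BERT

# "ncbi_abstracts.txt" is a placeholder: one abstract (or sentence) per line.
dataset = load_dataset("text", data_files={"train": "ncbi_abstracts.txt"})["train"]

def tokenize(batch):
    return tokenizer(batch["text"], truncation=True, max_length=128)

tokenized = dataset.map(tokenize, batched=True, remove_columns=["text"])

# Dynamic masking with the standard 15% MLM probability.
collator = DataCollatorForLanguageModeling(tokenizer=tokenizer, mlm_probability=0.15)

args = TrainingArguments(
    output_dir="continued-pretraining-sketch",
    per_device_train_batch_size=32,
    num_train_epochs=1,
    save_steps=10_000,
)

Trainer(model=model, args=args, train_dataset=tokenized, data_collator=collator).train()
```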

I am not sure what you meant by "same strategy as Bio_BERT"