allenai/scibert

SciVocab Preparation


Can you please elaborate a bit more on how you used SentencePiece to build SciVocab?

It is just one function call: https://github.com/allenai/scibert/blob/master/scripts/cheatsheet.txt#L6
The output format is slightly different from what BERT expects, so we fixed it manually after it was generated.
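
For anyone who wants to script that fix-up rather than do it by hand, here is a rough sketch. It is not the script used for SciVocab; the exact manual changes are not described in this thread. It assumes the usual differences between the two formats: a SentencePiece `.vocab` file has one `piece<TAB>score` entry per line and marks word-initial pieces with `▁` (U+2581), while a BERT `vocab.txt` has one token per line, marks word-continuation pieces with a `##` prefix, and includes the special tokens `[PAD]`, `[UNK]`, `[CLS]`, `[SEP]`, `[MASK]`. The function name `spm_vocab_to_bert` and the sample entries are made up for illustration.

```python
# Hypothetical helper: convert SentencePiece .vocab lines to BERT-style tokens.
# Assumption: input lines look like "piece<TAB>score".

SPM_WORD_START = "\u2581"  # SentencePiece marker for a word-initial piece
BERT_SPECIAL_TOKENS = ["[PAD]", "[UNK]", "[CLS]", "[SEP]", "[MASK]"]
SPM_CONTROL_TOKENS = {"<unk>", "<s>", "</s>"}  # SentencePiece defaults, dropped here


def spm_vocab_to_bert(spm_lines):
    """Return a list of BERT-style vocabulary tokens, special tokens first."""
    tokens = list(BERT_SPECIAL_TOKENS)
    seen = set(tokens)
    for line in spm_lines:
        # Keep only the piece; the score column is not used by BERT.
        piece = line.rstrip("\n").split("\t")[0]
        if not piece or piece in SPM_CONTROL_TOKENS:
            continue
        if piece.startswith(SPM_WORD_START):
            # Word-initial piece: strip the marker, no prefix in BERT format.
            token = piece[len(SPM_WORD_START):]
        else:
            # Continuation piece: BERT marks these with "##".
            token = "##" + piece
        # Skip the bare marker (empty after stripping) and duplicates.
        if token and token not in seen:
            seen.add(token)
            tokens.append(token)
    return tokens


# Made-up sample entries, not taken from SciVocab.
example_lines = [
    "<unk>\t0",
    "<s>\t0",
    "</s>\t0",
    "\u2581protein\t-3.1",
    "s\t-3.5",
    "\u2581cell\t-4.0",
    "\u2581\t-4.2",
]
bert_tokens = spm_vocab_to_bert(example_lines)
print("\n".join(bert_tokens))
```

To use it on a real file, read the `.vocab` file with UTF-8 encoding, pass its lines to the function, and write the returned tokens one per line to `vocab.txt`. Check the result against your BERT tokenizer settings (casing, any reserved `[unused…]` slots) before training, since those are not handled here.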