google-research/albert

use custom vocab.txt

2696120622 opened this issue · 1 comment

My corpus consists of pure numbers such as 1, 2, ..., 1000000, ..., 1002342, ....
It is different from words in any language.
Can I replace vocab.txt with my own vocab.txt, created from my corpus, when fine-tuning ALBERT?
Or should I train ALBERT on my corpus from scratch?

Thanks.
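
To make the question concrete, here is a minimal sketch of how such a vocab.txt could be built from a corpus of numbers. It assumes whitespace-separated tokens and a BERT-style file with one token per line. The helper names (build_vocab, write_vocab) and the special-token list are hypothetical, chosen for illustration, and are not taken from the ALBERT repository.

```python
from collections import Counter

# Assumed BERT-style special tokens; the set a given ALBERT setup
# expects may differ.
SPECIAL_TOKENS = ["[PAD]", "[UNK]", "[CLS]", "[SEP]", "[MASK]"]


def build_vocab(lines, min_count=1):
    """Collect whitespace-separated tokens and return a vocabulary list.

    Special tokens come first; the remaining tokens are ordered by
    descending frequency, with ties broken by string order.
    """
    counts = Counter()
    for line in lines:
        counts.update(line.split())
    ordered = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    tokens = [tok for tok, cnt in ordered if cnt >= min_count]
    return SPECIAL_TOKENS + tokens


def write_vocab(vocab, path):
    """Write one token per line, so the line index acts as the token ID."""
    with open(path, "w", encoding="utf-8") as f:
        for token in vocab:
            f.write(token + "\n")


# Toy corpus standing in for the real data (hypothetical example).
corpus = ["1 2 1000000", "1002342 2 1"]
vocab = build_vocab(corpus)
# write_vocab(vocab, "vocab.txt")  # uncomment to write the file
```

With a vocabulary of one entry per distinct number, the file grows with the number of distinct values in the corpus, which is part of why I am unsure whether fine-tuning or training from scratch is the right choice.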

Where can I get the vocab.txt?
Thanks
I am looking for the official vocab.txt.