Pretrain your own BERT / RoBERTa model with whole-word-masking MLM (WWM-MLM) and a modified tokenizer, supporting both Chinese and English.
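Below is a minimal sketch (not this repo's actual script) of how such WWM-MLM pretraining can be wired up with Hugging Face `transformers`. The file paths `vocab.txt` and `corpus.txt`, the output directory, and all hyperparameters are illustrative assumptions; for Chinese whole-word masking the collator additionally expects a `chinese_ref` word-segmentation field (as in the official `run_mlm_wwm.py` example), which is omitted here for brevity.

```python
# Sketch of WWM-MLM pretraining from scratch with a custom tokenizer.
# Assumed inputs: a WordPiece vocab at vocab.txt and a one-sentence-per-line
# corpus at corpus.txt (both hypothetical paths).
from transformers import (
    BertConfig,
    BertForMaskedLM,
    BertTokenizerFast,
    DataCollatorForWholeWordMask,
    Trainer,
    TrainingArguments,
)
from datasets import load_dataset

# Load the modified tokenizer from the custom vocab.
tokenizer = BertTokenizerFast(vocab_file="vocab.txt", do_lower_case=True)

# Build a BERT-style model from scratch, sized to the custom vocab.
config = BertConfig(vocab_size=tokenizer.vocab_size)
model = BertForMaskedLM(config)

# Tokenize the raw corpus, one training example per line.
raw = load_dataset("text", data_files={"train": "corpus.txt"})

def tokenize(batch):
    return tokenizer(batch["text"], truncation=True, max_length=512)

train_set = raw["train"].map(tokenize, batched=True, remove_columns=["text"])

# Whole-word-masking collator: masks every sub-word piece of a chosen word
# instead of masking pieces independently.
collator = DataCollatorForWholeWordMask(tokenizer=tokenizer, mlm_probability=0.15)

trainer = Trainer(
    model=model,
    args=TrainingArguments(
        output_dir="wwm-bert",            # hypothetical output directory
        per_device_train_batch_size=16,
        num_train_epochs=1,
        save_steps=10_000,
    ),
    data_collator=collator,
    train_dataset=train_set,
)
trainer.train()
```

For English text the collator groups sub-words by the `##` continuation prefix; swapping in a RoBERTa config and tokenizer follows the same pattern with the corresponding classes.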