OpenMOSS/MOSS

How to train a custom tokenizer for Chinese from scratch

SparkJiao opened this issue · 0 comments

Hi, wonderful work!

May I know how to train a custom tokenizer for Chinese from scratch? Is there any public reference or code you can share?

Thanks very much for your help!

Best,
Fangkai