BERT-chinese

BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding, for Chinese.

Requirements

python3

tensorflow >= 1.10

jieba (Chinese word segmentation; see the sketch below)
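jieba is a Chinese word-segmentation library; since Chinese text has no whitespace between words, the data-preparation step presumably relies on it to split sentences into words. A minimal sketch of the call (the sentence and the exact segmentation are illustrative only):

```python
import jieba

# Split a Chinese sentence into words; jieba.lcut returns a plain list.
# Chinese has no whitespace word boundaries, so a segmenter is typically
# applied before tokenization.
words = jieba.lcut("我们使用双向Transformer预训练中文语言模型")
print(words)  # a word list such as ['我们', '使用', '双向', 'Transformer', ...]
```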

Usage

1. Prepare the data; see the data and vocab folders for reference. In the data files, a blank line marks the boundary between documents (a minimal corpus sketch is given at the end of this section).

2. Convert the data into TFRecord files with create_pretraining_data.py.

3. Run pre-training with run_pretraining.py.

TODO
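
A minimal sketch of the corpus layout for step 1 (assumptions: the path data/corpus.txt and the one-sentence-per-line convention are illustrative; only the blank-line document separator is stated above, so check create_pretraining_data.py for the exact format it expects):

```python
# Hypothetical corpus-writing sketch: each document's sentences on their own
# lines, with a blank line marking the boundary between documents.
documents = [
    ["今天天气很好。", "我们去公园散步。"],
    ["BERT是一个双向Transformer语言模型。", "它通过掩码语言模型任务进行预训练。"],
]

with open("data/corpus.txt", "w", encoding="utf-8") as f:  # path is illustrative
    for doc in documents:
        for sentence in doc:
            f.write(sentence + "\n")  # one sentence per line (assumed convention)
        f.write("\n")                 # blank line = end of a document
```

The resulting text files are then converted to TFRecord with create_pretraining_data.py (step 2) and fed to run_pretraining.py (step 3).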

Experimental results

TODO

TODO LIST

Multi-GPU parallel training

License

MIT.