A tool for ancient Chinese sentence segmentation.
The tool is a deep learning model based on BERT (Devlin et al., 2018). We trained the model on a large collection of ancient Chinese poems, prose, fiction, and other types of literature, all of which are punctuated.
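Because the training texts are punctuated, one plausible way to derive supervision is to strip the punctuation and mark each character that was followed by a punctuation mark as a boundary. The sketch below illustrates that idea only; the punctuation set, the 0/1 label scheme, and the function name `to_training_pair` are assumptions for illustration and are not taken from the tool itself.

```python
# Hypothetical punctuation set; the set actually used by the tool is not stated.
PUNCTUATION = set("，。？！；：、")


def to_training_pair(punctuated):
    """Split punctuated text into (unpunctuated text, per-character labels).

    A label of 1 means "a break follows this character"; 0 means no break.
    This labeling scheme is an assumption, not the tool's documented format.
    """
    chars = []
    labels = []
    for ch in punctuated:
        if ch in PUNCTUATION:
            # Mark the preceding character as a boundary, if there is one.
            if labels:
                labels[-1] = 1
        else:
            chars.append(ch)
            labels.append(0)
    return "".join(chars), labels


text, labels = to_training_pair("學而時習之，不亦說乎？")
print(text)
print(labels)
```

Under this scheme, a BERT-style token classifier would be trained to predict one label per character of the unpunctuated text.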
On our test set, the tool achieves precision and recall both above 90%.
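For readers unfamiliar with these metrics in a segmentation setting, the sketch below shows one common way to compute precision and recall over predicted break positions. It assumes each character carries a 0/1 label where 1 means "a break follows this character"; that label format and the function name `boundary_precision_recall` are illustrative assumptions, not the evaluation code used for the reported numbers.

```python
def boundary_precision_recall(gold, pred):
    """Compute precision and recall over break positions.

    gold, pred: equal-length lists of 0/1 labels, one per character
    (assumed format: 1 means a break follows the character).
    """
    if len(gold) != len(pred):
        raise ValueError("gold and pred must have the same length")
    # A true positive is a position where both sequences place a break.
    true_pos = sum(1 for g, p in zip(gold, pred) if g == 1 and p == 1)
    predicted = sum(pred)  # breaks the model proposed
    actual = sum(gold)     # breaks in the reference text
    precision = true_pos / predicted if predicted else 0.0
    recall = true_pos / actual if actual else 0.0
    return precision, recall


gold = [0, 0, 0, 0, 1, 0, 0, 0, 1]
pred = [0, 0, 1, 0, 1, 0, 0, 0, 1]  # one spurious break after the third character
precision, recall = boundary_precision_recall(gold, pred)
print(precision, recall)
```

In this toy case the model finds every reference break (recall 1.0) but proposes one extra, so precision is two out of three.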
We have released it on 古詩文斷句 ("Sentence Segmentation for Classical Poetry and Prose").
Shen Li (shen@mail.bnu.edu.cn)
Yuchen Zhu (zhuyuchen81@gmail.com)
Renfen Hu (irishere@mail.bnu.edu.cn)