CWSWK
Source code for the paper "Chinese Word Segmentation with World Knowledge".
How to
- Download the BERT model to the folder data/bert/ if you want to train the CWSB or CWSBD model.
- Preprocess the data (this saves the train, val, and test datasets under the folder data):
python preprocess.py
- Train and save a model (CWSD in this example) on the pku dataset:
python train.py -m CWSD -ds pku -save
- Debug:
python train.py -m CWSD -ds pku -d
- Predict using the model saved at epoch 2:
python train.py -m pred_model -ms CWSD -ds pku -s 2
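The steps above can be chained into one script. The following is only a sketch built from the commands listed in this README: the variables MODEL, DATASET, EPOCH and DRY_RUN and the helper functions run and pipeline are hypothetical names introduced here, not part of the repository. It assumes that training has saved a checkpoint for the epoch passed to the prediction step. By default it only prints the commands; set DRY_RUN=0 to execute them.

```shell
#!/usr/bin/env bash
# Sketch: run the documented steps in order for one model and dataset.
# MODEL, DATASET, EPOCH and DRY_RUN are hypothetical variables, not repo options.
set -eu

MODEL="${MODEL:-CWSD}"
DATASET="${DATASET:-pku}"
EPOCH="${EPOCH:-2}"
DRY_RUN="${DRY_RUN:-1}"   # 1 = only print the commands, 0 = execute them

# Print a command, and execute it only when DRY_RUN=0.
run() {
  echo "+ $*"
  if [ "$DRY_RUN" = "0" ]; then
    "$@"
  fi
}

# Preprocess, train and save, then predict with the saved epoch.
pipeline() {
  run python preprocess.py
  run python train.py -m "$MODEL" -ds "$DATASET" -save
  run python train.py -m pred_model -ms "$MODEL" -ds "$DATASET" -s "$EPOCH"
}

pipeline
```

For example, if the script is saved as run_all.sh (a hypothetical file name), `DRY_RUN=0 DATASET=pku sh run_all.sh` would execute the three commands in sequence and stop at the first one that fails.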
Please cite the paper: