This project includes several BERT-based temporal relation classifiers for BCCWJ-Timebank [Asahara 2013]:
- Pair-wise model
- Multi-task learning model
- Source event-centric model
I've placed the corpus in the data folder.
- Please install Jupyter notebook first.
- Replace MODEL_URL and excel_file with the real paths in temp_prediction.ipynb:
```python
MODEL_URL = "checkpoints/ALL_20200830171210/cv0/"
excel_file = "行動データ3言語54subs (002).xlsx"
```
- After running all the cells, a \*_tagged.xlsx file will be generated.
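If you want to inspect the generated file outside of Jupyter, the snippet below is a minimal sketch that loads it with pandas. pandas and openpyxl are not listed in the requirements, and the sheet/column layout of the tagged file is not specified here, so treat this purely as an optional convenience.

```python
# Convenience sketch (not part of the repo): load whatever *_tagged.xlsx the
# notebook produced and print a quick summary. Assumes pandas and openpyxl
# are installed; only shapes and the first few rows are shown because the
# exact column layout of the output is not described above.
import glob

import pandas as pd

for path in glob.glob("*_tagged.xlsx"):
    df = pd.read_excel(path)  # openpyxl handles the .xlsx format
    print(path, df.shape)
    print(df.head())
```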
The pair-wise models train 'DCT', 'T2E', 'E2E' and 'MAT' as four independent classifiers.
[document-level 5-fold cross-validation]
Train:
```bash
python multiTaskClassifier.py \
  --task 'DCT' \            # one of 'DCT', 'T2E', 'E2E' or 'MAT'
  --pre $BERT_DIR \         # pre-trained BERT dir
  --model_dir $MODEL_DIR \  # new dir to save the model
  --batch 16 \
  --epoch 5 \               # fine-tuning epochs, e.g. 3-7
  --do_train                # train the model
```
Test:
```bash
python multiTaskClassifier.py \
  --task 'DCT' \            # one of 'DCT', 'T2E', 'E2E' or 'MAT'
  --model_dir $MODEL_DIR    # model dir to load
```
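Conceptually, each pair-wise classifier encodes one event/time pair with BERT and applies a single softmax head over the relation labels. The sketch below only illustrates that idea; it is not the code in multiTaskClassifier.py, and the label set, input formatting, and pooling strategy are assumptions.

```python
# Illustrative sketch of a single pair-wise head (e.g. the 'DCT' task), using
# the pinned pytorch-pretrained-bert API. Labels and input text are made up.
import torch
import torch.nn as nn
from pytorch_pretrained_bert import BertModel, BertTokenizer

LABELS = ["before", "after", "overlap", "vague"]  # hypothetical relation labels

class PairwiseRelationClassifier(nn.Module):
    """One BERT encoder plus one linear head for a single task."""

    def __init__(self, bert_dir, num_labels):
        super(PairwiseRelationClassifier, self).__init__()
        self.bert = BertModel.from_pretrained(bert_dir)
        self.classifier = nn.Linear(768, num_labels)  # 768 = BERT-base hidden size

    def forward(self, input_ids, segment_ids, attention_mask):
        # pooled_output is the transformed [CLS] vector of the input pair
        _, pooled_output = self.bert(input_ids, segment_ids, attention_mask,
                                     output_all_encoded_layers=False)
        return self.classifier(pooled_output)

# Example usage with a pre-trained Japanese BERT directory ($BERT_DIR above).
tokenizer = BertTokenizer.from_pretrained("PRETRAIN_BERT_DIR")
model = PairwiseRelationClassifier("PRETRAIN_BERT_DIR", len(LABELS))

# Japanese input is assumed to be pre-segmented (e.g. with Juman++ via pyknp).
tokens = ["[CLS]"] + tokenizer.tokenize("彼 は 昨日 東京 に 着い た") + ["[SEP]"]
input_ids = torch.tensor([tokenizer.convert_tokens_to_ids(tokens)])
segment_ids = torch.zeros_like(input_ids)
attention_mask = torch.ones_like(input_ids)
logits = model(input_ids, segment_ids, attention_mask)  # shape: (1, len(LABELS))
```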
The multi-task model jointly trains 'DCT', 'T2E', 'E2E' and 'MAT' batches in one classifier.
[document-level 5-fold cross-validation]
Train:
```bash
python multiTaskClassifier.py \
  --task 'ALL' \            # the model loads 'DCT', 'T2E', 'E2E' and 'MAT' batches in one sequence
  --pre $BERT_DIR \         # pre-trained BERT dir
  --model_dir $MODEL_DIR \  # new dir to save the model
  --batch 16 \
  --epoch 5 \               # fine-tuning epochs, e.g. 3-7
  --do_train                # train the model
```
Test:
```bash
python multiTaskClassifier.py \
  --task 'ALL' \
  --model_dir $MODEL_DIR    # model dir to load
```
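The multi-task variant can be pictured as one shared BERT encoder with a separate linear head per task, where mini-batches from the four tasks are interleaved during fine-tuning. The sketch below is illustrative only, under assumed label counts and batch scheduling; it is not the actual multiTaskClassifier.py implementation.

```python
# Illustrative sketch of the multi-task idea: a shared BERT encoder with one
# classification head per task. Label counts and the training schedule below
# are assumptions.
import torch.nn as nn
from pytorch_pretrained_bert import BertModel

TASKS = {"DCT": 4, "T2E": 4, "E2E": 4, "MAT": 4}  # hypothetical label counts

class MultiTaskRelationClassifier(nn.Module):
    """Shared BERT encoder with one classification head per task."""

    def __init__(self, bert_dir, tasks):
        super(MultiTaskRelationClassifier, self).__init__()
        self.bert = BertModel.from_pretrained(bert_dir)
        self.heads = nn.ModuleDict(
            {task: nn.Linear(768, n_labels) for task, n_labels in tasks.items()})

    def forward(self, task, input_ids, segment_ids, attention_mask):
        _, pooled = self.bert(input_ids, segment_ids, attention_mask,
                              output_all_encoded_layers=False)
        # Only the selected task's head is applied; the encoder is shared.
        return self.heads[task](pooled)

# Training outline: at each step, draw a mini-batch from one of the four tasks,
# so every update fine-tunes the shared encoder but only that task's head, e.g.
#   for step in range(num_steps):
#       task = task_schedule[step % len(task_schedule)]   # e.g. round-robin
#       input_ids, segment_ids, mask, labels = next(loaders[task])
#       loss = criterion(model(task, input_ids, segment_ids, mask), labels)
```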
This is also a joint model that trains 'DCT', 'T2E', 'E2E' and 'MAT' together. I have not created an API for this model, but you can replace 'PRETRAIN_BERT_DIR' in 'eventCentreClassifier.py' with the path to your pre-trained BERT model and then run it. Currently it uses a single data split rather than 5-fold CV.
Run:
```bash
python eventCentreClassifier.py
```
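The idea behind the source event-centric model is to encode a context once per source event and predict all of that event's temporal relations (to the DCT, to time expressions, and to other events) from the same encoding. The sketch below is only a conceptual illustration with assumed tensor shapes and class names; the real logic lives in eventCentreClassifier.py.

```python
# Conceptual sketch of the source event-centric setup, not the repo code.
# Shapes, indices and the pairing scheme are assumptions.
import torch
import torch.nn as nn
from pytorch_pretrained_bert import BertModel

class EventCentricClassifier(nn.Module):
    """Classify every relation anchored on one source event in a single pass."""

    def __init__(self, bert_dir, num_labels):
        super(EventCentricClassifier, self).__init__()
        self.bert = BertModel.from_pretrained(bert_dir)
        # The source-event vector is concatenated with each target vector.
        self.relation_head = nn.Linear(768 * 2, num_labels)

    def forward(self, input_ids, segment_ids, attention_mask,
                source_index, target_indices):
        encoded, _ = self.bert(input_ids, segment_ids, attention_mask,
                               output_all_encoded_layers=False)
        source_vec = encoded[:, source_index]           # (batch, 768)
        logits = []
        for target_index in target_indices:             # DCT, times, other events
            target_vec = encoded[:, target_index]
            pair = torch.cat([source_vec, target_vec], dim=-1)
            logits.append(self.relation_head(pair))
        return torch.stack(logits, dim=1)               # (batch, targets, labels)
```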
TBD
pytorch=1.0.0
pytorch-pretrained-bert=0.6.2
mojimoji
tqdm
pyknp
jumanpp
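As a quick, optional check that the pinned dependencies are importable (jumanpp is an external binary driven through pyknp, so it is not imported directly), you can run something like:

```python
# Sanity-check the Python-level dependencies; versions are printed for the
# two pinned packages listed above.
import torch
import pytorch_pretrained_bert
import mojimoji
import tqdm
import pyknp

print("torch", torch.__version__)                                       # expect 1.0.0
print("pytorch-pretrained-bert", pytorch_pretrained_bert.__version__)   # expect 0.6.2
```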
Temporal Relation Classification:
BCCWJ-Timebank:
BERT:
Multi-Task Learning: