Based on the Chinese multitask pretrained model Randeng-MLT, we fine-tune it on PromptCBLUE, a Chinese multitask medical Seq2Seq dataset. A verbaliser is also added for better and faster model convergence (see the sketch below).
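The verbaliser maps raw task labels to natural-language target phrases so the Seq2Seq model learns to generate fluent answers rather than bare tags. A minimal sketch of the idea follows; the label set and the `target` field name are hypothetical examples and are not taken from `data_utils/verbaliser.py`.

```python
# Minimal verbaliser sketch. The label-to-phrase mapping and the "target"
# field name are illustrative assumptions, not the repository's actual schema.
LABEL_VERBALISATIONS = {
    "positive": "上述判断正确",  # "the statement is correct"
    "negative": "上述判断错误",  # "the statement is incorrect"
}


def verbalise(example: dict) -> dict:
    """Replace a bare classification label with its natural-language phrase."""
    example["target"] = LABEL_VERBALISATIONS.get(example["target"], example["target"])
    return example


if __name__ == "__main__":
    # Example record with a hypothetical schema.
    print(verbalise({"input": "判断：阿司匹林可用于解热镇痛", "target": "positive"}))
```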
- Collect the PromptCBLUE data under the `aug_data` folder, then run `python data_utils/verbaliser.py`.
- Start training! `bash scripts/train.sh` (an illustrative sketch of the underlying fine-tuning follows below).
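As an illustration only, the training script is expected to wrap a standard Hugging Face `Seq2SeqTrainer` run along the lines of the sketch below; all file paths, field names, and hyperparameters here are assumptions, not the repository's actual configuration.

```python
# Illustrative sketch of the kind of fine-tuning scripts/train.sh might drive.
# Paths, column names, and hyperparameters are assumptions for demonstration.
from datasets import load_dataset
from transformers import (
    AutoModelForSeq2SeqLM,
    AutoTokenizer,
    DataCollatorForSeq2Seq,
    Seq2SeqTrainer,
    Seq2SeqTrainingArguments,
)

model_name = "IDEA-CCNL/Randeng-T5-784M-MultiTask-Chinese"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSeq2SeqLM.from_pretrained(model_name)

# Assumed JSON file produced after verbalisation, with "input"/"target" fields.
dataset = load_dataset("json", data_files={"train": "aug_data/train.json"})


def preprocess(batch):
    model_inputs = tokenizer(batch["input"], max_length=512, truncation=True)
    labels = tokenizer(text_target=batch["target"], max_length=128, truncation=True)
    model_inputs["labels"] = labels["input_ids"]
    return model_inputs


tokenized = dataset.map(
    preprocess, batched=True, remove_columns=dataset["train"].column_names
)

args = Seq2SeqTrainingArguments(
    output_dir="outputs/randeng-promptcblue",
    per_device_train_batch_size=4,
    learning_rate=1e-4,
    num_train_epochs=3,
)

trainer = Seq2SeqTrainer(
    model=model,
    args=args,
    train_dataset=tokenized["train"],
    data_collator=DataCollatorForSeq2Seq(tokenizer, model=model),
)
trainer.train()
```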
Pretrained model: [IDEA-CCNL/Randeng-T5-784M-MultiTask-Chinese](https://huggingface.co/IDEA-CCNL/Randeng-T5-784M-MultiTask-Chinese)
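For reference, the checkpoint can be smoke-tested directly with `transformers`; the prompt below is an arbitrary example, and the exact prompt format expected by the multitask checkpoint may differ.

```python
# Quick smoke test of the pretrained checkpoint (illustrative prompt only).
from transformers import T5ForConditionalGeneration, T5Tokenizer

model_name = "IDEA-CCNL/Randeng-T5-784M-MultiTask-Chinese"
tokenizer = T5Tokenizer.from_pretrained(model_name)
model = T5ForConditionalGeneration.from_pretrained(model_name)

inputs = tokenizer("文本分类任务：患者主诉头痛三天。", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=32)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```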