Repository for AKI-BERT: a Pre-trained Clinical Language Model for Early Prediction of Acute Kidney Injury (HealthNLP 2020)
The AKI-BERT models can also be downloaded here. AKI-BC-BERT, AKI-BioBERT, and AKI-baseBERT are pre-trained from Bio+Clinical BERT, BioBERT, and BERT-base-uncased, respectively.
To fine-tune Bio+Clinical BERT or BioBERT on AKI data, follow these steps:
- Run format_for_bert.py. Note that you'll need to change the file paths at the top of the file.
- Run AKI_bert_finetuning.sh. Note that you'll need to set TRAIN_FILE to the txt file generated in the previous step, INITIAL_PATH to the path of the model from which you want to start pre-training, and OUT_PATH to the path where you want to store the model.
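Before launching AKI_bert_finetuning.sh, it can help to confirm that the three paths point where you expect. The following is a minimal sketch of such a pre-flight check and is not part of this repository: the helper name check_pretraining_paths and the example paths are hypothetical, and it assumes INITIAL_PATH is a local directory rather than a hosted model name.

```python
from pathlib import Path


def check_pretraining_paths(train_file, initial_path, out_path):
    """Return a list of problems found; an empty list means every check passed.

    Hypothetical helper: the argument names mirror TRAIN_FILE, INITIAL_PATH
    and OUT_PATH from the shell script, but the script itself is not inspected.
    """
    train_file = Path(train_file)
    initial_path = Path(initial_path)
    out_path = Path(out_path)
    problems = []

    # TRAIN_FILE should be the non-empty txt file written by format_for_bert.py.
    if not train_file.is_file():
        problems.append(f"TRAIN_FILE not found: {train_file}")
    elif train_file.stat().st_size == 0:
        problems.append(f"TRAIN_FILE is empty: {train_file}")

    # INITIAL_PATH is assumed here to be a local model directory.
    if not initial_path.is_dir():
        problems.append(f"INITIAL_PATH is not a directory: {initial_path}")

    # Flag a non-empty OUT_PATH so an earlier model is not overwritten by accident.
    if out_path.is_dir() and any(out_path.iterdir()):
        problems.append(f"OUT_PATH already contains files: {out_path}")

    return problems


if __name__ == "__main__":
    # Example paths are placeholders; substitute your own.
    issues = check_pretraining_paths(
        "./aki_notes.txt", "./biobert/", "./AKI_bert_simple_lm_notes/"
    )
    for issue in issues:
        print(issue)
```

An empty result only means the paths look plausible; it does not validate the contents of the model directory.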
Assuming your model path is './AKI_bert_simple_lm_notes/', run
python3 aki_bert.py --gpu 0 --bert_model './AKI_bert_simple_lm_notes/' --do_train --num_train_epochs 5 --dropout 0 --pooling max --sampling SBS --max_seq_per_doc 180 --max_seq_length 32
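The flags --max_seq_length, --max_seq_per_doc, and --pooling suggest that each note is split into short token sequences whose representations are then pooled into a single document vector. The sketch below illustrates that general idea in plain Python. It is an assumption about what the flags mean, not a description of aki_bert.py: the function names split_into_sequences and max_pool are hypothetical, whitespace tokens stand in for WordPiece tokens, and the vectors are toy numbers rather than BERT outputs. The --sampling SBS option is not covered.

```python
def split_into_sequences(tokens, max_seq_length=32, max_seq_per_doc=180):
    """Cut a token list into consecutive chunks of at most max_seq_length tokens,
    keeping at most max_seq_per_doc chunks per document.

    Simplification: a real BERT input also reserves positions for the
    [CLS] and [SEP] special tokens, which this sketch ignores.
    """
    chunks = [
        tokens[start:start + max_seq_length]
        for start in range(0, len(tokens), max_seq_length)
    ]
    return chunks[:max_seq_per_doc]


def max_pool(vectors):
    """Element-wise maximum over a non-empty list of equal-length vectors."""
    if not vectors:
        raise ValueError("need at least one vector to pool")
    return [max(column) for column in zip(*vectors)]


if __name__ == "__main__":
    # Toy note: 5 words repeated 20 times gives 100 whitespace tokens.
    tokens = ("patient admitted with elevated creatinine " * 20).split()
    sequences = split_into_sequences(tokens)
    print([len(seq) for seq in sequences])  # [32, 32, 32, 4]

    # Toy per-sequence vectors standing in for encoder outputs.
    print(max_pool([[0.1, 0.9], [0.4, 0.2]]))  # [0.4, 0.9]
```

Under this reading, a document longer than 180 × 32 tokens would be truncated, so very long notes contribute only their first sequences unless the sampling strategy selects them differently.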
Please acknowledge the following work in papers or derivative software:
@inproceedings{mao2020pre,
title={A pre-trained clinical language model for acute kidney injury},
author={Mao, Chengsheng and Yao, Liang and Luo, Yuan},
booktitle={2020 IEEE International Conference on Healthcare Informatics (ICHI)},
pages={1--2},
year={2020},
organization={IEEE}
}