- python==3.6.2
- torch==1.9.1+cu111
- torchvision==0.10.1+cu111
- tensorboardX==2.4
- transformers==2.9.0
- sklearn
Datasets are available on the GLUE benchmark website (https://gluebenchmark.com/tasks), and pre-trained models are available on the HuggingFace hub (https://huggingface.co/models). The current version supports BERT/RoBERTa series pre-trained models.
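For reference, a hub checkpoint can be loaded as below. This is a minimal sketch (not part of the repository); the checkpoint name and label count are illustrative.

```python
# Minimal sketch: loading a BERT checkpoint for sequence classification.
# The checkpoint name and num_labels are illustrative (e.g., SST-2 has 2 labels).
from transformers import BertTokenizer, BertForSequenceClassification

bert_path = 'bert-base-uncased'  # or a RoBERTa checkpoint / local path
tokenizer = BertTokenizer.from_pretrained(bert_path)
model = BertForSequenceClassification.from_pretrained(bert_path, num_labels=2)
```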
An example of fine-tuning BERT with AD-DROP:
- Preprocess the datasets:
python data_process.py --bert_path='bert-base-uncased' --model='BERT'
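The exact preprocessing is defined in data_process.py; as a rough illustration only, encoding one GLUE sentence pair with transformers 2.9.0 looks like the following (max length, padding, and caching format are assumptions).

```python
# Illustrative only: encoding one GLUE sentence pair; data_process.py may differ
# in max length, padding, and how the encoded examples are cached to disk.
from transformers import BertTokenizer

tokenizer = BertTokenizer.from_pretrained('bert-base-uncased')
enc = tokenizer.encode_plus(
    'A man is playing a guitar.',        # sentence 1
    'A person plays an instrument.',     # sentence 2 (omit for single-sentence tasks)
    max_length=128,
    pad_to_max_length=True,              # padding argument used in transformers 2.9.0
)
print(len(enc['input_ids']), sum(enc['attention_mask']))
```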
- (Optional) Fine-tune a base model with the original fine-tuning approach:
python main.py --option='train' --model='BERT' --bert_path='bert-base-uncased'
or
>> bash run_ft.sh
- Fine-tune a model with our AD-DROP:
python main.py --option='train' --do_mask --attribution='GA' --p_rate=0.3 --q_rate=0.3 --mask_layers='0' --model='BERT' --bert_path='bert-base-uncased'
- Set different dropping strategies via the parameter --attribution (options: 'GA', 'AA', 'IGA', 'RD'); a rough sketch of the 'GA' computation is given below.
- Set the parameter --mask_layers='0,1,2,3' to apply AD-DROP in multiple layers.
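As an illustrative sketch (not the repository's exact code), gradient-based attribution ('GA') can be read as the attention weight times the gradient of the gold-class logit with respect to it; AD-DROP then masks part of the highest-scoring positions in the selected layers, controlled by --p_rate and --q_rate.

```python
# Illustrative sketch of gradient-based attribution (GA) over an attention map;
# the repository's implementation may differ in shapes and normalization.
import torch

def ga_attribution(attn: torch.Tensor, gold_logit: torch.Tensor) -> torch.Tensor:
    """attn: attention weights kept in the autograd graph, shape (heads, seq, seq);
    gold_logit: scalar logit of the gold class for the current example."""
    grad = torch.autograd.grad(gold_logit, attn, retain_graph=True)[0]
    return attn * grad  # higher score = position contributes more to the prediction
```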
- We provide a script for searching for the best settings of 'p_rate' and 'q_rate':
>> bash run_addrop.sh
- The script will save all log files automatically. We provide 'log2excel.py' to collect the best settings.
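As a rough illustration, the sweep amounts to re-running main.py over a grid of (p_rate, q_rate) values; the ranges below are assumptions, and run_addrop.sh may sweep different values or pass additional flags.

```python
# Hypothetical grid search over p_rate/q_rate; ranges are illustrative and the
# real run_addrop.sh may differ.
import itertools
import subprocess

for p, q in itertools.product([0.1, 0.3, 0.5], repeat=2):
    subprocess.run([
        'python', 'main.py', '--option=train', '--do_mask', '--attribution=GA',
        f'--p_rate={p}', f'--q_rate={q}', '--mask_layers=0',
        '--model=BERT', '--bert_path=bert-base-uncased',
    ], check=True)
```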
- Test the fine-tuned model on the test set:
python main.py --option='test' --main_cuda='cpu' --model_path='model/finetuned_BERT_AD.pth'
- It will generate a '.tsv' file for evaluation on the GLUE leaderboard.
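For illustration, the GLUE leaderboard expects a tab-separated file with 'index' and 'prediction' columns; the task name and labels below are placeholders.

```python
# Illustrative only: writing test predictions in the GLUE submission format.
import csv

predictions = ['entailment', 'not_entailment']  # decoded test-set predictions (example)
with open('RTE.tsv', 'w', newline='') as f:
    writer = csv.writer(f, delimiter='\t')
    writer.writerow(['index', 'prediction'])
    for i, label in enumerate(predictions):
        writer.writerow([i, label])
```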
- python==3.6.2
- torch==1.9.1+cu111
- transformers==4.7.0
- sacrebleu==2.2.0
The supported pre-trained models are ELECTRA and OPUS-MT. We perform the two tasks by following the official HuggingFace Colab notebooks; please refer to the HuggingFace Token Classification and Translation tutorials for details.
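As a minimal sketch (checkpoint names and label count are illustrative), the two task heads can be loaded with transformers 4.7.0 as follows; the training loops then follow the HuggingFace example notebooks.

```python
# Illustrative checkpoints: loading an ELECTRA head for token classification
# and an OPUS-MT model for translation with transformers 4.7.0.
from transformers import AutoTokenizer, ElectraForTokenClassification, MarianMTModel

# Token classification (e.g., NER); num_labels is task-dependent.
ner_tokenizer = AutoTokenizer.from_pretrained('google/electra-base-discriminator')
ner_model = ElectraForTokenClassification.from_pretrained(
    'google/electra-base-discriminator', num_labels=9)

# Translation with an OPUS-MT checkpoint (English-to-Romanian as an example).
mt_tokenizer = AutoTokenizer.from_pretrained('Helsinki-NLP/opus-mt-en-ro')
mt_model = MarianMTModel.from_pretrained('Helsinki-NLP/opus-mt-en-ro')
```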
@inproceedings{
yang2022addrop,
title={{AD}-{DROP}: Attribution-Driven Dropout for Robust Language Model Fine-Tuning},
author={Tao Yang and Jinghao Deng and Xiaojun Quan and Qifan Wang and Shaoliang Nie},
booktitle={Thirty-Sixth Conference on Neural Information Processing Systems},
year={2022}
}