Code for the paper "Is BERT Robust to Label Noise? A Study on Learning with Noisy Labels in Text Classification".
We run our experiments on the following four datasets: AG-News, IMDB, Yorùbá, and Hausa.
| Name   | # train | # validation | # test | Data source |
|--------|---------|--------------|--------|-------------|
| AGNews | 108,000 | 12,000       | 7,600  | LINK        |
| IMDB   | 21,246  | 3,754        | 2,500  | LINK        |
| Yorùbá | 1,340   | 189          | 379    | LINK        |
| Hausa  | 2,045   | 290          | 582    | LINK        |
Create a `data` directory as the home directory for all datasets, then create a folder for each dataset inside it. For each dataset, create the following files:

- `[train/validation/test].txt` files: one document per line.
- `[train/validation/test]_labels.pickle` files: a list of labels (we assume labels are already encoded as label ids, e.g., `[0, 0, 1, 2, 3, ..., 4]`).
- If there is no validation data for a given dataset, there is no need to create the corresponding validation files.
- Save the files in a `txt_data` directory inside the dataset folder. For example, the full path to the `train.txt` file of the AGNews dataset is `data/AGNews/txt_data/train.txt`.
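The layout above can be produced with a short script. This is a minimal sketch, assuming your documents and label ids are already in memory; `save_split` is a hypothetical helper, not part of the repository.

```python
import os
import pickle

# Hypothetical helper: write one split of a dataset into the layout
# described above (txt_data/<split>.txt and txt_data/<split>_labels.pickle).
def save_split(data_root, dataset, split, documents, label_ids):
    out_dir = os.path.join(data_root, dataset, "txt_data")
    os.makedirs(out_dir, exist_ok=True)
    # One document per line, so newlines inside a document must be removed.
    with open(os.path.join(out_dir, f"{split}.txt"), "w", encoding="utf-8") as f:
        f.write("\n".join(doc.replace("\n", " ") for doc in documents))
    # Labels are stored as a plain pickled list of integer label ids.
    with open(os.path.join(out_dir, f"{split}_labels.pickle"), "wb") as f:
        pickle.dump(label_ids, f)

# Example: two toy AGNews training documents with label ids 0 and 2.
save_split("data", "AGNews", "train",
           ["first document", "second document"], [0, 2])
```

Run the same call once per split (`train`, `validation`, `test`) and per dataset.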
Run BERT-WN on AG-News with 20% single-flip label noise:
```shell
CUDA_VISIBLE_DEVICES=[CUDA_ID] python3 ../main.py \
    --dataset AG-News \
    --log_root [LOG_ROOT] \
    --data_root [DATA_ROOT] \
    --trainer_name bert_wn \
    --model_name bert-base-uncased \
    --gen_val \
    --nl_batch_size 32 \
    --eval_batch_size 32 \
    --gradient_accumulation_steps 1 \
    --max_sen_len 64 \
    --lr 0.00002 \
    --num_training_steps 3000 \
    --patience 25 \
    --eval_freq 50 \
    --store_model [STORE_MODEL] \
    --noise_level 0.2 \
    --noise_type sflip \
    --manualSeed 1234
```
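For intuition about what `--noise_type sflip --noise_level 0.2` does to the training labels, here is a minimal sketch of single-flip noise: each class is corrupted to one fixed target class with probability equal to the noise level. The cyclic flip map used here (`c → c+1 mod num_classes`) is an illustrative assumption, not necessarily the exact map in `main.py`.

```python
import random

def single_flip(labels, num_classes, noise_level, seed=1234):
    """Flip each label to one fixed other class with prob. `noise_level`."""
    rng = random.Random(seed)
    # Assumed flip map: each class is sent to the next class, cyclically.
    flip_to = {c: (c + 1) % num_classes for c in range(num_classes)}
    return [flip_to[y] if rng.random() < noise_level else y for y in labels]

clean = [0, 1, 2, 3] * 250          # 1000 toy labels over 4 classes
noisy = single_flip(clean, num_classes=4, noise_level=0.2)
frac = sum(a != b for a, b in zip(clean, noisy)) / len(clean)
print(f"fraction flipped: {frac:.2f}")  # close to the 20% noise level
```

Because each class has exactly one flip target, single-flip (asymmetric) noise is harder than uniform noise at the same level: the corrupted labels are systematically biased toward specific classes rather than spread evenly.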