# MRQA test suite on PyTorch Lightning

Easily run the MRQA test suite with any `*ForQuestionAnswering` model from `transformers`. `pytorch-lightning` will manage all the hardware resources, allowing you to run on CPU, GPU, multi-GPU and multi-node multi-GPU without changing a single line of code.
- Updated to use the latest versions of `pytorch-lightning` and `transformers`.
- Works with every `*ForQuestionAnswering` model in the `transformers` library.
- Much faster dataset preparation thanks to multiprocessing.
- Multi-GPU training, checkpointing, logging and much more thanks to `pytorch-lightning`.
- The `datasets` library greatly reduces memory usage.
- Works with every model, not just `BERT` as in the original project.
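
As a quick, illustrative check (not part of this repository's code), any checkpoint that can be loaded through `AutoModelForQuestionAnswering` should work as the value of `--pre_trained_model`:

```python
from transformers import AutoModelForQuestionAnswering, AutoTokenizer

# Load a checkpoint with a question-answering head on top; "roberta-base"
# mirrors the fine-tuning examples below (the QA head is freshly initialized
# until the model is fine-tuned).
tokenizer = AutoTokenizer.from_pretrained("roberta-base")
model = AutoModelForQuestionAnswering.from_pretrained("roberta-base")
print(type(model).__name__)  # RobertaForQuestionAnswering
```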
Start by installing the requirements:

```bash
pip install -r requirements.txt
```
The training/dev/test datasets are automatically downloaded. You can choose training and validation sets among:

- NewsQA
- NaturalQuestionsShort
- TriviaQA-web
- SearchQA
- HotpotQA
- SQuAD

and test sets among:

- RACE
- DuoRC.ParaphraseRC
- BioASQ
- TextbookQA
- RelationExtraction
- DROP
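
If you just want to peek at the raw data outside of the training script, a minimal sketch with the `datasets` library looks like this; it assumes the MRQA benchmark is published on the Hugging Face Hub under the `mrqa` name with `subset` and `question` fields (an assumption based on the `--dataset mrqa` flag used below):

```python
from datasets import load_dataset

# Stream a few training examples without downloading the full benchmark.
# The "mrqa" dataset name and the "subset"/"question" field names are
# assumptions, not guaranteed by this repository.
stream = load_dataset("mrqa", split="train", streaming=True)
for example in stream.take(3):
    print(example["subset"], example["question"][:80])
```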
Now run your fine-tuning, for example on HotpotQA and SearchQA:

```bash
python main.py \
    --accelerator gpu \
    --pre_trained_model roberta-base \
    --name roberta-base-finetuned-hotpotqa \
    --dataset mrqa \
    --train_split train --train_subsets HotpotQA SearchQA \
    --val_split validation --val_subsets HotpotQA \
    --test_split test --test_subsets DROP RACE \
    --batch_size 16 \
    --val_batch_size 32 \
    --test_batch_size 32 \
    --accumulate_grad_batches 2 \
    --learning_rate 1e-5 \
    --max_epochs 4 \
    --max_sequence_length 512 \
    --doc_stride 128 \
    --val_check_interval 0.2 \
    --output_dir outputs/mrqa-training/hotpot_search \
    --num_warmup_steps 100 \
    --monitor validation/f1 \
    --patience 5 \
    --early_stopping True \
    --num_workers 8
```
If you want to use multiple GPUs (say 8), set `--accelerator gpu --strategy ddp --devices 8`.

Please refer to the `pytorch-lightning` documentation for the training hyperparameters.
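
A couple of clarifications, assuming the usual `transformers` question-answering preprocessing and `pytorch-lightning` semantics: `--max_sequence_length 512` and `--doc_stride 128` split long contexts into overlapping 512-token windows with a 128-token overlap, and `--val_check_interval 0.2` runs validation every 20% of a training epoch (five times per epoch).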
Another example with multiple GPUs:

```bash
python main.py \
    --accelerator gpu --devices 8 --strategy ddp \
    --pre_trained_model roberta-base \
    --name roberta-base-finetuned-hotpotqa \
    --dataset mrqa \
    --train_split train --train_subsets HotpotQA \
    --val_split validation --val_subsets HotpotQA \
    --test_split test --test_subsets DROP \
    --batch_size 32 \
    --val_batch_size 32 \
    --test_batch_size 32 \
    --accumulate_grad_batches 2 \
    --learning_rate 1e-5 \
    --max_epochs 4 \
    --max_sequence_length 512 \
    --doc_stride 128 \
    --val_check_interval 0.2 \
    --output_dir outputs/mrqa-training/hotpotqa \
    --num_warmup_steps 100 \
    --monitor validation/f1 \
    --patience 5 \
    --early_stopping True \
    --num_workers 8
```
Get all the available hyperparameters with:

```bash
python main.py --help
```

- `batch_size` is per single device (GPU, TPU, ...).
- `accumulate_grad_batches` enables you to use larger effective batch sizes when memory is a constraint (see the example after this list).
- `max_steps` sets the maximum number of steps instead of `max_epochs`.
- Use only `test_subsets` if you just want to test a model (no training).
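
For example, with `--batch_size 16`, `--accumulate_grad_batches 2` and 8 GPUs in DDP, the effective batch size is 16 × 2 × 8 = 256.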
- To force the datasets to be rebuilt, delete the cache folder and let the script create them again. The default cache folder is `~/.cache/mrqa-lightning`.
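
On Linux/macOS this amounts to something like `rm -rf ~/.cache/mrqa-lightning`.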
Copyright (c) 2019, Facebook, Inc. and its affiliates. All Rights Reserved. Project adapted by Luca Di Liello from https://github.com/facebookresearch/SpanBERT/blob/0670d8b6a38f6714b85ea7a033f16bd8cc162676/code/run_mrqa.py