This is a simple implementation of Multi-Passage BERT (not identical to the paper, but similar).
Tested on the DuReaderV2 dataset. Using the SQuAD evaluation script, we got:
"AVERAGE": "20.993"
"F1": "30.572"
"EM": "11.414"
Emm, not good.
Chinese Version:
A simple demo can be found here: AiSK
A brief intro to OpenQA can be found in my blog: OpenQA
- tensorflow-gpu == 1.15 or 1.14
- tqdm
- horovod (optional)
Since each training example contains 5 documents, we can't set the batch size too large. We use:
- Mixed-precision training to speed up training
- Gradient accumulation to enlarge the effective batch size
- Distributed training (in practice we only use one server with two 2080Ti GPUs)
If you only have one GPU, mixed precision + gradient accumulation works fine; a rough sketch is shown after the note below.
Note: If you want to use distributed training, you should install Horovod. The easiest way to get Horovod is to use the NVIDIA Docker image; we use the one with tag 19.10-py3.
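For reference, gradient accumulation combined with the TF 1.x automatic mixed precision rewrite looks roughly like the sketch below. This is a minimal illustration, not the exact code in this repo: the function name `create_train_op`, the default `accum_steps=4`, and the overall wiring are assumptions.

```python
import tensorflow as tf  # tensorflow-gpu 1.14 / 1.15


def create_train_op(loss, learning_rate, accum_steps=4):
    """Minimal sketch: gradient accumulation + AMP graph rewrite (TF 1.x).

    The effective batch size becomes per_gpu_batch_size * accum_steps.
    Names and defaults here are illustrative, not the exact code in this repo.
    """
    opt = tf.train.AdamOptimizer(learning_rate)
    # Automatic mixed precision (TF >= 1.14): rewrites the graph to fp16 where safe
    # and wraps the optimizer with dynamic loss scaling.
    opt = tf.train.experimental.enable_mixed_precision_graph_rewrite(opt)

    # For distributed training, Horovod would additionally wrap the optimizer here:
    #   import horovod.tensorflow as hvd; hvd.init()
    #   opt = hvd.DistributedOptimizer(opt)

    global_step = tf.train.get_or_create_global_step()
    grads_and_vars = [(g, v) for g, v in opt.compute_gradients(loss) if g is not None]

    # One non-trainable accumulator per variable, plus a micro-step counter.
    accum_vars = [tf.Variable(tf.zeros_like(v.initialized_value()), trainable=False)
                  for _, v in grads_and_vars]
    micro_step = tf.Variable(0, trainable=False, dtype=tf.int32)

    accum_ops = [a.assign_add(g) for a, (g, _) in zip(accum_vars, grads_and_vars)]

    def _apply_and_reset():
        # Average the accumulated gradients, apply them, then reset the accumulators.
        apply_op = opt.apply_gradients(
            [(a / accum_steps, v) for a, (_, v) in zip(accum_vars, grads_and_vars)],
            global_step=global_step)
        with tf.control_dependencies([apply_op]):
            reset_ops = [a.assign(tf.zeros_like(a)) for a in accum_vars]
            with tf.control_dependencies(reset_ops):
                return tf.constant(True)

    def _accumulate_only():
        return tf.constant(False)

    with tf.control_dependencies(accum_ops):
        step = micro_step.assign_add(1)
        applied = tf.cond(tf.equal(step % accum_steps, 0),
                          _apply_and_reset, _accumulate_only)
    return applied
```

Treat this as the shape of the trick rather than a drop-in replacement; the actual switches used by run_mpb.sh may differ.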
- download the DuReader dataset and unzip it
- run preprocess/preprocess.sh to preprocess the dataset
- run preprocess/convert_dureader_to_squad.py to convert the dataset into a SQuAD-like dataset
- run run_mpb.sh to train the model
- run run_predict.sh to predict with the model
- run squad_evalute.py to get the evaluation results (see the note on the predictions format below)
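If squad_evalute.py follows the official SQuAD v1.1 evaluator, it compares the dataset JSON against a predictions JSON that maps each question id to a single answer string. A minimal sketch of writing such a file (the ids and file name here are placeholders, not taken from this repo):

```python
import json

# Hypothetical predictions: map each question id to the best predicted answer string,
# which is the format the official SQuAD v1.1 evaluation script expects.
predictions = {
    "QUESTION_ID_0": "predicted answer text",
    "QUESTION_ID_1": "another predicted answer",
}

with open("predictions.json", "w", encoding="utf-8") as f:
    json.dump(predictions, f, ensure_ascii=False)
```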
Actually, in our real project we don't use Multi-Passage BERT. We use one MRC model plus one answer-ranker model, because the two models can be trained and optimized separately.
This code was written for practice, and I don't have enough time to test or improve it. Some of the code was copied from my Jupyter notebook, so you may need to fix a few errors to get it running.
- ACL2018-DuReader
  - data preprocessing
- Dureader-Bert
  - data preprocessing, prediction
- CMRC2018-Baselines
  - training
- NVIDIA-BERT
  - distributed training
  - AMP
  - gradient accumulation
- OpenQA
  - model part