An implementation of QANet with PyTorch.
It currently reaches EM/F1 = 70.5/77.2 after 20 epochs, which takes about 20 hours on a single 1080 Ti card.
This repo is currently being re-implemented.
Python 3.6 & PyTorch 0.4
- Install PyTorch 0.4 for Python 3.6+
- Run
pip install spacy tqdm ujson requests
- Run
python -m spacy download en
- Run
python main.py
dataset.py: downloads and parses the dataset.
main.py: program entry.
models.py: QANet structure.
- The paper doesn't say which activation function is used; I use ReLU.
- I don't make the embedding of <UNK> trainable.
- The connector between the embedding layers and the embedding encoders may differ from Google's implementation: the paper's description is inconsistent (a residual block can't be used because the input and output dimensions differ), and it doesn't say how the connection is implemented.
- The maximum passage length is 300 instead of 400 because my GPU memory is limited.
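To make the differences above concrete, here is a minimal sketch of one way they could fit together. It is not necessarily the code in models.py: the class name `EmbeddingConnector`, the 1x1 convolution used as the connector, the placement of the ReLU, and all sizes except the 300-token limit are assumptions chosen for illustration.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class EmbeddingConnector(nn.Module):
    """Hypothetical connector from frozen embeddings to the encoder width."""

    def __init__(self, pretrained, d_model):
        super().__init__()
        # freeze=True keeps every row fixed, including the <UNK> row.
        self.embed = nn.Embedding.from_pretrained(pretrained, freeze=True)
        # Assumption: a 1x1 convolution changes the channel count, so no
        # residual connection (which needs equal in/out sizes) is involved.
        self.proj = nn.Conv1d(pretrained.size(1), d_model, kernel_size=1)

    def forward(self, token_ids, max_len=300):
        token_ids = token_ids[:, :max_len]   # truncate long passages
        x = self.embed(token_ids)            # (batch, length, d_emb)
        x = self.proj(x.transpose(1, 2))     # (batch, d_model, length)
        return F.relu(x)                     # assumed activation placement


torch.manual_seed(0)
vectors = torch.randn(1000, 50)              # stand-in for pretrained vectors
connector = EmbeddingConnector(vectors, d_model=128)
passage = torch.randint(0, 1000, (2, 400))   # two passages of 400 token ids
out = connector(passage)
print(out.shape)  # torch.Size([2, 128, 300])
```

A learned projection is only one option here; a highway network followed by a convolution would also resolve the dimension mismatch, at the cost of more parameters.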
- Reduce memory usage
- Performance analysis
- Reach the state-of-the-art scores of the original paper
- Ablation analysis