To set up and install dependencies, clone or download this repository onto your local machine. Create and activate the conda environment for the SQuAD baseline with the following commands.
conda env create -f environment.yml
conda activate squad_baseline
Next, download the file data.zip, which contains the processed dataset, embedding vectors, and the GloVe pre-trained word vectors glove.840B.300d.txt, from the following link:
https://drive.google.com/open?id=1cPpDSnXKm-Grh7yW8nU3EQq1skfVBBnm
Unzip the file and place the entire /data folder in the same folder as the script files. If everything is set up correctly, the scripts and data should be in the following structure:
- args.py
- train.py
- models.py
- layers.py
- util.py
- /data/char_emb.json
- /data/char2idx.json
- /data/word_emb.json
- /data/word2idx.json
- /data/dev.npz
- /data/train.npz
- /data/test.npz
- /data/glove.840B.300d/glove.840B.300d.txt
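The layout above can be checked automatically before training. The following is a minimal sketch, not part of the repository: the helper name find_missing and the EXPECTED_FILES list are illustrative, and the paths are copied from the structure listed above, taken relative to the folder that holds the scripts.

```python
from pathlib import Path

# Paths copied from the layout listed above, relative to the script folder.
EXPECTED_FILES = [
    "args.py",
    "train.py",
    "models.py",
    "layers.py",
    "util.py",
    "data/char_emb.json",
    "data/char2idx.json",
    "data/word_emb.json",
    "data/word2idx.json",
    "data/dev.npz",
    "data/train.npz",
    "data/test.npz",
    "data/glove.840B.300d/glove.840B.300d.txt",
]


def find_missing(root="."):
    """Return the expected paths that are not regular files under `root`."""
    base = Path(root)
    return [p for p in EXPECTED_FILES if not (base / p).is_file()]


# Hypothetical usage: run from the folder that contains the scripts.
missing = find_missing(".")
if missing:
    print("Missing files:")
    for path in missing:
        print("  " + path)
else:
    print("All expected files are present.")
```

Saved under any name in the script folder and run with python, this prints the paths that still need to be put in place, if any.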
If all the files are present, you're good to go. To train the baseline with character embeddings, run:
python train.py --name <label>
Example:
python train.py --name baseline_char_embed
To train without character embeddings, add --use_char_emb 0:
python train.py --name <label> --use_char_emb 0
Example:
python train.py --name baseline_without_char_embed --use_char_emb 0
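For readers unfamiliar with how such flags are usually wired up, here is a rough sketch of an argparse parser that accepts the two options used in the commands above. It is an assumption-based illustration, not the contents of args.py: the function name build_parser, the help strings, and the default value of 1 for --use_char_emb are all hypothetical.

```python
import argparse


def build_parser():
    """Hypothetical parser; the actual definitions live in args.py."""
    parser = argparse.ArgumentParser(description="Train the baseline model")
    parser.add_argument("--name", required=True,
                        help="Label used to identify this training run.")
    # Assumption: 1 enables character embeddings and is the default.
    parser.add_argument("--use_char_emb", type=int, choices=[0, 1], default=1,
                        help="Set to 0 to train without character embeddings.")
    return parser


# Parse an explicit argument list instead of sys.argv so the sketch
# runs anywhere; this mirrors the second example command above.
args = build_parser().parse_args(
    ["--name", "baseline_without_char_embed", "--use_char_emb", "0"]
)
print(args.name, bool(args.use_char_emb))
```

Using an integer flag restricted to 0 or 1, rather than a boolean switch, matches the form of the command line shown above; consult args.py for the options the scripts actually accept.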
This code package is adapted from the SQuAD baseline by Chris Chute:
https://github.com/chrischute/squad.git
Scripts available here:
- args.py: available arguments
- train.py: main training script
- models.py: BiDAF model
- layers.py: implementation of different layers in the model
- util.py: utilities script