- Make sure you have Miniconda installed
  - Conda is a package manager that sandboxes your project’s dependencies in a virtual environment
  - Miniconda contains Conda and its dependencies with no extra packages by default (as opposed to Anaconda, which installs some extra packages)
- `cd` into `src`, run `conda env create -f environment.yml`
  - This creates a Conda environment called `squad`
- Run `conda activate squad`
  - This activates the `squad` environment
  - Do this each time you want to write/test your code
- Run `python setup.py`
  - This downloads SQuAD 2.0 training and dev sets, as well as the GloVe 300-dimensional word vectors (840B)
  - This also pre-processes the dataset for efficient data loading
  - For a MacBook Pro on the Stanford network, `setup.py` takes around 30 minutes total
- Browse the code in `train.py`
  - The `train.py` script is the entry point for training a model. It reads command-line arguments, loads the SQuAD dataset, and trains a model.
  - You may find it helpful to browse the arguments provided by the starter code. Either look directly at the `parser.add_argument` lines in the source code, or run `python train.py -h`.
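
To make the link between `parser.add_argument` lines and the `-h` output concrete, here is a minimal sketch using Python's standard `argparse` module. The flag names (`--name`, `--batch_size`, `--lr`) and their defaults are hypothetical and chosen only for illustration; they are not necessarily the arguments defined by the starter code, so check `train.py` itself for the real list.

```python
import argparse


def build_parser():
    """Build a toy parser. All flag names and defaults here are hypothetical."""
    parser = argparse.ArgumentParser(
        description="Illustrative training entry point (not the real train.py)")
    # Each add_argument call becomes one entry in the -h help text.
    parser.add_argument("--name", type=str, required=True,
                        help="Name used to identify this training run.")
    parser.add_argument("--batch_size", type=int, default=64,
                        help="Number of examples per training batch.")
    parser.add_argument("--lr", type=float, default=0.5,
                        help="Learning rate.")
    return parser


# Parse an explicit argument list instead of sys.argv so the sketch
# can be run anywhere without command-line input.
args = build_parser().parse_args(["--name", "demo", "--lr", "0.1"])
print(f"run={args.name} batch_size={args.batch_size} lr={args.lr}")

# format_help() returns the same text that the -h flag would print.
print(build_parser().format_help())
```

Reading the `add_argument` calls and reading the `-h` output give the same information: the first shows the types and defaults in code, the second shows them as formatted help text.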
jerrylzy/SQuAD-QANet
We implemented QANet from scratch and improved baseline BiDAF. We also used an ensemble of BiDAF and QANet models to achieve EM/F1 of 69.47/71.96, ranking #3 on the leaderboard as of Mar 4, 2022.