Code to train a dependency parser model.
To install dependencies, run:
$ conda env create -f environment.yml
Then install the appropriate version of PyTorch, for example:
$ conda install pytorch torchvision cpuonly -c pytorch
$ # conda install pytorch==1.0.0 torchvision==0.2.1 cuda80 -c pytorch
Download Universal Dependencies data from https://universaldependencies.org/#download, or run:
$ make get_ud
First preprocess the data for the language you are using:
$ python src/h01_data/process.py --language <language-code> --glove-file <glove-vectors-filename>
Here <language-code> is the ISO 639-1 code for the language, and <glove-vectors-filename> is the path to a txt file containing one word and its embedding per line. GloVe embeddings for Wikipedia can be trained with this repository.
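The expected GloVe file format can be sketched as follows. This is a hypothetical loader, not code from this repository; it assumes the standard GloVe release layout of one whitespace-separated word plus embedding values per line:

```python
# Hypothetical sketch: load GloVe-style vectors from a txt file with
# one word and its embedding per line (whitespace-separated, as in the
# standard GloVe release format). Not part of this repository's code.
import numpy as np

def load_glove(path):
    """Return a {word: vector} dict from a GloVe-style text file."""
    vectors = {}
    with open(path, encoding="utf-8") as f:
        for line in f:
            parts = line.rstrip().split(" ")
            word, values = parts[0], parts[1:]
            vectors[word] = np.asarray(values, dtype=np.float32)
    return vectors
```

Each vector in the file must have the same dimensionality; the preprocessing step pairs these vectors with the words in the treebank.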
Then, train the model with the command:
$ python src/h02_learn/train.py --language <language-code>
This code will, by default, train a Deep Biaffine Parser.
To train the model using the MST parser loss, add the argument --model mst.
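The core of a deep biaffine parser is the biaffine arc scorer of Dozat & Manning (2017), which scores every head-dependent pair. The numpy sketch below is illustrative only; the dimensions and names are assumptions, not the variables used in this repository:

```python
# Minimal sketch of a biaffine arc scorer (Dozat & Manning, 2017).
# Names and dimensions are hypothetical, not this repository's code.
import numpy as np

def biaffine_arc_scores(head, dep, U, b):
    """Return an (n, n) matrix s where
    s[i, j] = dep[i] @ U @ head[j] + head[j] @ b,
    i.e. the score of word j heading word i.
    head, dep: (n, d) head/dependent representations of the n words;
    U: (d, d) biaffine weight matrix; b: (d,) head-bias vector."""
    # dep @ U @ head.T gives the bilinear term; head @ b broadcasts a
    # per-head bias across each row of the score matrix.
    return dep @ U @ head.T + head @ b
```

At decoding time the default parser picks the highest-scoring head per word, while the MST loss variant decodes a maximum spanning tree over the same score matrix.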
By default, the code looks for data in the ./data path. To change this (during either data preprocessing or training), use the argument --data-path <data-path>.