graph-dependency-parser

Modular implementation of a graph-based first-order dependency parser in AllenNLP. Decoding is performed with the Chu-Liu/Edmonds (CLE) algorithm.
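
To make "first-order" concrete: the score of a tree is the sum of its individual edge scores, and decoding looks for the highest-scoring set of edges that forms a valid tree. The sketch below is not the CLE implementation from this repository. It swaps in an exhaustive search that is only feasible for very short sentences, purely to illustrate the objective that CLE solves efficiently. The function names and the `scores[head][dependent]` layout are assumptions made for this example.

```python
from itertools import product


def is_tree(heads):
    """heads[i] is the head of token i+1; 0 is the artificial root.

    The assignment is a valid tree if every token reaches the root
    without running into a cycle.
    """
    for start in range(1, len(heads) + 1):
        seen = set()
        node = start
        while node != 0:
            if node in seen:
                return False
            seen.add(node)
            node = heads[node - 1]
    return True


def tree_score(scores, heads):
    """First-order (arc-factored) score: sum of the chosen edge scores."""
    return sum(scores[h][d] for d, h in enumerate(heads, start=1))


def brute_force_decode(scores):
    """Exhaustive stand-in for CLE; exponential, toy inputs only.

    Like plain CLE, it does not force a single root attachment.
    """
    n = len(scores) - 1
    best_score, best_heads = float("-inf"), None
    for heads in product(range(n + 1), repeat=n):
        if not is_tree(heads):
            continue
        score = tree_score(scores, heads)
        if score > best_score:
            best_score, best_heads = score, list(heads)
    return best_heads, best_score


# Hypothetical edge scores for a 3-token sentence; row = head, column = dependent.
scores = [
    [0, 5, 1, 1],   # root -> 1, 2, 3
    [0, 0, 10, 3],  # token 1 -> 2, 3
    [0, 8, 0, 9],   # token 2 -> 1, 3
    [0, 1, 2, 0],   # token 3 -> 1, 2
]
print(brute_force_decode(scores))  # -> ([0, 1, 2], 24)
```

Picking the best head for each token independently would give heads `[2, 1, 2]` here, which contains a cycle between tokens 1 and 2. This is why a tree-aware decoder is needed.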

Two edge models are currently implemented: Dozat & Manning (2016) and Kiperwasser & Goldberg (2016). In addition to the edge models, there are two loss functions: a softmax log-likelihood and a hinge loss, which requires running the CLE algorithm at training time.
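
The difference between the two losses can be sketched in plain Python. This is an illustration under assumptions, not the code in this repository: it assumes the softmax loss is a per-token softmax over candidate heads, and that the hinge loss uses cost-augmented scores (a cost of 1 for every non-gold edge). All function names are hypothetical. The predicted tree passed to `hinge_loss` is expected to come from a decoder such as CLE run on the augmented scores, which is why that loss needs decoding during training.

```python
import math


def softmax_nll(scores, gold_heads):
    """Sum over tokens of the negative log-softmax of the gold head.

    scores[h][d] is the score of the edge h -> d; index 0 is the root.
    """
    n = len(gold_heads)
    loss = 0.0
    for d, gold in enumerate(gold_heads, start=1):
        candidates = [scores[h][d] for h in range(n + 1) if h != d]
        m = max(candidates)  # subtract the max for numerical stability
        log_z = m + math.log(sum(math.exp(c - m) for c in candidates))
        loss += log_z - scores[gold][d]
    return loss


def cost_augment(scores, gold_heads, cost=1.0):
    """Return a copy of scores with `cost` added to every non-gold edge."""
    n = len(gold_heads)
    augmented = [row[:] for row in scores]
    for d, gold in enumerate(gold_heads, start=1):
        for h in range(n + 1):
            if h != gold and h != d:
                augmented[h][d] += cost
    return augmented


def hinge_loss(augmented, gold_heads, pred_heads):
    """Structured hinge loss given the tree decoded from augmented scores."""
    def total(heads):
        return sum(augmented[h][d] for d, h in enumerate(heads, start=1))
    return max(0.0, total(pred_heads) - total(gold_heads))


# Hypothetical edge scores for a 3-token sentence; row = head, column = dependent.
scores = [
    [0, 5, 1, 1],
    [0, 0, 10, 3],
    [0, 8, 0, 9],
    [0, 1, 2, 0],
]
gold = [2, 0, 2]        # gold heads for tokens 1..3
predicted = [0, 1, 2]   # assumed output of the decoder on the augmented scores
print(softmax_nll(scores, gold))
print(hinge_loss(cost_augment(scores, gold), gold, predicted))  # -> 8.0
```

The softmax loss only needs the score matrix and the gold heads, so it needs no decoding during training. The hinge loss compares whole trees, so every update depends on the decoder's output.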

Requirements

It is best to set up a conda environment.

Internal note: use /proj/irtg.shadow/conda/envs/allennlp

Running the code

Three example configurations are provided as .jsonnet files. To check that everything works and to train a parser on the English EWT corpus, run:

mkdir -p data/
cd data/
wget https://github.com/UniversalDependencies/UD_English-EWT/raw/master/en_ewt-ud-train.conllu
wget https://github.com/UniversalDependencies/UD_English-EWT/raw/master/en_ewt-ud-dev.conllu

cd ../
mkdir -p models/
bash example_train.sh

echo Evaluating on development set

bash example_evaluate.sh

If you work with the code in an IDE, you may want to use run.py instead of the bash scripts.