Source code of "Headed Span-Based Projective Dependency Parsing" and "Combining (second-order) graph-based and headed span-based projective dependency parsing"
Prepare environment:
conda create -n parsing python=3.7
conda activate parsing
while read requirement; do pip install $requirement; done < requirements.txt
Prepare dataset:
You can download the datasets I used from link.
Run:
python train.py +exp=base datamodule=a model=b seed=0
a={ptb, ctb, ud2.2}
b={biaffine, biaffine2o, span, span1o, span1oheadsplit, span2oheadsplit}
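The option values listed above can be checked before a run is launched. The following is a minimal sketch and not part of this repository: the helper name `build_train_command` is hypothetical, and only the option values and the shape of the command line are taken from the text above.

```python
from typing import List

# Option values copied from the lists above.
DATAMODULES = ("ptb", "ctb", "ud2.2")
MODELS = (
    "biaffine",
    "biaffine2o",
    "span",
    "span1o",
    "span1oheadsplit",
    "span2oheadsplit",
)


def build_train_command(datamodule: str, model: str, seed: int = 0) -> List[str]:
    """Return the train.py command as an argument list (hypothetical helper).

    Raises ValueError if the datamodule or model is not one of the listed values.
    """
    if datamodule not in DATAMODULES:
        raise ValueError(f"unknown datamodule {datamodule!r}; expected one of {DATAMODULES}")
    if model not in MODELS:
        raise ValueError(f"unknown model {model!r}; expected one of {MODELS}")
    return [
        "python",
        "train.py",
        "+exp=base",
        f"datamodule={datamodule}",
        f"model={model}",
        f"seed={seed}",
    ]


if __name__ == "__main__":
    # Print the command for one valid combination.
    print(" ".join(build_train_command("ptb", "biaffine")))
```

The argument list can be passed to `subprocess.run` from the repository root if launching from Python is preferred over typing the command.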
Multirun example:
python train.py +exp=base datamodule=ud2.2 model=b datamodule.ud_lan=de,it,en,ca,cs,es,fr,no,ru,es,nl,bg seed=0,1,2 --multirun
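In Hydra's multirun mode, each comma-separated override is a list of choices, and one job is launched per combination. The sketch below mimics that expansion in plain Python with `itertools.product`; it does not call Hydra, the function name `expand_sweep` is hypothetical, and the language list is shortened for illustration.

```python
from itertools import product
from typing import Dict, List


def expand_sweep(overrides: Dict[str, str]) -> List[List[str]]:
    """Expand comma-separated override values into one override list per job."""
    keys = list(overrides)
    # Split each value on commas to get the choices for that key.
    choices = [overrides[key].split(",") for key in keys]
    # One job per element of the Cartesian product of all choices.
    return [
        [f"{key}={value}" for key, value in zip(keys, combo)]
        for combo in product(*choices)
    ]


if __name__ == "__main__":
    # Shortened example: 3 languages x 3 seeds.
    jobs = expand_sweep({"datamodule.ud_lan": "de,it,en", "seed": "0,1,2"})
    print(len(jobs), "jobs")
    for job in jobs:
        print(" ".join(job))
```

Counting the jobs this way is a quick check on how much compute a sweep will need before it is submitted.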
For UD, you also need a Java environment, which MaltParser requires.
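Whether a Java executable is reachable can be checked before starting a UD run. This is a minimal sketch, assuming that having `java` on `PATH` is sufficient; it does not check the Java version or MaltParser itself, and the function name `java_available` is hypothetical.

```python
import shutil


def java_available() -> bool:
    """Return True if a `java` executable is found on PATH."""
    return shutil.which("java") is not None


if __name__ == "__main__":
    if java_available():
        print("java found on PATH")
    else:
        print("java not found on PATH; install a Java runtime before running on UD")
```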
TODO:
- Clean up the code (e.g. add comments).
- Add eval.py; currently only training from scratch is supported.
- Release pre-trained models.
Please let me know if there are any bugs. Also, feel free to contact bestsonta@gmail.com if you have any questions.
@misc{yang2021headed,
  title={Headed Span-Based Projective Dependency Parsing},
  author={Songlin Yang and Kewei Tu},
  year={2021},
  eprint={2108.04750},
  archivePrefix={arXiv},
  primaryClass={cs.CL}
}
@misc{yang2021combining,
  title={Combining (second-order) graph-based and headed span-based projective dependency parsing},
  author={Songlin Yang and Kewei Tu},
  year={2021},
  eprint={2108.05838},
  archivePrefix={arXiv},
  primaryClass={cs.CL}
}
The code is based on the lightning+hydra template. I use FastNLP as the dataloader, and many built-in modules (LSTMs, Biaffines, Triaffines, dropout layers, etc.) from Supar.