Multi-task Learning Based Neural Bridging Reference Resolution

Introduction

This repository contains code introduced in the following paper:

Multi-task Learning Based Neural Bridging Reference Resolution
Juntao Yu and Massimo Poesio. In Proceedings of the 28th International Conference on Computational Linguistics (COLING), 2020.

The code is written in Python 2; compatibility with Python 3 is not guaranteed.
Before starting, install the required packages listed in requirements.txt using pip install -r requirements.txt.
After that, modify and run extract_bert_features/extract_bert_features.sh to compute the BERT embeddings for your training or test data.
You also need to download the context-independent word embeddings required by the system, such as fastText or GloVe embeddings.

Pre-trained models can be downloaded from this link. We provide the pre-trained models for ARRAU RST reported in our paper; if you need other models, please contact me.
Choose the model you want to use and copy it to the logs/ folder.
Modify the test_path accordingly in experiments.conf:
- test_path is the path to a .jsonlines file. Each line of the .jsonlines file is a batch of sentences and must be in the following format:
```
{
"clusters": [[[0,0],[5,5]],[[2,3],[7,8]]], #Coreference
"bridging_pairs": [[[14,15],[2,3]],....], #Bridging
"doc_key": "nw",
"sentences": [["John", "has", "a", "car", "."], ["He", "washed", "the", "car", "yesterday","."],["How","is","the", "left", "wheel","?"]],
"speakers": [["sp1", "sp1", "sp1", "sp1", "sp1"], ["sp1", "sp1", "sp1", "sp1", "sp1","sp1"],["sp2","sp2","sp2","sp2","sp2","sp2"]] #Optional
}
```
- For coreference, each mention has only two properties, [start_index, end_index]. The indices are counted at the document level and are both inclusive.
- For bridging pairs, each pair contains two mentions: the first is the anaphor and the second is the antecedent.
Then use python evaluate.py config_name to start your evaluation.
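
Before running the evaluation, it can help to sanity-check your own .jsonlines data against the format described above. The following is a minimal sketch and not part of this repository: the helper names flatten, span_text and check_document are hypothetical, and the checks only cover what the format description states (document-level, inclusive indices and one speaker per token).

```python
import json


def flatten(sentences):
    """Concatenate tokenised sentences into one document-level token list."""
    return [tok for sent in sentences for tok in sent]


def span_text(tokens, span):
    """Return the text of a [start_index, end_index] span (both inclusive)."""
    start, end = span
    return " ".join(tokens[start:end + 1])


def check_document(doc):
    """Raise ValueError if a span is out of range or speakers misalign."""
    tokens = flatten(doc["sentences"])
    # Gather every mention from the coreference clusters and bridging pairs.
    spans = [m for cluster in doc.get("clusters", []) for m in cluster]
    spans += [m for pair in doc.get("bridging_pairs", []) for m in pair]
    for start, end in spans:
        if not 0 <= start <= end < len(tokens):
            raise ValueError("span [%d, %d] is out of range" % (start, end))
    if "speakers" in doc:  # the speakers field is optional
        if len(doc["speakers"]) != len(doc["sentences"]):
            raise ValueError("speakers and sentences differ in length")
        for sent, spk in zip(doc["sentences"], doc["speakers"]):
            if len(sent) != len(spk):
                raise ValueError("a sentence and its speakers differ in length")
    return tokens


# The example document from above, with a single bridging pair.
example = {
    "clusters": [[[0, 0], [5, 5]], [[2, 3], [7, 8]]],
    "bridging_pairs": [[[14, 15], [2, 3]]],
    "doc_key": "nw",
    "sentences": [["John", "has", "a", "car", "."],
                  ["He", "washed", "the", "car", "yesterday", "."],
                  ["How", "is", "the", "left", "wheel", "?"]],
}

tokens = check_document(example)
line = json.dumps(example)  # one such line per document in the .jsonlines file
for anaphor, antecedent in example["bridging_pairs"]:
    # prints: left wheel -> a car
    print("%s -> %s" % (span_text(tokens, anaphor), span_text(tokens, antecedent)))
```

Printing the resolved spans makes off-by-one errors in the indices easy to spot, since an exclusive end index would cut the last token off each mention.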

You will additionally need to create the character vocabulary by running python get_char_vocab.py train.jsonlines dev.jsonlines.
Then you can start training by running python train.py config_name.
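
For orientation, a character vocabulary of this kind is typically the set of distinct characters found in the tokens of the given .jsonlines files. The sketch below illustrates that idea under this assumption; it is not the code of get_char_vocab.py, and the function names collect_chars and write_vocab are hypothetical.

```python
import io
import json


def collect_chars(jsonlines_paths):
    """Return the sorted list of distinct characters over all tokens."""
    chars = set()
    for path in jsonlines_paths:
        with io.open(path, encoding="utf-8") as f:
            for line in f:
                if not line.strip():
                    continue  # tolerate blank lines
                doc = json.loads(line)
                for sentence in doc["sentences"]:
                    for token in sentence:
                        chars.update(token)  # adds each character of the token
    return sorted(chars)


def write_vocab(chars, out_path):
    """Write one character per line (the output layout is an assumption)."""
    with io.open(out_path, "w", encoding="utf-8") as f:
        for ch in chars:
            f.write(ch + u"\n")
```

With these helpers, write_vocab(collect_chars(["train.jsonlines", "dev.jsonlines"]), "char_vocab.txt") would produce the vocabulary file, where char_vocab.txt is a placeholder name; use the provided script for actual training so that the file name and layout match what the configuration expects.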