Long Range Transformers

Several transformer variants are claimed to handle long contexts, but most of them have been tested only on synthetic tasks, such as LRA, or on language modeling, leaving their ability to comprehend long texts largely unexplored. The goal of this project is to verify the effectiveness of long-range transformers on more practical NLP tasks: do they really work on NLP tasks that involve long texts? If not, why not, and how can we make them work?

Tasks

Coreference
NLI
Abstractive QA
Extractive QA
Summarization

Datasets

OntoNotes for coref
DocNLI for NLI
Qasper for abstractive QA
TriviaQA for extractive QA
SummFD and CNN for summarization

Model

Coarse2fine model for coref (located in this folder)
A baseline model for DocNLI (located in this folder)
A baseline model for abstractive QA (located in this folder)
A baseline model for extractive QA (located in this folder)
A baseline model for summarization (located in this folder)
