Code for our paper: "RefSum: Refactoring Neural Summarization", NAACL 2021.
We present a model, Refactor, which can be used either as a base system or a meta system for text summarization.
Requirements: python3

```shell
conda create --name env --file spec-file.txt
pip3 install -r requirements.txt
```
- `main.py` -> training and evaluation procedure
- `model.py` -> Refactor model
- `data_utils.py` -> dataloader
- `utils.py` -> utility functions
- `demo.py` -> off-the-shelf refactoring

You may specify the hyper-parameters in `main.py`.
```shell
# training
python main.py --cuda --gpuid [list of gpuid] -l
# training, warm-starting from a saved checkpoint
python main.py --cuda --gpuid [list of gpuid] -l --model_pt [model path]
# evaluation
python main.py --cuda --gpuid [single gpu] -e --model_pt [model path] --model_name [model name]
```
You may use our model with your own data by running

```shell
python demo.py DATA_PATH MODEL_PATH RESULT_PATH
```

`DATA_PATH` is the path of your data, which should be a file in which each line is a JSON object of the form `{"article": str, "summary": str, "candidates": [str]}`. `RESULT_PATH` is the path of the output file, in which each line is a candidate summary.
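As a minimal sketch of preparing such an input file (the file name and example texts below are hypothetical, not from the repo), each line of `DATA_PATH` must parse as a standalone JSON object with the three required keys:

```python
import json

# Hypothetical example records; "candidates" holds the system summaries to be refactored.
records = [
    {
        "article": "The quick brown fox jumps over the lazy dog.",
        "summary": "A fox jumps over a dog.",
        "candidates": ["A fox jumps.", "The fox jumps over the dog."],
    },
]

# Write one JSON object per line, as demo.py expects for DATA_PATH.
with open("demo_input.jsonl", "w") as f:
    for rec in records:
        f.write(json.dumps(rec) + "\n")

# Sanity-check: every line parses and carries the required keys.
with open("demo_input.jsonl") as f:
    for line in f:
        rec = json.loads(line)
        assert {"article", "summary", "candidates"} <= rec.keys()
```

The resulting `demo_input.jsonl` can then be passed to `demo.py` as `DATA_PATH`.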
We use four datasets for our experiments.
- CNN/DailyMail -> https://github.com/abisee/cnn-dailymail
- XSum -> https://github.com/EdinburghNLP/XSum
- PubMed -> https://github.com/armancohan/long-summarization
- WikiHow -> https://github.com/mahnazkoupaee/WikiHow-Dataset
You can find the processed data for all of our experiments here. After downloading, you should put the data in the `./data` directory.
| Dataset | Experiment | Link |
|---|---|---|
| CNNDM | Pre-train | Download |
| CNNDM | BART Reranking | Download |
| CNNDM | GSum Reranking | Download |
| CNNDM | Two-system Combination (System-level) | Download |
| CNNDM | Two-system Combination (Sentence-level) | Download |
| CNNDM | Three-system Combination (System-level) | Download |
| XSum | Pre-train | Download |
| XSum | PEGASUS Reranking | Download |
| PubMed | Pre-train | Download |
| PubMed | BART Reranking | Download |
| WikiHow | Pre-train | Download |
| WikiHow | BART Reranking | Download |
| CNNDM (BART Reranking) | ROUGE-1 | ROUGE-2 | ROUGE-L |
|---|---|---|---|
| BART | 44.26 | 21.12 | 41.16 |
| Refactor | 45.15 | 21.70 | 42.00 |
| CNNDM (GSum Reranking) | ROUGE-1 | ROUGE-2 | ROUGE-L |
|---|---|---|---|
| GSum | 45.93 | 22.30 | 42.68 |
| Refactor | 46.18 | 22.36 | 42.91 |
| CNNDM (Two-system Combination) | ROUGE-1 | ROUGE-2 | ROUGE-L |
|---|---|---|---|
| BART | 44.26 | 21.12 | 41.16 |
| pre-trained Refactor | 44.13 | 20.51 | 40.29 |
| Summary-Level Combination | 45.04 | 21.61 | 41.72 |
| Sentence-Level Combination | 44.93 | 21.48 | 41.42 |
| CNNDM (Three-system Combination) | ROUGE-1 | ROUGE-2 | ROUGE-L |
|---|---|---|---|
| BART | 44.26 | 21.12 | 41.16 |
| pre-trained Refactor | 44.13 | 20.51 | 40.29 |
| GSum | 45.93 | 22.30 | 42.68 |
| Summary-Level Combination | 46.12 | 22.46 | 42.92 |
| XSum (PEGASUS Reranking) | ROUGE-1 | ROUGE-2 | ROUGE-L |
|---|---|---|---|
| PEGASUS | 47.12 | 24.46 | 39.04 |
| Refactor | 47.45 | 24.55 | 39.41 |
| PubMed (BART Reranking) | ROUGE-1 | ROUGE-2 | ROUGE-L |
|---|---|---|---|
| BART | 43.42 | 15.32 | 39.21 |
| Refactor | 43.72 | 15.41 | 39.51 |
| WikiHow (BART Reranking) | ROUGE-1 | ROUGE-2 | ROUGE-L |
|---|---|---|---|
| BART | 41.98 | 18.09 | 40.53 |
| Refactor | 42.12 | 18.13 | 40.66 |