Code for EMNLP 2021 paper "CLIFF: Contrastive Learning for Improving Faithfulness and Factuality in Abstractive Summarization"

Primary language: Python. License: Apache License 2.0 (Apache-2.0).

CLIFF


News

  • Code for unlikelihood training and in-batch negatives has been added. See train_xsum_batch_neg.sh and train_xsum_single_neg_ull.sh. The related Fairseq code is in unlikelihood_translation.py and contrastive_translation_batch_neg.py.
  • A cleaner implementation is available. The new implementation uses less system RAM and is compatible with the current version of Fairseq. Check here.
  • We find that newer versions of QuestEval produce much lower scores than the version (commit 0e94a74) used in our paper. If you are using a newer version, please do not compare your scores directly against the QuestEval results reported in the paper.
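Pinning a dependency to a specific commit, as with QuestEval commit 0e94a74 above, can be done with a plain git checkout. The helper below is a minimal sketch and is not part of this repository: checkout_commit is a hypothetical name, and the QuestEval repository URL in the usage comment is an assumption that should be verified before use.

```shell
# Hypothetical helper: clone a repository and check out one specific commit.
# Runs in a subshell (note the parentheses) so the caller's working
# directory is left unchanged.
checkout_commit() (
  repo_url="$1"   # repository to clone
  commit="$2"     # commit hash to pin (a short hash also works for checkout)
  dest="$3"       # target directory, must not exist yet
  git clone --quiet "$repo_url" "$dest" &&
    cd "$dest" &&
    git checkout --quiet "$commit"
)

# Usage (URL is an assumption, not taken from this README):
#   checkout_commit https://github.com/ThomasScialom/QuestEval 0e94a74 ./QuestEval
```

After the checkout, install the package from the cloned directory following that project's own instructions. The same helper should apply to the Fairseq commit named in the Training section.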

Data Construction

For data construction, see data_construction. The constructed datasets are also available on Google Drive.


Training

The following scripts require that your $DATA folder is organized the same as the data folder in Google Drive.
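A mismatched $DATA layout is easier to catch before a training run starts. The function below is a minimal sketch, not part of this repository: check_data_layout is a hypothetical name, and the path list covers only the paths that the commands in this README reference, not the full Google Drive layout.

```shell
# Hypothetical helper: verify that the paths used by the commands in this
# README exist under the given data root. Prints each missing path to stderr
# and returns non-zero if any is missing. The function body runs in a
# subshell, so its variables do not leak into the caller.
check_data_layout() (
  root="$1"
  status=0
  for rel in \
    xsum_synthetic/negative_syslowcon \
    xsum_synthetic/negative_swapent \
    cnndm_synthetic/negative_syslowcon \
    cnndm_synthetic/negative_swapent \
    xsum_raw/test.source \
    cnndm_raw/test.source
  do
    if [ ! -e "$root/$rel" ]; then
      echo "missing: $root/$rel" >&2
      status=1
    fi
  done
  exit "$status"
)
```

With DATA exported (for example, export DATA=/path/to/data), running check_data_layout "$DATA" reports any missing entries. TRAINED_MODELS needs to be exported in the same way before running the training scripts.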

BART

Our experiments with BART use Fairseq at commit 0db28cd. Newer versions might also work. Please download the pre-trained BART model here and set BART_PATH to the downloaded model:

export BART_PATH=/path/to/bart/model.pt

Single Negative Strategy

The following commands train models with negative samples constructed by SysLowCon and save the trained models in $TRAINED_MODELS/bart_xsum/syslowcon and $TRAINED_MODELS/bart_cnndm/syslowcon. To train with a different negative sample strategy, replace $DATA/xsum_synthetic/negative_syslowcon (or its CNN/DM counterpart) with the path to the corresponding negative samples.

# XSum
cd scripts/bart
CUDA_VISIBLE_DEVICES=0,1 ./train_xsum_single_neg.sh \
  $DATA/xsum_synthetic/negative_syslowcon $TRAINED_MODELS/bart_xsum/syslowcon

# CNN/DM
cd scripts/bart
CUDA_VISIBLE_DEVICES=0,1 ./train_cnndm_single_neg.sh \
  $DATA/cnndm_synthetic/negative_syslowcon $TRAINED_MODELS/bart_cnndm/syslowcon

Multiple Negative Strategies

The following commands train models with negative samples constructed by both SysLowCon and SwapEnt and save the trained models in $TRAINED_MODELS/bart_xsum/syslowcon_swapent and $TRAINED_MODELS/bart_cnndm/syslowcon_swapent.

# XSum
cd scripts/bart
CUDA_VISIBLE_DEVICES=0,1 ./train_xsum_mutli_neg.sh \
  "$DATA/xsum_synthetic/negative_syslowcon $DATA/xsum_synthetic/negative_swapent" \
  $TRAINED_MODELS/bart_xsum/syslowcon_swapent

# CNN/DM
cd scripts/bart
CUDA_VISIBLE_DEVICES=0,1 ./train_cnndm_multi_neg.sh \
  "$DATA/cnndm_synthetic/negative_syslowcon $DATA/cnndm_synthetic/negative_swapent" \
  $TRAINED_MODELS/bart_cnndm/syslowcon_swapent

Pegasus

Our experiments with Pegasus use Hugging Face Transformers 4.5.1. Newer versions might also work.

Single Negative Strategy

# XSum
cd scripts/pegasus
CUDA_VISIBLE_DEVICES=0,1 ./train_xsum_single_neg.sh \
  $DATA/xsum_synthetic/negative_syslowcon $TRAINED_MODELS/pegasus_xsum/syslowcon
  
# CNN/DM
cd scripts/pegasus
CUDA_VISIBLE_DEVICES=0,1 ./train_cnndm_single_neg.sh \
  $DATA/cnndm_synthetic/negative_syslowcon $TRAINED_MODELS/pegasus_cnndm/syslowcon
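The single-negative commands above all follow one naming pattern across models, datasets, and negative sample strategies. The dry-run helper below is a sketch of that pattern and is not part of this repository: single_neg_command is a hypothetical name, it only prints the command rather than running it, and it assumes that every strategy is stored as negative_<strategy> under $DATA/<dataset>_synthetic, which this README confirms only for syslowcon and swapent.

```shell
# Hypothetical dry-run helper: print the single-negative training command
# for a given model, dataset, and negative sample strategy.
#   $1: model     (bart or pegasus)
#   $2: dataset   (xsum or cnndm)
#   $3: strategy  (e.g. syslowcon or swapent)
# Requires DATA and TRAINED_MODELS to be set in the environment.
single_neg_command() {
  printf 'cd scripts/%s && CUDA_VISIBLE_DEVICES=0,1 ./train_%s_single_neg.sh %s %s\n' \
    "$1" "$2" \
    "$DATA/${2}_synthetic/negative_$3" \
    "$TRAINED_MODELS/${1}_$2/$3"
}
```

For example, single_neg_command bart xsum swapent prints the BART XSum command with the SwapEnt negatives and a matching output directory, which can be inspected before it is run.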

Decoding

The following examples show how to decode trained models. Model checkpoints are available in Google Drive.

BART

# XSum
cd scripts/bart
./decode_xsum.sh $TRAINED_MODELS/bart_xsum/syslowcon/checkpoint_last.pt /path/to/save/dir

# CNN/DM
cd scripts/bart
./decode_cnndm.sh $TRAINED_MODELS/bart_cnndm/syslowcon/checkpoint_last.pt /path/to/save/dir

Pegasus

# XSum
cd scripts/pegasus
python run_generation.py $DATA/xsum_raw/test.source $TRAINED_MODELS/pegasus_xsum/syslowcon /path/to/save/dir

# CNN/DM
cd scripts/pegasus
python run_generation.py $DATA/cnndm_raw/test.source $TRAINED_MODELS/pegasus_cnndm/syslowcon /path/to/save/dir