
Code for "Understanding Neural Abstractive Summarization Models via Uncertainty" (EMNLP20)


Text Sum Uncertainty

Code for "Understanding Neural Abstractive Summarization Models via Uncertainty" (EMNLP20, short)

An arXiv preprint is available here.

Authors: Jiacheng Xu, Shrey Desai, and Greg Durrett (TAUR Lab, UT Austin)

Contact: jcxu at utexas.edu


In this work,

  • We analyze summarization decoders by studying the entropy, or uncertainty, of the model's token-level predictions.
  • Models examined: PEGASUS (paper, model) and BART (paper, model)
  • Datasets covered: CNN/DM and XSum
  • Quick start with models directly from huggingface.co/transformers
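
As a rough illustration of the quantity studied here, the sketch below computes the entropy of a next-token distribution at each decoding step. It is a minimal pure-Python example and not the code used in this repository; the helper names softmax, entropy, and token_entropies are hypothetical, and the logits are assumed to be given as plain lists of floats (one list per decoding step).

```python
import math
from typing import List, Sequence


def softmax(logits: Sequence[float]) -> List[float]:
    """Convert raw scores into a probability distribution (numerically stable)."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def entropy(probs: Sequence[float]) -> float:
    """Shannon entropy in nats; zero-probability entries contribute nothing."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def token_entropies(step_logits: Sequence[Sequence[float]]) -> List[float]:
    """Entropy of the predicted distribution at every decoding step."""
    return [entropy(softmax(logits)) for logits in step_logits]
```

A uniform distribution over the vocabulary gives the maximum entropy (the log of the vocabulary size), while a distribution concentrated on a single token gives an entropy close to zero.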

With the help of the methods we developed, we further investigate

  • The correlation between prediction entropy and model behaviors such as COPY and GEN (Sec. 3)
  • How sentence position relates to prediction entropy (Sec. 3)
  • Model behavior in different syntactic environments (Sec. 4)
  • Coarse properties of attention and how they correlate with the model's predictions (Sec. 5)


Hyper Parameters

In util.py, the function parse_arg defines all of the hyperparameters used in this project.

| Param | Usage |
| --- | --- |
| prob_meta_dir | Directory where the model outputs are saved. |
| max_len | Maximum decoding length. Set to 30 for XSum and 80 for CNN/DM. |
| device | Device name for PyTorch. |
| nuc_prob | Nucleus sampling probability threshold. Default: 0.95. |
| trunc_prob | Truncate the probability distribution (used by default in all of our experiments). |
| full_prob | Use the original probability distribution. |
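
To make the difference between trunc_prob and full_prob concrete, here is a minimal sketch of nucleus-style truncation followed by an entropy computation. It assumes that truncation keeps the smallest set of highest-probability tokens whose cumulative mass reaches nuc_prob and then renormalizes; the repository's actual implementation may differ in its details. The function names truncate_nucleus and entropy are hypothetical.

```python
import math
from typing import List, Sequence


def truncate_nucleus(probs: Sequence[float], nuc_prob: float = 0.95) -> List[float]:
    """Keep the top tokens whose cumulative probability reaches nuc_prob, then renormalize.

    Assumption: this mirrors standard nucleus (top-p) truncation, which may not
    match the repository's code exactly.
    """
    # Token indices sorted from most to least probable.
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    kept = []
    cumulative = 0.0
    for i in order:
        kept.append(i)
        cumulative += probs[i]
        if cumulative >= nuc_prob:
            break
    # Tokens outside the nucleus get zero mass; the rest are rescaled to sum to 1.
    truncated = [0.0] * len(probs)
    for i in kept:
        truncated[i] = probs[i] / cumulative
    return truncated


def entropy(probs: Sequence[float]) -> float:
    """Shannon entropy in nats; zero-probability entries contribute nothing."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)
```

Under this assumption, truncation removes the long tail of low-probability tokens, so the entropy of the truncated distribution is typically lower than that of the original distribution.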

Existing Configuration

To run the model, run python run_model_pegasus.py with one of the following parameter configurations.

| Config Name | Parameters |
| --- | --- |
| run_model_pegasus_cnndm | --full_data |
| run_model_pegasus_xsum | --full_data --model_name google/pegasus-xsum --data_name xsum |
| run_model_bart_cnndm | --full_data --model_name facebook/bart-large-cnn |
| run_model_bart_xsum | --full_data --model_name facebook/bart-large-xsum --data_name xsum |
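
If it is convenient to select a configuration by name, the table above can be mirrored in a small helper such as the sketch below. The CONFIGS dictionary and the build_command function are hypothetical and not part of the repository; only the script name and the flags are taken from the table.

```python
from typing import Dict, List

# Flags copied from the configuration table; the dictionary itself is illustrative.
CONFIGS: Dict[str, List[str]] = {
    "run_model_pegasus_cnndm": ["--full_data"],
    "run_model_pegasus_xsum": [
        "--full_data", "--model_name", "google/pegasus-xsum", "--data_name", "xsum",
    ],
    "run_model_bart_cnndm": [
        "--full_data", "--model_name", "facebook/bart-large-cnn",
    ],
    "run_model_bart_xsum": [
        "--full_data", "--model_name", "facebook/bart-large-xsum", "--data_name", "xsum",
    ],
}


def build_command(config_name: str) -> List[str]:
    """Return the argument list for one of the named configurations."""
    if config_name not in CONFIGS:
        raise KeyError(f"Unknown configuration: {config_name}")
    return ["python", "run_model_pegasus.py"] + CONFIGS[config_name]
```

The returned list can be joined with spaces for display, or passed to subprocess.run from the repository root to launch the run.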

The class SumGen in run_model_pegasus.py implements the core decoding logic.