
Abstractive Text Summarization using Encoder-Decoder Architecture

We propose various architectural changes to a baseline Encoder-Decoder-Attention model.

Task

Given a document, the model generates an abstractive summary while retaining the context of the document.

Dataset

We use the Amazon Fine Food Reviews dataset (data_link), which contains food reviews and their corresponding summaries for around 500,000 data points.
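
A minimal loading sketch in pandas; the file name Reviews.csv and the Text/Summary column names follow the Kaggle release of the dataset and are assumptions here:

    import pandas as pd

    # Assumes the Kaggle CSV "Reviews.csv"; "Text" holds the review and
    # "Summary" its short summary (both names are from the Kaggle schema).
    df = pd.read_csv("Reviews.csv", usecols=["Text", "Summary"]).dropna()
    print(len(df))  # roughly 500,000 review/summary pairs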

Baseline

Encoder-Decoder with Bahdanau Attention: we adapted the code for the baseline model from https://bastings.github.io/annotated_encoder_decoder/ (see the Baseline folder). A minimal sketch of the attention mechanism follows.
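
For orientation, a minimal sketch of Bahdanau (additive) attention in PyTorch; the class name, layer names, and tensor shapes are illustrative assumptions, not the exact code in the Baseline folder:

    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    class BahdanauAttention(nn.Module):
        """Additive attention: score(s, h_j) = v^T tanh(W_q s + W_k h_j)."""

        def __init__(self, hidden_size):
            super().__init__()
            self.query_layer = nn.Linear(hidden_size, hidden_size, bias=False)
            self.key_layer = nn.Linear(hidden_size, hidden_size, bias=False)
            self.energy_layer = nn.Linear(hidden_size, 1, bias=False)

        def forward(self, query, encoder_outputs, mask=None):
            # query: [batch, hidden] (current decoder state)
            # encoder_outputs: [batch, src_len, hidden]
            # mask: [batch, src_len], 1 for real tokens, 0 for padding
            q = self.query_layer(query).unsqueeze(1)    # [batch, 1, hidden]
            k = self.key_layer(encoder_outputs)         # [batch, src_len, hidden]
            scores = self.energy_layer(torch.tanh(q + k)).squeeze(-1)
            if mask is not None:
                scores = scores.masked_fill(mask == 0, float("-inf"))
            alphas = F.softmax(scores, dim=-1)          # attention weights
            context = torch.bmm(alphas.unsqueeze(1), encoder_outputs).squeeze(1)
            return context, alphas                      # [batch, hidden], [batch, src_len]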

Novelty

  • Self-Attention: To capture dependencies between the encoder states, we add self-attention on top of the encoder outputs.
  • Attention on Decoder: To generate contextually better sequences, we apply Bahdanau attention over the already generated tokens while predicting the next token.
  • Multihead Attention: Taking inspiration from the Transformer, which uses multi-head attention, we apply multiple Bahdanau attention heads over the encoder states.
  • Variation of Multihead Attention: We introduce a variation of multihead attention that combines one Bahdanau attention head with one Luong attention head.
  • Metric based on Cosine Similarity: Summarization is subjective, and the BLEU score, which relies on exact token matches, does not do it justice. We introduce an equivalent metric that instead computes cosine similarity between the embeddings of output and target tokens (see the sketch after this list).
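
A minimal sketch of the idea behind the metric: score each generated token against its target counterpart by cosine similarity in embedding space and average over the sequence. The position-wise alignment and mean pooling used here are simplifying assumptions; the report defines the exact scheme.

    import torch.nn.functional as F

    def embedding_similarity(output_ids, target_ids, embedding):
        # output_ids / target_ids: 1-D LongTensors of token ids.
        # embedding: the model's nn.Embedding layer.
        n = min(output_ids.size(0), target_ids.size(0))
        out_vecs = embedding(output_ids[:n])    # [n, emb_dim]
        tgt_vecs = embedding(target_ids[:n])    # [n, emb_dim]
        sims = F.cosine_similarity(out_vecs, tgt_vecs, dim=-1)  # [n]
        return sims.mean().item()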

Please refer to the report and presentation for more details.

Structure of code

  • Each folder other than Baseline contains one of the variant architectures described above.

  • Training.ipynb contains the training code common to all the variants.

  • utils.py contains helper functions used by all the variants.

  • Please copy Training.ipynb and utils.py into the folder of the intended variant before running the code (a copy sketch follows this list).

  • Please download the data from data_link and move it into the folder of the intended variant.
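
A minimal sketch of the copy step in Python; the folder name Self_Attention is hypothetical, so substitute the folder of the variant you actually want to run:

    import shutil

    # "Self_Attention" is a placeholder for the chosen variant folder.
    for f in ("Training.ipynb", "utils.py"):
        shutil.copy(f, "Self_Attention/")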

Tools

  • Python Version: 3.9.5
  • Torch Version: 1.8.1+cu111
  • Torchtext Version: 0.9.1
  • Numpy Version: 1.18.5
  • Pandas Version: 1.2.4
  • BLEU Score Version: 0.3
  • ROUGE Scores Version: 1.0.0

Brief Results

Model                 BLEU Score   Our Metric
BASELINE              22.86        0.482
MULTIHEAD ATTENTION   25.41        0.516

Sample Results

The following output sequences were generated for an input (109 words) consisting of a customer review of coffee.

Target: "nice stuff"

Baseline output: "dreamfields addition for the morning"

Self-Attention output: "my little break to brew"