/longer_text_summary

Summarize longer Estonian texts

Primary language: Jupyter Notebook · License: Apache-2.0

(Longer) Estonian text summarization

Project for testing different transformer-based models for summarization of longer Estonian texts (up to 2048 input tokens).

Models tested:

  • mBART
  • mT5
  • mLongT5

Methods used to make the models smaller and/or accept a longer context window:

  • reducing the model's embedding layer: keeping only the tokens that appear in the training data
  • using LSG (Local-Sparse-Global) attention
  • quantization (in some cases)
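The embedding-layer reduction mentioned above can be sketched as follows. This is an illustrative toy sketch, not code from the repository: the function name, the toy vocabulary, and the matrix values are all made up for the example, and in practice the reduced matrix would be written back into the model (and the tokenizer rebuilt), e.g. via the Hugging Face Transformers API.

```python
import numpy as np

def prune_embeddings(embeddings, old_vocab, kept_tokens):
    """Keep only the embedding rows for tokens seen in the training data.

    embeddings:  (vocab_size, dim) matrix of the original model
    old_vocab:   dict mapping token -> original row index
    kept_tokens: tokens that actually occur in the training corpus
    Returns the reduced matrix and the new token -> index mapping.
    """
    keep = set(kept_tokens)
    # Preserve the original vocabulary order while dropping unused tokens.
    kept = [tok for tok in old_vocab if tok in keep]
    new_vocab = {tok: i for i, tok in enumerate(kept)}
    rows = [old_vocab[tok] for tok in kept]
    return embeddings[rows], new_vocab

# Toy example: a 6-token vocabulary with 2-dimensional embeddings.
emb = np.arange(12, dtype=np.float32).reshape(6, 2)
vocab = {"<pad>": 0, "<s>": 1, "tere": 2, "maailm": 3, "foo": 4, "bar": 5}
small_emb, small_vocab = prune_embeddings(
    emb, vocab, {"<pad>", "<s>", "tere", "maailm"}
)
# small_emb has 4 rows; each kept token's vector is unchanged.
```

Dropping unused rows from the (typically very large) multilingual embedding matrix is where most of the size reduction comes from, since models like mBART and mT5 devote a large fraction of their parameters to the embedding table.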

Datasets used for the experiments:

The best model, trained on summaries of Estonian Parliament stenograms, is available here.

Article explaining what was done: https://ristohinno.medium.com/estonian-longer-text-summarization-8ddbf7f7cd45