/longer_text_summary

Summarize longer Estonian texts

Primary language: Jupyter Notebook · License: Apache-2.0

(Longer) Estonian text summarization

Project for testing different transformer-based models for summarization of longer Estonian texts (up to 2048 input tokens).

Models tested:

  • mBART
  • mT5
  • mLongT5

Methods used to make the models smaller and/or accept a longer context window:

  • reducing the model's embedding layer: keeping only the tokens that appear in the training data
  • using LSG (Local-Sparse-Global) attention
  • quantization (in some cases)
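The embedding-layer reduction mentioned above can be sketched as follows. This is an illustrative toy sketch, not code from the repository: the function name, the toy vocabulary, and the matrix values are all made up for the example, and in practice the reduced matrix would be written back into the model (and the tokenizer rebuilt), e.g. via the Hugging Face Transformers API.

```python
import numpy as np

def prune_embeddings(embeddings, old_vocab, kept_tokens):
    """Keep only the embedding rows for tokens seen in the training data.

    embeddings:  (vocab_size, dim) matrix of the original model
    old_vocab:   dict mapping token -> original row index
    kept_tokens: tokens that actually occur in the training corpus
    Returns the reduced matrix and the new token -> index mapping.
    """
    keep = set(kept_tokens)
    # Preserve the original vocabulary order while dropping unused tokens.
    kept = [tok for tok in old_vocab if tok in keep]
    new_vocab = {tok: i for i, tok in enumerate(kept)}
    rows = [old_vocab[tok] for tok in kept]
    return embeddings[rows], new_vocab

# Toy example: a 6-token vocabulary with 2-dimensional embeddings.
emb = np.arange(12, dtype=np.float32).reshape(6, 2)
vocab = {"<pad>": 0, "<s>": 1, "tere": 2, "maailm": 3, "foo": 4, "bar": 5}
small_emb, small_vocab = prune_embeddings(
    emb, vocab, {"<pad>", "<s>", "tere", "maailm"}
)
# small_emb has 4 rows; each kept token's vector is unchanged.
```

Dropping unused rows from the (typically very large) multilingual embedding matrix is where most of the size reduction comes from, since models like mBART and mT5 devote a large fraction of their parameters to the embedding table.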

Datasets used for the experiments:

The best model, trained on summaries of Estonian Parliament stenograms, is available here.

Article explaining what was done: https://ristohinno.medium.com/estonian-longer-text-summarization-8ddbf7f7cd45