SummarizationEval

An experiment to evaluate text summarization with open-source LLMs


Summarization with LLMs: inference performance and evaluation insights

In this repo, we walk you through an experiment for a common use case of Large Language Models (LLMs): text summarization.

We compare two strong open-source models: Mixtral 8x7B and Llama 2 70B.
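As a taste of what the notebooks cover, here is a minimal sketch of generating a summary with one of these models through the Hugging Face `transformers` library. The model id, prompt format, and generation parameters are illustrative assumptions rather than the exact setup used in the notebooks, and hosting the full model requires substantial GPU memory.

```python
# Minimal summarization sketch (illustrative; not the notebooks' exact setup).
from transformers import pipeline

generator = pipeline(
    "text-generation",
    model="mistralai/Mixtral-8x7B-Instruct-v0.1",  # assumed model id
    device_map="auto",   # shard across available NVIDIA GPUs
    torch_dtype="auto",
)

article = "..."  # the document to summarize
prompt = f"Summarize the following text in a few sentences:\n\n{article}\n\nSummary:"

output = generator(prompt, max_new_tokens=200, do_sample=False)
print(output[0]["generated_text"])
```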

We consider two comparison axes:

  • inference performance, when run on NVIDIA GPUs for hardware acceleration (a throughput sketch follows this list), and
  • task performance, evaluating the generated summaries with a suitable NLP evaluation metric (an evaluation sketch follows this list).
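For the first axis, inference performance can be boiled down to generated tokens per second. The sketch below shows one simple way to measure it; the model id, prompt, and timing approach are assumptions, and the notebooks' actual benchmarking setup (batching, serving stack, etc.) may differ.

```python
# Rough tokens/second measurement for a single generation call (illustrative).
import time

from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "mistralai/Mixtral-8x7B-Instruct-v0.1"  # assumed model id
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, device_map="auto", torch_dtype="auto"
)

inputs = tokenizer("Summarize: ...", return_tensors="pt").to(model.device)

start = time.perf_counter()
output = model.generate(**inputs, max_new_tokens=200)
elapsed = time.perf_counter() - start

# Count only the newly generated tokens, excluding the prompt.
new_tokens = output.shape[-1] - inputs["input_ids"].shape[-1]
print(f"{new_tokens / elapsed:.1f} tokens/s")
```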
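For the second axis, a common choice of metric for summarization is ROUGE, shown here via the Hugging Face `evaluate` library as a stand-in for whichever metric the notebooks actually use; the predictions and references are placeholder strings.

```python
# Scoring generated summaries against reference summaries with ROUGE
# (requires `pip install evaluate rouge_score`; metric choice is an assumption).
import evaluate

rouge = evaluate.load("rouge")

predictions = ["The model's generated summary of the article."]
references = ["A reference summary of the same article."]

scores = rouge.compute(predictions=predictions, references=references)
print(scores)  # e.g. {'rouge1': ..., 'rouge2': ..., 'rougeL': ..., 'rougeLsum': ...}
```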

You can follow the notebooks in order for a walk-through of the experiments.