SummarizationEval

An experiment to evaluate text summarization with open-source LLMs


Summarization with LLMs: inference performance and evaluation insights

In this repo, we walk you through an experiment for a common use case of Large Language Models (LLMs): text summarization.

We compare two strong open-source models: Mixtral 8x7B and Llama 2 70B.
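As a taste of what the notebooks cover, here is a minimal sketch of generating a summary with one of these models through the Hugging Face `transformers` library. The model id, prompt format, and generation parameters are illustrative assumptions rather than the exact setup used in the notebooks, and hosting the full model requires substantial GPU memory.

```python
# Minimal summarization sketch (illustrative; not the notebooks' exact setup).
from transformers import pipeline

generator = pipeline(
    "text-generation",
    model="mistralai/Mixtral-8x7B-Instruct-v0.1",  # assumed model id
    device_map="auto",   # shard across available NVIDIA GPUs
    torch_dtype="auto",
)

article = "..."  # the document to summarize
prompt = f"Summarize the following text in a few sentences:\n\n{article}\n\nSummary:"

output = generator(prompt, max_new_tokens=200, do_sample=False)
print(output[0]["generated_text"])
```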

We consider two comparison axes:

  • inference performance, when run on NVIDIA GPUs for hardware acceleration (a throughput sketch follows this list), and
  • task performance, evaluating the generated summaries with a suitable NLP evaluation metric (an evaluation sketch follows this list).
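For the first axis, inference performance can be boiled down to generated tokens per second. The sketch below shows one simple way to measure it; the model id, prompt, and timing approach are assumptions, and the notebooks' actual benchmarking setup (batching, serving stack, etc.) may differ.

```python
# Rough tokens/second measurement for a single generation call (illustrative).
import time

from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "mistralai/Mixtral-8x7B-Instruct-v0.1"  # assumed model id
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, device_map="auto", torch_dtype="auto"
)

inputs = tokenizer("Summarize: ...", return_tensors="pt").to(model.device)

start = time.perf_counter()
output = model.generate(**inputs, max_new_tokens=200)
elapsed = time.perf_counter() - start

# Count only the newly generated tokens, excluding the prompt.
new_tokens = output.shape[-1] - inputs["input_ids"].shape[-1]
print(f"{new_tokens / elapsed:.1f} tokens/s")
```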
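For the second axis, a common choice of metric for summarization is ROUGE, shown here via the Hugging Face `evaluate` library as a stand-in for whichever metric the notebooks actually use; the predictions and references are placeholder strings.

```python
# Scoring generated summaries against reference summaries with ROUGE
# (requires `pip install evaluate rouge_score`; metric choice is an assumption).
import evaluate

rouge = evaluate.load("rouge")

predictions = ["The model's generated summary of the article."]
references = ["A reference summary of the same article."]

scores = rouge.compute(predictions=predictions, references=references)
print(scores)  # e.g. {'rouge1': ..., 'rouge2': ..., 'rougeL': ..., 'rougeLsum': ...}
```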

You can follow the notebooks in order for a walk-through of the experiments.