
Short story summarization evaluation paper.


Evaluating LLM Short Story Summarization with Writers

Code and data for the paper "Reading Subtext: Evaluating Large Language Models on Short Story Summarization with Writers".

Annotations of faithfulness errors, along with the errors themselves, can be found under 'error_annotations'.

Scripts to have models generate summaries, score summaries, and label faithfulness errors are in 'model_scripts'.
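For orientation, here is a minimal sketch of what a summary-generation script might look like, assuming an OpenAI-style chat API. The model name, prompt, and client setup are illustrative assumptions, not necessarily what the scripts in 'model_scripts' actually use:

```python
# Minimal sketch of generating a story summary with an LLM.
# The model name and prompt are illustrative assumptions, not
# necessarily what model_scripts uses.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment


def summarize_story(story_text: str, model: str = "gpt-4o") -> str:
    """Ask a chat model for a summary of a single short story."""
    response = client.chat.completions.create(
        model=model,
        messages=[
            {"role": "system", "content": "You summarize short stories."},
            {"role": "user", "content": f"Summarize this story:\n\n{story_text}"},
        ],
    )
    return response.choices[0].message.content
```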

The interface for writers to evaluate summaries is in 'streamlit_interface'.
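As a rough illustration of such an interface, a Streamlit form for rating a single summary might look like the sketch below. The widget labels, score scale, and submission handling are assumptions, not the repo's actual UI:

```python
# Illustrative sketch of a Streamlit rating form, in the spirit of the
# writer-evaluation interface. Labels and the score scale are assumptions.
import streamlit as st

st.title("Rate this summary")

summary = "..."  # the real app would load a story/model summary pair here
st.write(summary)

score = st.slider("Overall quality", min_value=1, max_value=4, value=2)
comments = st.text_area("Feedback on the summary")

if st.button("Submit"):
    # the real app would persist the rating, e.g. by appending to a TSV
    st.success(f"Recorded score {score} with {len(comments)} characters of feedback")
```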

The writer-assigned scores and feedback are in 'writer_ratings_comments.tsv'.
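To inspect those ratings, the TSV can be loaded with pandas. This sketch only assumes the file is tab-separated with a header row; the actual column names are defined by the file itself:

```python
# Quick look at the writer ratings; assumes only a tab-separated file
# with a header row, since the column names come from the TSV itself.
import pandas as pd

ratings = pd.read_csv("writer_ratings_comments.tsv", sep="\t")
print(ratings.columns.tolist())  # inspect the available fields first
print(ratings.head())
```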