
Dataset for abstractive summarization of long multimodal presentations

TIB dataset for abstractive summarization of long multimodal videoconference records

TIB is an English-language dataset for abstractive summarization of long multimodal presentations, introduced in the paper "TIB: A Dataset for Abstractive Summarization of Long Multimodal Videoconference Records", published at CBMI 2023.

It is a collection of 9,103 videoconference records extracted from the archive of the German National Library of Science and Technology (TIB). Each record comes with its metadata, an abstract, and automatically processed transcripts and key frames.

It is hosted on the Hugging Face dataset hub as the repository gigant/tib and can be loaded with the datasets library.
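As a minimal sketch of what loading could look like: the snippet below uses `load_dataset` from the datasets library with the repository name gigant/tib. The helper names `load_tib` and `describe_splits` are hypothetical, introduced only for illustration. Split and column names are not listed in this README, so the sketch discovers them at run time instead of assuming them. Streaming is used by default here as an assumption, to avoid downloading the full dataset up front.

```python
REPO_ID = "gigant/tib"  # repository name on the Hugging Face hub


def load_tib(streaming: bool = True):
    """Load all available splits of the TIB dataset (hypothetical helper)."""
    # Imported lazily so the module can be imported without the dependency.
    from datasets import load_dataset

    return load_dataset(REPO_ID, streaming=streaming)


def describe_splits(dataset_dict):
    """Map each split name to the sorted field names of its first record.

    Works on any mapping from split name to an iterable of dict-like records.
    """
    summary = {}
    for split_name, split in dataset_dict.items():
        first_record = next(iter(split))
        summary[split_name] = sorted(first_record.keys())
    return summary
```

Calling `describe_splits(load_tib())` requires network access and returns the field names of each split, which is a quick way to check what the records contain before writing any processing code.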

Relevant links: