JafferWilson/Process-Data-of-CNN-DailyMail

Summary section

suchanun opened this issue · 2 comments

Hi, for the tokenized CNN stories dataset, when I print out the text of the .story files, each sentence with "@ highlight" before it is part of the summary, right? Also, is the summary not extracted from the text but somehow human- or machine-generated?

Thank you so much for any reply!

Obviously, it is machine-generated. This repository was created to support the repository https://github.com/abisee/cnn-dailymail, because many people were unable to work with it due to version issues. I have therefore tokenized the stories and kept them here so that people can use them as they are.
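
For anyone who wants to check the layout themselves, here is a minimal sketch of splitting one .story file into article lines and highlight lines. It assumes each summary sentence follows a line that contains only the marker (`@highlight`, or `@ highlight` with a space as written in the question), and that everything after the first marker belongs to the summary. The function name `split_story` is made up for this example and is not part of either repository.

```python
import re

# Assumption: a marker line is "@highlight", optionally with whitespace after "@".
MARKER = re.compile(r"^@\s*highlight$")


def split_story(text):
    """Split the contents of one .story file into (article_lines, highlights)."""
    article_lines, highlights = [], []
    seen_marker = False
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line:
            continue  # skip blank lines between paragraphs
        if MARKER.match(line):
            seen_marker = True  # the following non-empty lines are summary text
        elif seen_marker:
            highlights.append(line)
        else:
            article_lines.append(line)
    return article_lines, highlights
```

Usage would look something like `article, summary = split_story(open(path, encoding="utf-8").read())`, where `path` points at a single .story file.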

OK, thank you!