A rating-based sentiment dataset of IMDB movie reviews (WASSA 2014)

MIT License

SAR14: A rating-based sentiment dataset of movie reviews

The SAR14 dataset contains 234k IMDB movie reviews, each paired with its rating score on a 1-10 scale. Specifically, it consists of 167k reviews with positive scores (greater than or equal to 7) and 66k reviews with negative scores (less than or equal to 4). Details about the construction of the dataset, as well as results of sentiment polarity classification, are given in our paper:

@InProceedings{NguyenWASSA2014long,
  author    = {Dai Quoc Nguyen and Dat Quoc Nguyen and Thanh Vu and Son Bao Pham},
  title     = {Sentiment Classification on Polarity Reviews: An Empirical Study Using Rating-based Features},
  booktitle = {Proceedings of the 5th Workshop on Computational Approaches to Subjectivity, Sentiment and Social Media Analysis},
  year      = {2014},
  pages     = {128--135}
}

Please cite the paper whenever SAR14 is used to produce published results or incorporated into other software. SAR14 is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
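
The rating thresholds described above can be turned into polarity labels with a few lines of Python. The following is only a minimal sketch: the function name `polarity_from_rating` and the constants are hypothetical and not part of SAR14. Returning `None` for scores of 5 and 6 is an assumption, since the description above only covers scores of 7 or more and 4 or less.

```python
from typing import Optional

# Thresholds taken from the dataset description above.
POSITIVE_MIN = 7  # scores >= 7 count as positive
NEGATIVE_MAX = 4  # scores <= 4 count as negative


def polarity_from_rating(score: int) -> Optional[str]:
    """Map a 1-10 rating score to a polarity label (hypothetical helper)."""
    if not 1 <= score <= 10:
        raise ValueError(f"rating score must be in 1..10, got {score}")
    if score >= POSITIVE_MIN:
        return "positive"
    if score <= NEGATIVE_MAX:
        return "negative"
    # Assumption: scores of 5 and 6 fall outside both classes.
    return None
```

For example, `polarity_from_rating(8)` gives `"positive"` and `polarity_from_rating(3)` gives `"negative"`.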