/sarcasm_detection

Ceng790 Term Project

Primary language: Scala

Sarcasm Detection using Spark

Mert Tunç, Egemen Berk Galatalı


The dataset consists of 1.3 million Reddit comments, each labeled as sarcastic or not sarcastic. No columns other than the comment text and the label are used. Several preprocessing methods, feature extraction techniques, and ML models are combined to find the best-performing configuration. The code is written in Scala.
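To illustrate how one such combination could be wired together, here is a minimal sketch using Spark ML. It is not the project's actual pipeline: the column names `comment` and `label`, the object name `SarcasmPipelineSketch`, the specific stages (tokenizer, stop word removal, TF-IDF, logistic regression), and all parameter values are assumptions chosen for illustration.

```scala
import org.apache.spark.ml.{Pipeline, PipelineModel}
import org.apache.spark.ml.classification.LogisticRegression
import org.apache.spark.ml.evaluation.MulticlassClassificationEvaluator
import org.apache.spark.ml.feature.{HashingTF, IDF, StopWordsRemover, Tokenizer}
import org.apache.spark.sql.DataFrame

// Hypothetical helper object; names and parameters are illustrative assumptions.
object SarcasmPipelineSketch {

  // Builds one possible preprocessing + feature extraction + model combination.
  // Assumes the input DataFrame has a string column "comment" and a numeric
  // column "label" (1.0 = sarcastic, 0.0 = not sarcastic).
  def buildPipeline(): Pipeline = {
    // Preprocessing: lowercase and split the comment on whitespace.
    val tokenizer = new Tokenizer()
      .setInputCol("comment")
      .setOutputCol("tokens")

    // Preprocessing: drop common English stop words.
    val remover = new StopWordsRemover()
      .setInputCol("tokens")
      .setOutputCol("filtered")

    // Feature extraction: hashed term frequencies, then IDF weighting.
    val hashingTF = new HashingTF()
      .setInputCol("filtered")
      .setOutputCol("rawFeatures")
      .setNumFeatures(1 << 16)

    val idf = new IDF()
      .setInputCol("rawFeatures")
      .setOutputCol("features")

    // Model: binary logistic regression on the TF-IDF vectors.
    val lr = new LogisticRegression()
      .setLabelCol("label")
      .setFeaturesCol("features")
      .setMaxIter(20)

    new Pipeline().setStages(Array(tokenizer, remover, hashingTF, idf, lr))
  }

  // Fraction of rows in `data` whose predicted label matches the true label.
  def accuracy(model: PipelineModel, data: DataFrame): Double =
    new MulticlassClassificationEvaluator()
      .setLabelCol("label")
      .setPredictionCol("prediction")
      .setMetricName("accuracy")
      .evaluate(model.transform(data))
}
```

With a DataFrame `comments` in that shape, a typical use would be to split it with `comments.randomSplit(Array(0.8, 0.2))`, call `SarcasmPipelineSketch.buildPipeline().fit(train)`, and pass the fitted model and the held-out split to `accuracy`. Swapping a stage (for example a different classifier or feature extractor) only changes the array passed to `setStages`, which is what makes trying several combinations cheap.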

Currently, the best combination achieves 77% accuracy.