/reddit_nlp_classification

Example of using NLP to classify comments into sub-Reddits

Primary Language: Jupyter Notebook

NLP comment classification on Reddit

Shawn Mitchell, MBA, MSc

Executive Summary

Using NLP techniques, we can build a highly accurate classification model that automatically assigns a Reddit comment to the sub-Reddit it came from. The comments it misclassifies are usually ones that even a human familiar with both sub-Reddits could not classify when looking only at the comment text, without the conversation it was part of.

The model works with any two sub-Reddits; the defaults used here are the sub-Reddits for the two games Stellaris and Dwarf Fortress.

Data Science Problem

Are the comments in two given sub-Reddits different enough for a machine learning model to sort them into their respective groups?
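To make the question concrete, here is a minimal sketch of one way to frame it as binary text classification. It is an illustration under stated assumptions, not the model used in the notebooks: it swaps in a small multinomial Naive Bayes classifier written with only the Python standard library, the class name `NaiveBayesCommentClassifier` is hypothetical, and the four training comments are invented placeholders rather than real Reddit data.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase a comment and split it into word tokens."""
    return re.findall(r"[a-z']+", text.lower())


class NaiveBayesCommentClassifier:
    """Hypothetical helper: multinomial Naive Bayes over word counts,
    with add-one (Laplace) smoothing."""

    def fit(self, comments, labels):
        self.word_counts = defaultdict(Counter)  # label -> word -> count
        self.label_counts = Counter(labels)      # label -> number of comments
        self.vocab = set()
        for comment, label in zip(comments, labels):
            tokens = tokenize(comment)
            self.word_counts[label].update(tokens)
            self.vocab.update(tokens)
        return self

    def predict(self, comment):
        total_comments = sum(self.label_counts.values())
        scores = {}
        for label, n_comments in self.label_counts.items():
            total_words = sum(self.word_counts[label].values())
            # Start from the log prior of the label.
            score = math.log(n_comments / total_comments)
            for token in tokenize(comment):
                if token not in self.vocab:
                    continue  # skip words never seen during training
                count = self.word_counts[label][token]
                # Smoothed log likelihood of the word given the label.
                score += math.log((count + 1) / (total_words + len(self.vocab)))
            scores[label] = score
        return max(scores, key=scores.get)


# Invented placeholder comments, NOT real data from either sub-Reddit.
train_comments = [
    "My fleet lost the war against the fallen empire",
    "Which ascension perk is best for a hive mind empire",
    "A dwarf went berserk after the fortress flooded with magma",
    "The goblin siege broke through my fortress drawbridge",
]
train_labels = ["Stellaris", "Stellaris", "Dwarf Fortress", "Dwarf Fortress"]

model = NaiveBayesCommentClassifier().fit(train_comments, train_labels)
print(model.predict("my empire needs a bigger fleet"))
print(model.predict("the dwarf dug into magma"))
```

If the two sub-Reddits use distinct enough vocabulary, even a simple word-count model like this can separate them. A comment made up only of words common to both groups gives the classifier little to work with, which matches the observation in the Executive Summary about comments that are hard to place without their surrounding conversation.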

Links to Reddits Used

Stellaris Dwarf Fortress