/spam_detection

Early-stage NLP project. NLP for spam detection and a simple educational game to follow.

Primary language: Jupyter Notebook

Spam Detection NLP & Educational Game (In Progress)

Update: 1/7/20 - I am in the process of learning how to host this project as an interactive game using Heroku and Django. Once I've learned more about that, you can expect to see more things pushed here.

This repository contains an unfinished NLP project with two parts: first, binary classification for spam detection; second, repurposing the trained model as the "judge" in a game.
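To give a rough picture of the first part, here is a minimal sketch of a binary spam classifier. It is not the code from the notebooks: it assumes a Naive Bayes model with add-one smoothing, written in plain Python so it runs without extra packages, and the names `tokenize`, `train`, `spam_probability`, and `TOY_DATA` are made up for illustration. The four training messages are invented, not taken from the Kaggle data.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"[a-z0-9']+", text.lower())


def train(messages):
    """Count words per class from (text, label) pairs; label is 'spam' or 'ham'."""
    word_counts = {"spam": Counter(), "ham": Counter()}
    doc_counts = Counter()
    for text, label in messages:
        doc_counts[label] += 1
        word_counts[label].update(tokenize(text))
    vocab = set(word_counts["spam"]) | set(word_counts["ham"])
    return {"word_counts": word_counts, "doc_counts": doc_counts, "vocab": vocab}


def spam_probability(model, text):
    """Return P(spam | text) under Naive Bayes with add-one smoothing."""
    total_docs = sum(model["doc_counts"].values())
    vocab_size = len(model["vocab"])
    log_scores = {}
    for label in ("spam", "ham"):
        counts = model["word_counts"][label]
        total_words = sum(counts.values())
        # Start from the class prior, then add one log-likelihood per token.
        score = math.log(model["doc_counts"][label] / total_docs)
        for token in tokenize(text):
            if token in model["vocab"]:  # skip words never seen in training
                score += math.log((counts[token] + 1) / (total_words + vocab_size))
        log_scores[label] = score
    # Turn the two log scores into a normalized probability.
    top = max(log_scores.values())
    exp = {label: math.exp(score - top) for label, score in log_scores.items()}
    return exp["spam"] / (exp["spam"] + exp["ham"])


# Invented toy data, for illustration only.
TOY_DATA = [
    ("win a free prize now", "spam"),
    ("free cash claim your prize", "spam"),
    ("are we still meeting for lunch", "ham"),
    ("see you at the meeting tomorrow", "ham"),
]
model = train(TOY_DATA)

print(spam_probability(model, "free prize"))     # leans spam
print(spam_probability(model, "lunch meeting"))  # leans ham
```

A real run would swap the toy list for a labeled dataset and would likely use a library model instead of this hand-rolled one; the sketch only shows the shape of the task, text in and a spam probability out.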

The general idea is that users compete to get the most spam-like response past the model without being flagged as spam. The game is a tool in two senses: it is educational, showing which words and which kinds of words are more "loaded" in this context, and it shows the data scientist behind the model (in this case me, hi!) potential shortcomings or blind spots in that model.
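One way the judging could work is sketched below. The scoring rule is an assumption, since the game is not built yet: a submission earns the model's spam probability as its score if it stays under the flagging threshold, and zero if it gets flagged. `toy_spam_probability` is a hypothetical stand-in for the trained model, and the 0.5 threshold, the player names, and the messages are all invented.

```python
def toy_spam_probability(message):
    """Hypothetical stand-in for the trained model: share of 'loaded' words."""
    loaded = {"free", "prize", "win", "cash", "claim"}
    words = message.lower().split()
    if not words:
        return 0.0
    return sum(word in loaded for word in words) / len(words)


def judge(message, spam_probability, threshold=0.5):
    """Score a submission: its spam probability if it slips past the filter, else 0."""
    probability = spam_probability(message)
    flagged = probability >= threshold
    return {
        "message": message,
        "probability": probability,
        "flagged": flagged,
        "score": 0.0 if flagged else probability,
    }


def leaderboard(submissions, spam_probability, threshold=0.5):
    """Rank (player, message) pairs by score, highest first."""
    results = [
        (player, judge(message, spam_probability, threshold))
        for player, message in submissions
    ]
    return sorted(results, key=lambda item: item[1]["score"], reverse=True)


# Invented submissions, for illustration only.
submissions = [
    ("ana", "claim your free cash prize"),        # too spammy, gets flagged
    ("ben", "lunch is free if you win the bet"),  # loaded words, but under the threshold
    ("cy", "see you tomorrow"),                   # safe, but not spam-like at all
]

for player, result in leaderboard(submissions, toy_spam_probability):
    print(player, round(result["score"], 2), "flagged" if result["flagged"] else "passed")
```

Submissions that score high without being flagged are the interesting ones for both purposes: they show players which words carry weight, and they point the model's author at messages the classifier underrates.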

Right now, the data is sourced from this Kaggle set, but I intend to try more datasets as the project moves forward.

Reach out to me on Twitter @zych_steven for updates and feedback!