Unintended-Bias-in-Toxicity-Classification

This repo contains notebooks for the Jigsaw Unintended Bias in Toxicity Classification competition hosted on Kaggle.

Kaggle : https://www.kaggle.com/c/jigsaw-unintended-bias-in-toxicity-classification
Discord Bot Based on this : https://github.com/ToxicBot-Discord/ToxicBot

Score 1

Score 2

Methodology

Both attempts use GloVe embeddings. Attempt2 additionally uses a custom loss function, but its AUC-ROC score was lower than that of Attempt1. BERT embeddings would most likely have given a better result, but it was not possible to run BERT on Google Colab because the RAM limit was exceeded every time. An alternative was to train with BERT on only 40% of the dataset, but that could lose valuable information.
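As a rough illustration of the GloVe step, the sketch below shows one common way to turn a GloVe text file into an embedding matrix for a tokenizer vocabulary. It is not taken from the notebooks: the function names `load_glove` and `build_embedding_matrix`, the `word_index` mapping, and the tiny in-memory sample are all hypothetical, and the real notebooks may load the vectors differently.

```python
import io


def load_glove(handle):
    """Parse GloVe text format: each line is a token followed by its vector."""
    vectors = {}
    for line in handle:
        parts = line.rstrip().split(" ")
        if len(parts) < 2:
            continue  # skip blank or malformed lines
        vectors[parts[0]] = [float(x) for x in parts[1:]]
    return vectors


def build_embedding_matrix(word_index, vectors, dim):
    """Row i holds the vector for the word with index i.

    Row 0 is left as zeros (assumed padding index), and words that are
    missing from GloVe also keep an all-zero row.
    """
    matrix = [[0.0] * dim for _ in range(len(word_index) + 1)]
    for word, idx in word_index.items():
        vec = vectors.get(word)
        if vec is not None and len(vec) == dim:
            matrix[idx] = vec
    return matrix


# Tiny in-memory stand-in for a real GloVe file (assumption for the demo).
sample = io.StringIO("hello 0.1 0.2 0.3\nworld 0.4 0.5 0.6\n")
glove = load_glove(sample)

# Hypothetical tokenizer vocabulary; "unseen" has no GloVe vector.
word_index = {"hello": 1, "world": 2, "unseen": 3}
embedding_matrix = build_embedding_matrix(word_index, glove, dim=3)
```

With a real GloVe file, the `io.StringIO` sample would be replaced by an opened text file, and the resulting matrix would typically be passed as the initial weights of the model's embedding layer.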
