Built an ML model using an ensemble of BERT (large and small) and LSTM-based language models trained with different loss functions to identify toxicity in online conversations, where toxicity is defined as anything rude, disrespectful, or otherwise likely to make someone leave a discussion.
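The write-up does not say which loss functions the members used; as a minimal illustrative sketch, one ensemble member might train on plain binary cross-entropy while another uses a per-sample-weighted variant (all weights below are hypothetical):

```python
import torch
import torch.nn as nn

# Plain BCE-with-logits for one ensemble member...
plain_bce = nn.BCEWithLogitsLoss()

# ...and a per-sample-weighted variant for another. The weighting scheme
# is hypothetical; the write-up does not specify the actual losses.
def weighted_bce(logits, targets, sample_weights):
    per_sample = nn.functional.binary_cross_entropy_with_logits(
        logits, targets, reduction="none")
    return (sample_weights * per_sample).mean()

logits = torch.tensor([1.2, -0.7, 0.3])
targets = torch.tensor([1.0, 0.0, 1.0])
weights = torch.tensor([1.5, 1.0, 0.5])  # hypothetical per-sample weights

print(plain_bce(logits, targets).item())
print(weighted_bce(logits, targets, weights).item())
```

Training members with different objectives like this tends to decorrelate their errors, which is what makes the ensemble average stronger than any single model.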
Trained four folds for each of the five models; per the competition rules, final inference had to run in a Kaggle kernel in under 2 hours.
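A rough sketch of how the 20 fold checkpoints (4 folds x 5 models) could be averaged inside the kernel while guarding the 2-hour budget; the random scores and test-set size are stand-ins for real per-checkpoint inference:

```python
import time
import numpy as np

TIME_BUDGET_S = 2 * 60 * 60             # 2-hour Kaggle kernel limit
N_MODELS, N_FOLDS = 5, 4                # sizes from the write-up
N_TEST = 1000                           # hypothetical test-set size

start = time.monotonic()
rng = np.random.default_rng(0)
fold_preds = []

# One pass per fold checkpoint (4 folds x 5 models = 20 passes).
checkpoints = [(m, f) for m in range(N_MODELS) for f in range(N_FOLDS)]
for model_idx, fold_idx in checkpoints:
    if time.monotonic() - start > 0.9 * TIME_BUDGET_S:
        break                           # leave headroom rather than time out
    fold_preds.append(rng.random(N_TEST))  # stand-in for checkpoint inference

# Final submission score: simple mean over all fold/model predictions.
final_scores = np.mean(fold_preds, axis=0)
```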