The dataset contains Wikipedia comments which have been labeled by human raters for toxic behavior.

Toxic comments classification

This dataset consists of Wikipedia comments that have been labeled by human raters for toxic behavior. The types of toxicity are:

  • toxic
  • severe_toxic
  • obscene
  • threat
  • insult
  • identity_hate
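
As a rough illustration of how these labels might be read, the sketch below assumes each toxicity type is stored as its own 0/1 column and that a single comment may carry several labels at once. The helper `active_labels` and the example row are hypothetical and are not taken from the notebook.

```python
# The six toxicity types listed above, assumed to be column names.
LABELS = ["toxic", "severe_toxic", "obscene", "threat", "insult", "identity_hate"]


def active_labels(row):
    """Return the names of the toxicity types flagged with 1 in a row."""
    return [name for name in LABELS if int(row[name]) == 1]


# Made-up example row: a comment flagged with three of the six types.
example = {
    "toxic": 1,
    "severe_toxic": 0,
    "obscene": 1,
    "threat": 0,
    "insult": 1,
    "identity_hate": 0,
}
print(active_labels(example))  # ['toxic', 'obscene', 'insult']
```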

File descriptions

  • train.csv: the training set, containing comments with their binary labels.
  • test.csv: the test set. You must predict the toxicity probabilities for these comments. To deter hand labeling, the test set contains some comments that are not included in scoring.
  • sample_submission.csv: a sample submission file in the correct format.
  • test_labels.csv: labels for the test data; a value of -1 indicates that the comment was not used for scoring.
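
Because rows marked with -1 were not used for scoring, it can be useful to drop them before evaluating predictions locally. The following is a minimal sketch using only the standard library. It assumes test_labels.csv has an `id` column followed by label columns; the function name `scored_rows` and the sample data are made up for illustration.

```python
import csv
import io


def scored_rows(label_file):
    """Yield rows from a test_labels.csv-style file that were used for scoring.

    Assumes an `id` column followed by label columns, where a value of -1
    marks a comment that was excluded from scoring.
    """
    for row in csv.DictReader(label_file):
        if any(value == "-1" for key, value in row.items() if key != "id"):
            continue  # skip comments that were not scored
        yield row


# Tiny made-up stand-in for test_labels.csv.
sample = io.StringIO(
    "id,toxic,severe_toxic,obscene,threat,insult,identity_hate\n"
    "a1,-1,-1,-1,-1,-1,-1\n"
    "b2,1,0,1,0,0,0\n"
    "c3,0,0,0,0,0,0\n"
)
kept = list(scored_rows(sample))
print([row["id"] for row in kept])  # ['b2', 'c3']
```

With the real file, the same function could be called on a handle opened with `open("test_labels.csv", newline="", encoding="utf-8")`.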