/toxicity-detection-sklearn

Identifies toxicity in comments. We tried several machine learning algorithms and, after evaluating them, chose one.

Primary language: Jupyter Notebook

Toxicity Detection using Scikit-Learn

Project members

Name:
Sahil Fruitwala
Joel Joseph Thomas
Abhijeet Singh
Punarva Vyas

Note:

  1. All cleaning and feature engineering steps applied to the dataset are in the master branch.
  2. Download the dataset from https://www.kaggle.com/c/jigsaw-toxic-comment-classification-challenge/data
  3. For deployment, we used Heroku with a CI/CD process. The configuration can be found in the backend directory.
  4. Process.ipynb is our final Jupyter notebook, which brings all the steps together in one place. It can be found in the backend folder.
  5. All other files in the main branch are individual files, since each of us worked with our own approach.
  6. Each member's individual work can be found in a separate branch named after that group member.
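
To give a rough idea of what a cleaning step for comment text can look like, here is a minimal sketch. It is not the code from Process.ipynb: the function name `clean_comment` and the specific steps (lowercasing, dropping URLs, keeping only letters, collapsing whitespace) are assumptions chosen for illustration, and the notebook may clean the data differently.

```python
import re

# Hypothetical patterns for illustration; the project's notebook may use others.
URL_RE = re.compile(r"https?://\S+|www\.\S+")
NON_LETTER_RE = re.compile(r"[^a-z\s]")
WHITESPACE_RE = re.compile(r"\s+")


def clean_comment(text: str) -> str:
    """Normalize one raw comment into lowercase, space-separated words."""
    text = text.lower()                  # case-fold so "Hello" and "hello" match
    text = URL_RE.sub(" ", text)         # drop links before stripping punctuation
    text = NON_LETTER_RE.sub(" ", text)  # replace digits and punctuation with spaces
    return WHITESPACE_RE.sub(" ", text).strip()  # collapse runs of whitespace


if __name__ == "__main__":
    print(clean_comment("Hello!!!  Visit http://example.com NOW\n"))
```

Note that this sketch splits contractions (for example, "don't" becomes "don t"); whether that is acceptable depends on the features built afterwards.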

1. Home page of the website

2. Result for a given text

3. Analysis of data before cleaning

4. Second analysis of the data before cleaning

5. Word cloud of the most frequently occurring words in the dataset
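
A word cloud is driven by word frequencies, so the counts behind such a figure can be sketched with the standard library alone. The helper `top_words` and its `stopwords` parameter are hypothetical names used for illustration; this is not how the figure in this repository was necessarily produced.

```python
from collections import Counter


def top_words(comments, n=10, stopwords=frozenset()):
    """Return the n most frequent words across comments as (word, count) pairs."""
    counts = Counter()
    for comment in comments:
        # Whitespace tokenization on lowercased text; stopwords are skipped.
        counts.update(w for w in comment.lower().split() if w not in stopwords)
    return counts.most_common(n)


if __name__ == "__main__":
    sample = ["the cat sat", "the cat ran", "a dog"]
    print(top_words(sample, n=3, stopwords={"the", "a"}))
```

The resulting (word, count) pairs are the kind of input a word cloud renderer scales its font sizes from.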