Question_Classification

These notebooks deal with identifying insincere posts on websites. Quora uses a combination of machine learning and manual review to identify toxic content. This project uses around 1,300,000 Quora questions to develop and test a model. The data comes from the Kaggle challenge [https://www.kaggle.com/c/quora-insincere-questions-classification].

Two approaches have been implemented. The first builds the tokenization from scratch, whereas the second relies on the Keras library for it.
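To make the from-scratch approach concrete, here is a minimal sketch of what such a tokenizer could look like using only the standard library. It is not the code from the notebooks: the function names (`tokenize`, `build_vocab`, `encode`), the regular expression, and the reserved indices for padding and unknown words are all assumptions made for illustration.

```python
import re
from collections import Counter

# Reserved indices (an assumed convention, not taken from the notebooks).
PAD, UNK = 0, 1


def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"[a-z0-9']+", text.lower())


def build_vocab(texts, max_words=None):
    """Map each word to an integer index, most frequent words first."""
    counts = Counter(tok for text in texts for tok in tokenize(text))
    vocab = {"<pad>": PAD, "<unk>": UNK}
    for word, _ in counts.most_common(max_words):
        vocab[word] = len(vocab)
    return vocab


def encode(text, vocab, max_len):
    """Convert text to a fixed-length list of indices (truncate or pad)."""
    ids = [vocab.get(tok, UNK) for tok in tokenize(text)][:max_len]
    return ids + [PAD] * (max_len - len(ids))


# Hypothetical example questions, not taken from the dataset.
questions = ["How do I learn Python?", "Why is the sky blue?"]
vocab = build_vocab(questions)
encoded = [encode(q, vocab, max_len=6) for q in questions]
```

Each question becomes a list of six integers, with `0` filling the unused positions and `1` standing in for words that are not in the vocabulary. The Keras-based approach would presumably get the same kind of output from that library's preprocessing utilities instead of hand-written functions.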

The net is implemented in PyTorch. It consists of an LSTM cell and two linear layers.
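As a rough illustration of the architecture described above, the following sketch wires an LSTM to two linear layers in PyTorch. The class name `QuestionClassifier`, the embedding layer, the layer sizes, and the ReLU between the linear layers are assumptions; the actual notebooks may differ in all of these.

```python
import torch
import torch.nn as nn


class QuestionClassifier(nn.Module):
    """Sketch: embedding -> LSTM -> two linear layers -> one logit."""

    def __init__(self, vocab_size, embed_dim=64, hidden_dim=128, fc_dim=32):
        super().__init__()
        # Embedding layer is an assumption; index 0 is treated as padding.
        self.embedding = nn.Embedding(vocab_size, embed_dim, padding_idx=0)
        self.lstm = nn.LSTM(embed_dim, hidden_dim, batch_first=True)
        self.fc1 = nn.Linear(hidden_dim, fc_dim)
        self.fc2 = nn.Linear(fc_dim, 1)

    def forward(self, token_ids):
        embedded = self.embedding(token_ids)    # (batch, seq_len, embed_dim)
        _, (h_n, _) = self.lstm(embedded)       # h_n: (1, batch, hidden_dim)
        hidden = torch.relu(self.fc1(h_n[-1]))  # (batch, fc_dim)
        return self.fc2(hidden).squeeze(1)      # (batch,) raw logits


# Hypothetical sizes: 4 questions of 20 token indices each.
model = QuestionClassifier(vocab_size=1000)
batch = torch.randint(0, 1000, (4, 20))
logits = model(batch)
probs = torch.sigmoid(logits)  # probability that a question is insincere
```

Returning raw logits and applying the sigmoid outside the model is a common choice because it pairs with `nn.BCEWithLogitsLoss` during training; whether the notebooks do this is not stated here.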

Files

Exploration
Questions_questions_final
quora_questions_classifier - model from final run

