This is a Recurrent Neural Network (RNN) model built to classify toxic comments posted on the Wikipedia platform. The model was submitted to the corresponding Kaggle competition.
The project aims to build a machine learning model that can classify online comments as toxic or non-toxic based on their content. The model is designed to help identify and filter out harmful comments that can lead to online harassment, hate speech, or bullying.
Dataset: The project uses the public "Toxic Comment Classification Challenge" dataset from Kaggle, which contains over 150,000 Wikipedia comments labeled as toxic or non-toxic. The dataset is preprocessed and cleaned to remove irrelevant information and to normalize the text.
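The original preprocessing steps are not shown here, but a minimal sketch of this kind of cleaning and normalization (lowercasing, stripping URLs and punctuation, collapsing whitespace) might look like the following. The exact rules are illustrative assumptions, not the project's actual pipeline:

```python
import re

def clean_comment(text: str) -> str:
    """Lowercase a comment, drop URLs and punctuation, collapse whitespace.

    Illustrative cleaning rules; the project's real preprocessing may differ.
    """
    text = text.lower()
    text = re.sub(r"https?://\S+", " ", text)   # remove URLs
    text = re.sub(r"[^a-z0-9' ]", " ", text)    # keep letters, digits, apostrophes
    return re.sub(r"\s+", " ", text).strip()    # normalize whitespace

print(clean_comment("Check THIS out: https://example.com!!  It's GREAT."))
# → check this out it's great
```

Cleaned comments would then be tokenized and padded to a fixed length before being fed to the network.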
The project uses a deep learning approach based on LSTM (Long Short-Term Memory) neural networks, which are well suited to processing sequence data and can capture long-range dependencies in text. The LSTM model is trained on the preprocessed dataset as a binary classifier, where the goal is to predict whether a comment is toxic based on its content.
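An architecture of this shape (embedding layer, LSTM layer, sigmoid output for binary classification) can be sketched in Keras as follows. The vocabulary size, embedding dimension, and LSTM units are assumed hyperparameters for illustration, not values from the original project:

```python
# Minimal sketch of an LSTM binary toxicity classifier in Keras.
# All hyperparameters below are illustrative assumptions.
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Embedding, LSTM, Dense

VOCAB_SIZE = 20000  # assumed vocabulary size after tokenization
MAX_LEN = 100       # assumed padded comment length

model = Sequential([
    Embedding(VOCAB_SIZE, 128),       # token ids -> dense vectors
    LSTM(64),                         # sequence -> fixed-size representation
    Dense(1, activation="sigmoid"),   # probability that the comment is toxic
])
model.compile(optimizer="adam",
              loss="binary_crossentropy",
              metrics=["accuracy"])
```

Training would call `model.fit` on padded token-id sequences with 0/1 toxicity labels; the sigmoid output is thresholded (e.g. at 0.5) to produce the final toxic / non-toxic prediction.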