This project covers Toxic Comment Classification in both English and Greek, using several models (GRU- and LSTM-based networks, and BERT).
To run the code, download the following datasets and add them to the Data and Data_GR directories:
glove840b300d -> https://www.kaggle.com/datasets/takuok/glove840b300dtxt
glove6B50d -> https://www.kaggle.com/datasets/watts2/glove6b50dtxt
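For orientation, here is a minimal sketch of how a GloVe text file can be read into a dictionary. It is not taken from this project's scripts, which may load the embeddings differently. The function name `load_glove` and its parameters are hypothetical. The sketch assumes the usual GloVe text layout (a token followed by its space-separated vector values) and takes the vector from the end of each line, because the 840B file is known to contain a few tokens that themselves include spaces.

```python
def load_glove(path, dim):
    """Read a GloVe text file into a {token: [float, ...]} dictionary.

    `path` and `dim` are supplied by the caller (for example 300 for the
    840B.300d file, 50 for the 6B.50d file); this helper is illustrative only.
    """
    embeddings = {}
    with open(path, encoding="utf-8") as handle:
        for line in handle:
            parts = line.rstrip("\n").split(" ")
            if len(parts) <= dim:
                # Skip blank or malformed lines that cannot hold a full vector.
                continue
            # The last `dim` fields are the vector; whatever precedes them is
            # the token, which may itself contain spaces.
            token = " ".join(parts[:-dim])
            embeddings[token] = [float(value) for value in parts[-dim:]]
    return embeddings
```

Plain Python lists are used here to keep the sketch free of dependencies; converting each vector to a NumPy array is a common alternative when building an embedding matrix.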
Additionally, create the Models and Predictions directories inside the NLP_English directory, and the Models_GR and Predictions_GR directories inside the NLP_Greek directory.
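The directories described above can also be created with a short script. This is a sketch only: the helper name `create_output_dirs` is hypothetical, and it assumes that NLP_English and NLP_Greek sit directly under the repository root that you pass in.

```python
from pathlib import Path

# Directory names come from the instructions above; the location of the
# repository root is an assumption (defaults to the current directory).
OUTPUT_DIRS = {
    "NLP_English": ["Models", "Predictions"],
    "NLP_Greek": ["Models_GR", "Predictions_GR"],
}


def create_output_dirs(root="."):
    """Create the model and prediction directories if they are missing."""
    created = []
    for project, names in OUTPUT_DIRS.items():
        for name in names:
            path = Path(root) / project / name
            # parents=True also creates the project directory if needed;
            # exist_ok=True makes the call safe to repeat.
            path.mkdir(parents=True, exist_ok=True)
            created.append(path)
    return created
```

Calling `create_output_dirs()` from the repository root creates all four directories and leaves existing ones untouched.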
- Run gpu.py
- Run csv_format.py
- Run data.py
- Run GRU_CNN.py
- Run GRU_RNN.py
- Run LSTM_CNN.py
- Run LSTM_RNN.py
- Run subm.py
- Run test.py
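If you prefer not to start each script by hand, the list above can be driven by a small runner. This is a sketch under two assumptions: that the scripts are meant to be run in the order listed, and that they are all in the directory you pass as `script_dir`. The function name `run_all` is hypothetical.

```python
import subprocess
import sys

# Script names in the order they are listed above.
SCRIPTS = [
    "gpu.py",
    "csv_format.py",
    "data.py",
    "GRU_CNN.py",
    "GRU_RNN.py",
    "LSTM_CNN.py",
    "LSTM_RNN.py",
    "subm.py",
    "test.py",
]


def run_all(script_dir=".", scripts=SCRIPTS):
    """Run each script in order and stop at the first one that fails."""
    completed = []
    for name in scripts:
        # sys.executable reuses the interpreter running this runner;
        # check=True raises CalledProcessError on a non-zero exit code.
        subprocess.run([sys.executable, name], cwd=script_dir, check=True)
        completed.append(name)
    return completed
```

Calling `run_all("NLP_English")` would run the nine scripts one after another from that directory, assuming that is where they are located. Stopping on the first failure avoids training later models on data that an earlier step did not produce.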
See requirements.txt for the required Python packages.
To run the BERT code, open Google Colab and set the runtime type to GPU. I would not recommend running it on your own machine, since it is demanding on GPU, CPU, and RAM.