A hate-speech recognizer implemented with various models from Hugging Face. Some models were used as-is (pre-trained), some were fine-tuned, and one language model was trained from scratch.
The models are listed below:
- Pre-trained BERTweet
- Pre-trained DistilBERT
- Pre-trained AutoNLP
- Fine-tuned AutoNLP
- Fine-tuned DistilBERT
- DistilBERT Base Uncased (trained with hate-speech data)
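All of these models can be driven through the Hugging Face `transformers` pipeline API; a minimal inference sketch is shown below. The checkpoint name is a public base model, not this project's fine-tuned weights (those are not published), so it is purely illustrative.

```python
from transformers import pipeline

def build_classifier(model_name: str = "distilbert-base-uncased"):
    """Build a text-classification pipeline for the given checkpoint.
    Any public checkpoint name here is illustrative; the hate-speech
    checkpoints used in this project are not published."""
    return pipeline("text-classification", model=model_name)

# Usage (downloads the model weights on first run):
# clf = build_classifier()
# print(clf("some input sentence"))
```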
Dataset: 34,574 annotated sentences containing hate speech directed at gender, race, religion, and sexual orientation.
Model | Type | Best Accuracy Rate | Worst Accuracy Rate
---|---|---|---
BERTweet | pre-trained | 98% | 2%
DistilBERT | pre-trained | 66% | 66%
AutoNLP | pre-trained | 47% | 79%
Model | Type | Accuracy | Precision | Recall | F1-Score
---|---|---|---|---|---
AutoNLP | fine-tuned | 92% | 90% | 93% | 91%
DistilBERT | fine-tuned | 91% | 90% | 93% | 92%
DistilBERT Base Uncased | trained | 90% | 91% | 91% | 91%
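The accuracy, precision, recall, and F1 figures above follow the standard definitions for binary classification. A minimal sketch of how they are computed from predictions (the function name and labels are illustrative, not this project's code):

```python
def binary_metrics(y_true, y_pred):
    """Compute accuracy, precision, recall, and F1 for binary labels
    (1 = hate speech, 0 = not hate speech)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    accuracy = (tp + tn) / len(y_true)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return accuracy, precision, recall, f1
```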
All the models were tested with various types of input sentences.
- Despite its test metrics, fine-tuned AutoNLP failed to recognize most of the hateful sentences.
- Out of all the pre-trained models, BERTweet performed the best. DistilBERT was second.
- For the most part, the trained model and the fine-tuned DistilBERT performed similarly. The trained model gave better results when test sentences contained grammar mistakes, and in a few test cases it recognized hate speech that the fine-tuned DistilBERT missed.
This project was the second part of our senior design project. The first part addressed the same task with classical machine learning algorithms: Naive Bayes, SVM, and Random Forest. SVM achieved approximately 90% accuracy.
Team members for both parts: Pınar Haskırış, Ahsen Amil, and Uras Felamur.