A hate-speech recognizer implemented with various models from Hugging Face. Some models were used as-is (pre-trained), some were fine-tuned, and one language model was trained from scratch.
The models are listed below:
- Pre-trained BERTweet
- Pre-trained DistilBERT
- Pre-trained AutoNLP
- Fine-tuned AutoNLP
- Fine-tuned DistilBERT
- DistilBERT Base Uncased (trained with hate-speech data)
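All of these models can be driven through the Hugging Face `transformers` pipeline API; a minimal inference sketch is shown below. The checkpoint name is a public base model, not this project's fine-tuned weights (those are not published), so it is purely illustrative.

```python
from transformers import pipeline

def build_classifier(model_name: str = "distilbert-base-uncased"):
    """Build a text-classification pipeline for the given checkpoint.
    Any public checkpoint name here is illustrative; the hate-speech
    checkpoints used in this project are not published."""
    return pipeline("text-classification", model=model_name)

# Usage (downloads the model weights on first run):
# clf = build_classifier()
# print(clf("some input sentence"))
```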
Dataset: 34,574 annotated sentences containing hate speech directed at gender, race, religion, and sexual orientation.
Model | Type | Best Accuracy Rate | Worst Accuracy Rate
---|---|---|---
BERTweet | pre-trained | 98% | 2%
DistilBERT | pre-trained | 66% | 66%
AutoNLP | pre-trained | 47% | 79%
Model | Type | Accuracy | Precision | Recall | F1-Score
---|---|---|---|---|---
AutoNLP | fine-tuned | 92% | 90% | 93% | 91%
DistilBERT | fine-tuned | 91% | 90% | 93% | 92%
DistilBERT Base Uncased | trained | 90% | 91% | 91% | 91%
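The accuracy, precision, recall, and F1 figures above follow the standard definitions for binary classification. A minimal sketch of how they are computed from predictions (the function name and labels are illustrative, not this project's code):

```python
def binary_metrics(y_true, y_pred):
    """Compute accuracy, precision, recall, and F1 for binary labels
    (1 = hate speech, 0 = not hate speech)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    accuracy = (tp + tn) / len(y_true)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return accuracy, precision, recall, f1
```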
All the models were tested with various types of input sentences.
- Despite its test metrics, fine-tuned AutoNLP failed to recognize most of the hateful sentences.
- Out of all the pre-trained models, BERTweet performed the best. DistilBERT was second.
- For the most part, the trained model and the fine-tuned DistilBERT performed similarly. The trained model gave better results when test sentences contained grammar mistakes, and in a few test cases it recognized hate speech that the fine-tuned DistilBERT missed.
This project was the second part of our senior design project. The first part addressed the same task with classical machine learning algorithms: Naive Bayes, SVM, and Random Forest. SVM achieved approximately 90% accuracy.
Team members for both parts: Pınar Haskırış, Ahsen Amil, and Uras Felamur.