hate-speech-detection-social-media

This repository contains the thesis titled "Hate Speech Detection in Social Media", along with GitHub links to the code for the experiments.

Thesis Abstract

Social media platforms are often abused to spread hateful messages. These harm not only
the individual but also society in general. The staggering volume of content generated on
social media across so many countries, regions and languages makes manual moderation
impossible. Moderation efforts must therefore be augmented with automated tools. To this
end, the thesis develops automated hate speech detection tools.
Owing to the recent success of deep learning across multiple domains, the thesis develops
deep learning models for detecting hate speech in social media, using DNN architectures
ranging from CNN and BiLSTM to the more recent BERT-based state-of-the-art
pre-trained models. These models are evaluated on datasets not only in English but also in
low-resource languages such as Indian Bengali and Hindi, and their code-mixed variants. The
datasets are collected from sources such as Facebook, Twitter and YouTube. In addition, the
thesis studies the detection of online aggression and hate speech identification in internet memes.

Code Links:

Hate Speech Detection in Indo-European Languages : https://github.com/cozek/hasoc-2019-falsepostive
Checkpoint Ensemble of Transformers for Hate Speech Detection : https://github.com/cozek/OffensEval2020-code
Automated Aggression Identification using Transformers : https://github.com/cozek/trac2020_submission
Using Text and Image Features to Classify Internet Memes : https://github.com/cozek/memotion2020-code
