This page catalogues datasets annotated for hate speech, online abuse, and offensive language. They may be useful, for example, for training a natural language processing system to detect such language.

The list is maintained by Leon Derczynski and Bertie Vidgen.

Please make contributions via pull request or email. Accompanying data statements preferred for all corpora.

If you use these resources, please cite (and read!) our paper: Directions in Abusive Language Training Data: Garbage In, Garbage Out. And if you would like to find other resources for researching online hate, visit The Alan Turing Institute's Online Hate Research Hub or read The Alan Turing Institute's Reading List on Online Hate and Abuse Research.

If you're looking for a good paper on online hate training datasets (beyond our paper, of course!) then have a look at 'Resources and benchmark corpora for hate speech detection: a systematic review' by Poletto et al. in Language Resources and Evaluation.

List of datasets

Arabic

Are They our Brothers? Analysis and Detection of Religious Hate Speech in the Arabic Twittersphere

  • Link to publication: https://ieeexplore.ieee.org/document/8508247
  • Link to data: https://github.com/nuhaalbadi/Arabic_hatespeech
  • Task description: Binary (Hate, Not)
  • Details of task: Religious subcategories
  • Size of dataset: 6,136
  • Percentage abusive: 0.45
  • Language: Arabic
  • Level of annotation: Posts
  • Platform: Twitter
  • Medium: Text
  • Reference: Albadi, N., Kurdi, M. and Mishra, S., 2018. Are they Our Brothers? Analysis and Detection of Religious Hate Speech in the Arabic Twittersphere. In: International Conference on Advances in Social Networks Analysis and Mining. Barcelona, Spain: IEEE, pp.69-76.
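Every entry in this catalogue uses the same field labels, so an entry can be read mechanically. The following is a minimal sketch, not part of the catalogue itself, that parses the bullet fields of the entry above and estimates how many items are labelled abusive. It assumes the plain-text `• Field: value` layout shown here and that "Percentage abusive" is a proportion between 0 and 1 (as the listed values suggest); the function names `parse_entry` and `estimated_abusive_count` are hypothetical.

```python
def parse_entry(text):
    """Parse the '• Field: value' lines of one catalogue entry into a dict."""
    fields = {}
    for line in text.splitlines():
        # Drop indentation and the leading bullet character, if present.
        line = line.strip().lstrip("•").strip()
        if ": " not in line:
            continue
        # Split only on the first ': ' so URLs and nested colons stay intact.
        key, value = line.split(": ", 1)
        fields[key] = value
    return fields


def estimated_abusive_count(fields):
    """Approximate number of abusive items, or None when the share is 'NA'."""
    size = int(fields["Size of dataset"].replace(",", ""))
    share = fields["Percentage abusive"]
    if share == "NA":
        return None
    # Assumption: the value is a proportion in [0, 1], not a percentage.
    return round(size * float(share))


entry = """
  • Task description: Binary (Hate, Not)
  • Size of dataset: 6,136
  • Percentage abusive: 0.45
  • Language: Arabic
"""
fields = parse_entry(entry)
print(fields["Language"], estimated_abusive_count(fields))
```

Under these assumptions, the entry above corresponds to roughly 2,761 abusive posts out of 6,136. Entries that list "NA" return `None` rather than a count.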

Multilingual and Multi-Aspect Hate Speech Analysis (Arabic)

  • Link to publication: https://arxiv.org/abs/1908.11049
  • Link to data: https://github.com/HKUST-KnowComp/MLMA_hate_speech
  • Task description: Detailed taxonomy with cross-cutting attributes: Hostility, Directness, Target Attribute, Target Group, How annotators felt on seeing the tweet.
  • Details of task: Gender, Sexual orientation, Religion, Disability
  • Size of dataset: 3,353
  • Percentage abusive: 0.64
  • Language: Arabic
  • Level of annotation: Posts
  • Platform: Twitter
  • Medium: Text
  • Reference: Ousidhoum, N., Lin, Z., Zhang, H., Song, Y. and Yeung, D., 2019. Multilingual and Multi-Aspect Hate Speech Analysis. ArXiv.

L-HSAB: A Levantine Twitter Dataset for Hate Speech and Abusive Language

  • Link to publication: https://www.aclweb.org/anthology/W19-3512
  • Link to data: https://github.com/Hala-Mulki/L-HSAB-First-Arabic-Levantine-HateSpeech-Dataset
  • Task description: Ternary (Hate, Abusive, Normal)
  • Details of task: Group-directed + Person-directed
  • Size of dataset: 5,846
  • Percentage abusive: 0.38
  • Language: Arabic
  • Level of annotation: Posts
  • Platform: Twitter
  • Medium: Text
  • Reference: Mulki, H., Haddad, H., Bechikh, C. and Alshabani, H., 2019. L-HSAB: A Levantine Twitter Dataset for Hate Speech and Abusive Language. In: Proceedings of the Third Workshop on Abusive Language Online. Florence, Italy: Association for Computational Linguistics, pp.111-118.

Abusive Language Detection on Arabic Social Media (Twitter)

  • Link to publication: https://www.aclweb.org/anthology/W17-3008
  • Link to data: http://alt.qcri.org/~hmubarak/offensive/TweetClassification-Summary.xlsx
  • Task description: Ternary (Obscene, Offensive but not obscene, Clean)
  • Details of task: Incivility
  • Size of dataset: 1,100
  • Percentage abusive: 0.59
  • Language: Arabic
  • Level of annotation: Posts
  • Platform: Twitter
  • Medium: Text
  • Reference: Mubarak, H., Darwish, K. and Magdy, W., 2017. Abusive Language Detection on Arabic Social Media. In: Proceedings of the First Workshop on Abusive Language Online. Vancouver, Canada: Association for Computational Linguistics, pp.52-56.

Abusive Language Detection on Arabic Social Media (Al Jazeera)

  • Link to publication: https://www.aclweb.org/anthology/W17-3008
  • Link to data: http://alt.qcri.org/~hmubarak/offensive/AJCommentsClassification-CF.xlsx
  • Task description: Ternary (Obscene, Offensive but not obscene, Clean)
  • Details of task: Incivility
  • Size of dataset: 32,000
  • Percentage abusive: 0.81
  • Language: Arabic
  • Level of annotation: Posts
  • Platform: AlJazeera
  • Medium: Text
  • Reference: Mubarak, H., Darwish, K. and Magdy, W., 2017. Abusive Language Detection on Arabic Social Media. In: Proceedings of the First Workshop on Abusive Language Online. Vancouver, Canada: Association for Computational Linguistics, pp.52-56.

Dataset Construction for the Detection of Anti-Social Behaviour in Online Communication in Arabic

Croatian

Datasets of Slovene and Croatian Moderated News Comments

  • Link to publication: https://www.aclweb.org/anthology/W18-5116
  • Link to data: http://hdl.handle.net/11356/1202
  • Task description: Binary (Deleted, Not)
  • Details of task: Flagged content
  • Size of dataset: 17,000,000
  • Percentage abusive: 0.02
  • Language: Croatian
  • Level of annotation: Posts
  • Platform: 24sata website
  • Medium: Text
  • Reference: Ljubešić, N., Erjavec, T. and Fišer, D., 2018. Datasets of Slovene and Croatian Moderated News Comments. In: Proceedings of the 2nd Workshop on Abusive Language Online (ALW2). Brussels, Belgium: Association for Computational Linguistics, pp.124-131.

Danish

Offensive Language and Hate Speech Detection for Danish

  • Link to publication: http://www.derczynski.com/papers/danish_hsd.pdf
  • Link to data: https://figshare.com/articles/Danish_Hate_Speech_Abusive_Language_data/12220805
  • Task description: Branching structure of tasks: Binary (Offensive, Not), Within Offensive (Target, Not), Within Target (Individual, Group, Other)
  • Details of task: Group-directed + Person-directed
  • Size of dataset: 3,600
  • Percentage abusive: 0.12
  • Language: Danish
  • Level of annotation: Posts
  • Platform: Twitter, Reddit, newspaper comments
  • Medium: Text
  • Reference: Sigurbergsson, G. and Derczynski, L., 2019. Offensive Language and Hate Speech Detection for Danish. ArXiv.

English

Automated Hate Speech Detection and the Problem of Offensive Language

Hate Speech Dataset from a White Supremacy Forum

  • Link to publication: https://www.aclweb.org/anthology/W18-5102.pdf
  • Link to data: https://github.com/Vicomtech/hate-speech-dataset
  • Task description: Ternary (Hate, Relation, Not)
  • Details of task: Hate per se
  • Size of dataset: 9,916
  • Percentage abusive: 0.11
  • Language: English
  • Level of annotation: Sentence - with context of the conversational thread taken into account
  • Platform: Stormfront
  • Medium: Text
  • Reference: de Gibert, O., Perez, N., García-Pablos, A., and Cuadros, M., 2018. Hate Speech Dataset from a White Supremacy Forum. In: Proceedings of the 2nd Workshop on Abusive Language Online (ALW2). Brussels, Belgium: Association for Computational Linguistics, pp.11-20.

Hateful Symbols or Hateful People? Predictive Features for Hate Speech Detection on Twitter

  • Link to publication: https://www.aclweb.org/anthology/N16-2013
  • Link to data: https://github.com/ZeerakW/hatespeech
  • Task description: 3-topic (Sexist, Racist, Not)
  • Details of task: Racism, Sexism
  • Size of dataset: 16,914
  • Percentage abusive: 0.32
  • Language: English
  • Level of annotation: Posts
  • Platform: Twitter
  • Medium: Text
  • Reference: Waseem, Z. and Hovy, D., 2016. Hateful Symbols or Hateful People? Predictive Features for Hate Speech Detection on Twitter. In: Proceedings of the NAACL Student Research Workshop. San Diego, California: Association for Computational Linguistics, pp.88-93.

Detecting Online Hate Speech Using Context Aware Models

  • Link to publication: https://arxiv.org/pdf/1710.07395.pdf
  • Link to data: https://github.com/sjtuprog/fox-news-comments
  • Task description: Binary (Hate / not)
  • Details of task: Hate per se
  • Size of dataset: 1,528
  • Percentage abusive: 0.28
  • Language: English
  • Level of annotation: Posts
  • Platform: Fox News
  • Medium: Text
  • Reference: Gao, L. and Huang, R., 2018. Detecting Online Hate Speech Using Context Aware Models. ArXiv.

Are You a Racist or Am I Seeing Things? Annotator Influence on Hate Speech Detection on Twitter

  • Link to publication: https://pdfs.semanticscholar.org/3eeb/b7907a9b94f8d65f969f63b76ff5f643f6d3.pdf
  • Link to data: https://github.com/ZeerakW/hatespeech
  • Task description: Multi-topic (Sexist, Racist, Neither, Both)
  • Details of task: Racism, Sexism
  • Size of dataset: 4,033
  • Percentage abusive: 0.16
  • Language: English
  • Level of annotation: Posts
  • Platform: Twitter
  • Medium: Text
  • Reference: Waseem, Z., 2016. Are You a Racist or Am I Seeing Things? Annotator Influence on Hate Speech Detection on Twitter. In: Proceedings of 2016 EMNLP Workshop on Natural Language Processing and Computational Social Science. Austin, Texas: Association for Computational Linguistics, pp.138-142.

When Does a Compliment Become Sexist? Analysis and Classification of Ambivalent Sexism Using Twitter Data

  • Link to publication: https://pdfs.semanticscholar.org/225f/f8a6a562bbb64b22cebfcd3288c6b930d1ef.pdf
  • Link to data: https://github.com/AkshitaJha/NLP_CSS_2017
  • Task description: Hierarchy of Sexism (Benevolent sexism, Hostile sexism, None)
  • Details of task: Sexism
  • Size of dataset: 712
  • Percentage abusive: 1
  • Language: English
  • Level of annotation: Posts
  • Platform: Twitter
  • Medium: Text
  • Reference: Jha, A. and Mamidi, R., 2017. When does a Compliment become Sexist? Analysis and Classification of Ambivalent Sexism using Twitter Data. In: Proceedings of the Second Workshop on Natural Language Processing and Computational Social Science. Vancouver, Canada: Association for Computational Linguistics, pp.7-16.

Overview of the Task on Automatic Misogyny Identification at IberEval 2018 (English)

  • Link to publication: http://ceur-ws.org/Vol-2150/overview-AMI.pdf
  • Link to data: https://amiibereval2018.wordpress.com/important-dates/data/
  • Task description: Binary (misogyny / not), 5 categories (stereotype, dominance, derailing, sexual harassment, discredit), target of misogyny (active or passive)
  • Details of task: Sexism
  • Size of dataset: 3,977
  • Percentage abusive: 0.47
  • Language: English
  • Level of annotation: Posts
  • Platform: Twitter
  • Medium: Text
  • Reference: Fersini, E., Rosso, P. and Anzovino, M., 2018. Overview of the Task on Automatic Misogyny Identification at IberEval 2018. In: Proceedings of the Third Workshop on Evaluation of Human Language Technologies for Iberian Languages (IberEval 2018).

CONAN - COunter NArratives through Nichesourcing: a Multilingual Dataset of Responses to Fight Online Hate Speech (English)

  • Link to publication: https://www.aclweb.org/anthology/P19-1271.pdf
  • Link to data: https://github.com/marcoguerini/CONAN
  • Task description: Binary (Islamophobic / not), multi-topic (Culture, Economics, Crimes, Rapism, Terrorism, Women Oppression, History, Other/generic)
  • Details of task: Islamophobia
  • Size of dataset: 1,288
  • Percentage abusive: 1
  • Language: English
  • Level of annotation: Posts
  • Platform: Synthetic / Facebook
  • Medium: Text
  • Reference: Chung, Y., Kuzmenko, E., Tekiroglu, S. and Guerini, M., 2019. CONAN - COunter NArratives through Nichesourcing: a Multilingual Dataset of Responses to Fight Online Hate Speech. In: Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics. Florence, Italy: Association for Computational Linguistics, pp.2819-2829.

Characterizing and Detecting Hateful Users on Twitter

  • Link to publication: https://arxiv.org/pdf/1803.08977.pdf
  • Link to data: https://github.com/manoelhortaribeiro/HatefulUsersTwitter
  • Task description: Binary (hateful/not)
  • Details of task: Hate per se
  • Size of dataset: 4,972
  • Percentage abusive: 0.11
  • Language: English
  • Level of annotation: Users
  • Platform: Twitter
  • Medium: Text
  • Reference: Ribeiro, M., Calais, P., Santos, Y., Almeida, V. and Meira, W., 2018. Characterizing and Detecting Hateful Users on Twitter. ArXiv.

A Benchmark Dataset for Learning to Intervene in Online Hate Speech (Gab)

A Benchmark Dataset for Learning to Intervene in Online Hate Speech (Reddit)

Multilingual and Multi-Aspect Hate Speech Analysis (English)

  • Link to publication: https://arxiv.org/abs/1908.11049
  • Link to data: https://github.com/HKUST-KnowComp/MLMA_hate_speech
  • Task description: Detailed taxonomy with cross-cutting attributes: Hostility, Directness, Target attribute and Target group.
  • Details of task: Gender, Sexual orientation, Religion, Disability
  • Size of dataset: 5,647
  • Percentage abusive: 0.76
  • Language: English
  • Level of annotation: Posts
  • Platform: Twitter
  • Medium: Text
  • Reference: Ousidhoum, N., Lin, Z., Zhang, H., Song, Y. and Yeung, D., 2019. Multilingual and Multi-Aspect Hate Speech Analysis. ArXiv.

Exploring Hate Speech Detection in Multimodal Publications

  • Link to publication: https://arxiv.org/pdf/1910.03814.pdf
  • Link to data: https://gombru.github.io/2019/10/09/MMHS/
  • Task description: Six primary categories (No attacks to any community, Racist, Sexist, Homophobic, Religion based attack, Attack to other community)
  • Details of task: Racism, Sexism, Homophobia, Religion-based attack
  • Size of dataset: 149,823
  • Percentage abusive: 0.25
  • Language: English
  • Level of annotation: Posts
  • Platform: Twitter
  • Medium: Text and Images/Memes
  • Reference: Gomez, R., Gibert, J., Gomez, L. and Karatzas, D., 2019. Exploring Hate Speech Detection in Multimodal Publications. ArXiv.

Predicting the Type and Target of Offensive Posts in Social Media

  • Link to publication: https://arxiv.org/pdf/1902.09666.pdf
  • Link to data: http://competitions.codalab.org/competitions/20011
  • Task description: Branching structure of tasks: Binary (Offensive, Not), Within Offensive (Target, Not), Within Target (Individual, Group, Other)
  • Details of task: Group-directed + Person-directed
  • Size of dataset: 14,100
  • Percentage abusive: 0.33
  • Language: English
  • Level of annotation: Posts
  • Platform: Twitter
  • Medium: Text
  • Reference: Zampieri, M., Malmasi, S., Nakov, P., Rosenthal, S., Farra, N. and Kumar, R., 2019. SemEval-2019 Task 6: Identifying and Categorizing Offensive Language in Social Media (OffensEval). ArXiv.

hatEval, SemEval-2019 Task 5: Multilingual Detection of Hate Speech Against Immigrants and Women in Twitter (English)

  • Link to publication: https://www.aclweb.org/anthology/S19-2007
  • Link to data: http://competitions.codalab.org/competitions/19935
  • Task description: Branching structure of tasks: Binary (Hate, Not), Within Hate (Group, Individual), Within Hate (Aggressive, Not)
  • Details of task: Group-directed + Person-directed
  • Size of dataset: 13,000
  • Percentage abusive: 0.4
  • Language: English
  • Level of annotation: Posts
  • Platform: Twitter
  • Medium: Text
  • Reference: Basile, V., Bosco, C., Fersini, E., Nozza, D., Patti, V., Pardo, F., Rosso, P. and Sanguinetti, M., 2019. SemEval-2019 Task 5: Multilingual Detection of Hate Speech Against Immigrants and Women in Twitter. In: Proceedings of the 13th International Workshop on Semantic Evaluation. Minneapolis, Minnesota: Association for Computational Linguistics, pp.54-63.

Peer to Peer Hate: Hate Speech Instigators and Their Targets

  • Link to publication: https://aaai.org/ocs/index.php/ICWSM/ICWSM18/paper/view/17905/16996
  • Link to data: https://github.com/mayelsherif/hate_speech_icwsm18
  • Task description: Binary (Hate/Not), only for tweets which have both a Hate Instigator and Hate Target
  • Details of task: Hate per se
  • Size of dataset: 27,330
  • Percentage abusive: 0.98
  • Language: English
  • Level of annotation: Posts
  • Platform: Twitter
  • Medium: Text
  • Reference: ElSherief, M., Nilizadeh, S., Nguyen, D., Vigna, G. and Belding, E., 2018. Peer to Peer Hate: Hate Speech Instigators and Their Targets. In: Proceedings of the Twelfth International AAAI Conference on Web and Social Media (ICWSM 2018). Santa Barbara, California: University of California, pp.52-61.

Overview of the HASOC track at FIRE 2019: Hate Speech and Offensive Content Identification in Indo-European Languages

  • Link to publication: https://dl.acm.org/doi/pdf/10.1145/3368567.3368584?download=true
  • Link to data: https://hasocfire.github.io/hasoc/2019/dataset.html
  • Task description: Branching structure of tasks. A: Hate / Offensive or Neither, B: Hatespeech, Offensive, or Profane, C: Targeted or Untargeted
  • Details of task: Group-directed + Person-directed
  • Size of dataset: 7,005
  • Percentage abusive: 0.36
  • Language: English
  • Level of annotation: Posts
  • Platform: Twitter and Facebook
  • Medium: Text
  • Reference: Mandl, T., Modha, S., Majumder, P., Patel, D., Dave, M., Mandlia, C. and Patel, A., 2019. Overview of the HASOC track at FIRE 2019. In: Proceedings of the 11th Forum for Information Retrieval Evaluation.

Large Scale Crowdsourcing and Characterization of Twitter Abusive Behavior

  • Link to publication: https://arxiv.org/pdf/1802.00393.pdf
  • Link to data: https://dataverse.mpi-sws.org/dataset.xhtml?persistentId=doi:10.5072/FK2/ZDTEMN
  • Task description: Multi-thematic (Abusive, Hateful, Normal, Spam)
  • Details of task: Hate per se
  • Size of dataset: 80,000
  • Percentage abusive: 0.18
  • Language: English
  • Level of annotation: Posts
  • Platform: Twitter
  • Medium: Text
  • Annotation process: Described in detail: multiple rounds, with a smaller 300-tweet dataset used to test the schema. For the final 80k tweets, 5 judgements per tweet, on CrowdFlower.
  • Annotation agreement: 55.9% = 4/5, 36.6% = 3/5, 7.5% = 2/5
  • Reference: Founta, A., Djouvas, C., Chatzakou, D., Leontiadis, I., Blackburn, J., Stringhini, G., Vakali, A., Sirivianos, M. and Kourtellis, N., 2018. Large Scale Crowdsourcing and Characterization of Twitter Abusive Behavior. ArXiv.

A Large Labeled Corpus for Online Harassment Research

  • Link to publication: http://www.cs.umd.edu/~golbeck/papers/trolling.pdf
  • Link to data: jgolbeck@umd.edu
  • Task description: Binary (Harassment, Not)
  • Details of task: Person-directed
  • Size of dataset: 35,000
  • Percentage abusive: 0.16
  • Language: English
  • Level of annotation: Posts
  • Platform: Twitter
  • Medium: Text
  • Reference: Golbeck, J., Ashktorab, Z., Banjo, R., Berlinger, A., Bhagwan, S., Buntain, C., Cheakalos, P., Geller, A., Gergory, Q., Gnanasekaran, R., Gnanasekaran, R., Hoffman, K., Hottle, J., Jienjitlert, V., Khare, S., Lau, R., Martindale, M., Naik, S., Nixon, H., Ramachandran, P., Rogers, K., Rogers, L., Sarin, M., Shahane, G., Thanki, J., Vengataraman, P., Wan, Z. and Wu, D., 2017. A Large Labeled Corpus for Online Harassment Research. In: Proceedings of the 2017 ACM on Web Science Conference. New York: Association for Computing Machinery, pp.229-233.

Ex Machina: Personal Attacks Seen at Scale, Personal attacks

  • Link to publication: https://arxiv.org/pdf/1610.08914
  • Link to data: https://github.com/ewulczyn/wiki-detox
  • Task description: Binary (Personal attack, Not)
  • Details of task: Person-directed
  • Size of dataset: 115,737
  • Percentage abusive: 0.12
  • Language: English
  • Level of annotation: Posts
  • Platform: Wikipedia
  • Medium: Text
  • Reference: Wulczyn, E., Thain, N. and Dixon, L., 2017. Ex Machina: Personal Attacks Seen at Scale. ArXiv.

Ex Machina: Personal Attacks Seen at Scale, Toxicity

  • Link to publication: https://arxiv.org/pdf/1610.08914
  • Link to data: https://github.com/ewulczyn/wiki-detox
  • Task description: Toxicity/healthiness judgement (-2 == very toxic, 0 == neutral, 2 == very healthy)
  • Details of task: Person-directed
  • Size of dataset: 100,000
  • Percentage abusive: NA
  • Language: English
  • Level of annotation: Posts
  • Platform: Wikipedia
  • Medium: Text
  • Reference: Wulczyn, E., Thain, N. and Dixon, L., 2017. Ex Machina: Personal Attacks Seen at Scale. ArXiv.

Detecting cyberbullying in online communities (World of Warcraft)

  • Link to publication: http://aisel.aisnet.org/ecis2016_rp/61/
  • Link to data: http://ub-web.de/research/
  • Task description: Binary (Harassment, Not)
  • Details of task: Person-directed
  • Size of dataset: 16,975
  • Percentage abusive: 0.01
  • Language: English
  • Level of annotation: Posts
  • Platform: World of Warcraft
  • Medium: Text
  • Reference: Bretschneider, U. and Peters, R., 2016. Detecting Cyberbullying in Online Communities. Research Papers, 61.

Detecting cyberbullying in online communities (League of Legends)

  • Link to publication: http://aisel.aisnet.org/ecis2016_rp/61/
  • Link to data: http://ub-web.de/research/
  • Task description: Binary (Harassment, Not)
  • Details of task: Person-directed
  • Size of dataset: 17,354
  • Percentage abusive: 0.01
  • Language: English
  • Level of annotation: Posts
  • Platform: League of Legends
  • Medium: Text
  • Reference: Bretschneider, U. and Peters, R., 2016. Detecting Cyberbullying in Online Communities. Research Papers, 61.

A Quality Type-aware Annotated Corpus and Lexicon for Harassment Research

  • Link to publication: https://arxiv.org/pdf/1802.09416.pdf
  • Link to data: https://github.com/Mrezvan94/Harassment-Corpus
  • Task description: Multi-topic harassment detection
  • Details of task: Racism, Sexism, Appearance-related, Intellectual, Political
  • Size of dataset: 24,189
  • Percentage abusive: 0.13
  • Language: English
  • Level of annotation: Posts
  • Platform: Twitter
  • Medium: Text
  • Reference: Rezvan, M., Shekarpour, S., Balasuriya, L., Thirunarayan, K., Shalin, V. and Sheth, A., 2018. A Quality Type-aware Annotated Corpus and Lexicon for Harassment Research. ArXiv.

Ex Machina: Personal Attacks Seen at Scale, Aggression and Friendliness

  • Link to publication: https://arxiv.org/pdf/1610.08914
  • Link to data: https://github.com/ewulczyn/wiki-detox
  • Task description: Aggression/friendliness judgement on a 5-point scale (-2 == very aggressive, 0 == neutral, 2 == very friendly)
  • Details of task: Person-Directed + Group-Directed
  • Size of dataset: 160,000
  • Percentage abusive: NA
  • Language: English
  • Level of annotation: Posts
  • Platform: Wikipedia
  • Medium: Text
  • Reference: Wulczyn, E., Thain, N. and Dixon, L., 2017. Ex Machina: Personal Attacks Seen at Scale. ArXiv.

French

CONAN - COunter NArratives through Nichesourcing: a Multilingual Dataset of Responses to Fight Online Hate Speech (French)

  • Link to publication: https://www.aclweb.org/anthology/P19-1271.pdf
  • Link to data: https://github.com/marcoguerini/CONAN
  • Task description: Binary (Islamophobic / not), Multi-topic (Culture, Economics, Crimes, Rapism, Terrorism, Women Oppression, History, Other/generic)
  • Details of task: Islamophobia
  • Size of dataset: 1,719
  • Percentage abusive: 1
  • Language: French
  • Level of annotation: Posts
  • Platform: Synthetic / Facebook
  • Medium: Text
  • Reference: Chung, Y., Kuzmenko, E., Tekiroglu, S. and Guerini, M., 2019. CONAN - COunter NArratives through Nichesourcing: a Multilingual Dataset of Responses to Fight Online Hate Speech. In: Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics. Florence, Italy: Association for Computational Linguistics, pp.2819-2829.

Multilingual and Multi-Aspect Hate Speech Analysis (French)

  • Link to publication: https://arxiv.org/abs/1908.11049
  • Link to data: https://github.com/HKUST-KnowComp/MLMA_hate_speech
  • Task description: Detailed taxonomy with cross-cutting attributes: Hostility, Directness, Target Attribute, Target Group, How annotators felt on seeing the tweet.
  • Details of task: Gender, Sexual orientation, Religion, Disability
  • Size of dataset: 4,014
  • Percentage abusive: 0.72
  • Language: French
  • Level of annotation: Posts
  • Platform: Twitter
  • Medium: Text
  • Reference: Ousidhoum, N., Lin, Z., Zhang, H., Song, Y. and Yeung, D., 2019. Multilingual and Multi-Aspect Hate Speech Analysis. ArXiv.

German

Measuring the Reliability of Hate Speech Annotations: The Case of the European Refugee Crisis

  • Link to publication: https://arxiv.org/pdf/1701.08118.pdf
  • Link to data: https://github.com/UCSM-DUE/IWG_hatespeech_public
  • Task description: Binary (Anti-refugee hate, None)
  • Details of task: Refugees
  • Size of dataset: 469
  • Percentage abusive: NA
  • Language: German
  • Level of annotation: Posts
  • Platform: Twitter
  • Medium: Text
  • Reference: Ross, B., Rist, M., Carbonell, G., Cabrera, B., Kurowsky, N. and Wojatzki, M., 2017. Measuring the Reliability of Hate Speech Annotations: The Case of the European Refugee Crisis. ArXiv.

Detecting Offensive Statements Towards Foreigners in Social Media

  • Link to publication: https://pdfs.semanticscholar.org/23dc/df7c7e82807445afd9f19474fc0a3d8169fe.pdf
  • Link to data: http://ub-web.de/research/
  • Task description: Hierarchical (Anti-foreigner prejudice, split into (1) slightly offensive/offensive and (2) explicitly/substantially offensive). 6 targets (Foreigner, Government, Press, Community, Other, Unknown)
  • Details of task: Anti-foreigner prejudice
  • Size of dataset: 5,836
  • Percentage abusive: 0.11
  • Language: German
  • Level of annotation: Posts
  • Platform: Facebook
  • Medium: Text
  • Reference: Bretschneider, U. and Peters, R., 2017. Detecting Offensive Statements towards Foreigners in Social Media. In: Proceedings of the 50th Hawaii International Conference on System Sciences.

GermEval 2018

Overview of the HASOC track at FIRE 2019: Hate Speech and Offensive Content Identification in Indo-European Languages

  • Link to publication: https://dl.acm.org/doi/pdf/10.1145/3368567.3368584?download=true
  • Link to data: https://hasocfire.github.io/hasoc/2019/dataset.html
  • Task description: A: Hate / Offensive or neither, B: Hatespeech, Offensive, or Profane
  • Details of task: Group-directed + Person-directed
  • Size of dataset: 4,669
  • Percentage abusive: 0.24
  • Language: German
  • Level of annotation: Posts
  • Platform: Twitter and Facebook
  • Medium: Text
  • Reference: Mandl, T., Modha, S., Majumder, P., Patel, D., Dave, M., Mandlia, C. and Patel, A., 2019. Overview of the HASOC track at FIRE 2019. In: Proceedings of the 11th Forum for Information Retrieval Evaluation.

Greek

Deep Learning for User Comment Moderation, Flagged Comments

Deep Learning for User Comment Moderation, Moderated Comments

  • Link to publication: https://www.aclweb.org/anthology/W17-3004
  • Link to data: http://www.straintek.com/data/
  • Task description: Binary (Flagged, Not)
  • Details of task: Flagged content
  • Size of dataset: 1,500
  • Percentage abusive: 0.22
  • Language: Greek
  • Level of annotation: Posts
  • Platform: Gazetta
  • Medium: Text
  • Reference: Pavlopoulos, J., Malakasiotis, P. and Androutsopoulos, I., 2017. Deep Learning for User Comment Moderation. In: Proceedings of the First Workshop on Abusive Language Online. Vancouver, Canada: Association for Computational Linguistics, pp.25-35.

Offensive Language Identification in Greek

  • Link to publication: https://arxiv.org/pdf/2003.07459v1.pdf
  • Link to data: https://sites.google.com/site/offensevalsharedtask/home
  • Task description: Branching structure of tasks: Binary (Offensive, Not), Within Offensive (Target, Not), Within Target (Individual, Group, Other)
  • Details of task: Group-directed + Person-directed
  • Size of dataset: 4,779
  • Percentage abusive: 0.29
  • Language: Greek
  • Level of annotation: Posts
  • Platform: Twitter
  • Medium: Text
  • Reference: Pitenis, Z., Zampieri, M. and Ranasinghe, T., 2020. Offensive Language Identification in Greek. ArXiv.

Hindi-English

Aggression-annotated Corpus of Hindi-English Code-mixed Data

  • Link to publication: https://arxiv.org/pdf/1803.09402
  • Link to data: https://github.com/kraiyani/Facebook-Post-Aggression-Identification
  • Task description: 3-part hierarchy for hate (None, Covert Aggression, Overt Aggression), 4-part target categorisation (Physical threat, Sexual threat, Identity threat, Non-threatening aggression), 3-part discursive role categorisation (Attack, Defend, Abet)
  • Details of task: Numerous sub-categorizations
  • Size of dataset: 18,000
  • Percentage abusive: 0.06
  • Language: Hindi-English
  • Level of annotation: Posts
  • Platform: Facebook
  • Medium: Text
  • Reference: Kumar, R., Reganti, A., Bhatia, A. and Maheshwari, T., 2018. Aggression-annotated Corpus of Hindi-English Code-mixed Data. ArXiv.

Aggression-annotated Corpus of Hindi-English Code-mixed Data

  • Link to publication: https://arxiv.org/pdf/1803.09402
  • Link to data: https://github.com/kraiyani/Facebook-Post-Aggression-Identification
  • Task description: 3-part hierarchy for hate (None, Covert Aggression, Overt Aggression), 4-part target categorisation (Physical threat, Sexual threat, Identity threat, Non-threatening aggression), 3-part discursive role categorisation (Attack, Defend, Abet)
  • Details of task: Numerous sub-categorizations
  • Size of dataset: 21,000
  • Percentage abusive: 0.27
  • Language: Hindi-English
  • Level of annotation: Posts
  • Platform: Twitter
  • Medium: Text
  • Reference: Kumar, R., Reganti, A., Bhatia, A. and Maheshwari, T., 2018. Aggression-annotated Corpus of Hindi-English Code-mixed Data. ArXiv.

Did You Offend Me? Classification of Offensive Tweets in Hinglish Language

  • Link to publication: https://www.aclweb.org/anthology/W18-5118
  • Link to data: https://github.com/pmathur5k10/Hinglish-Offensive-Text-Classification
  • Task description: Hierarchy (Not Offensive, Abusive, Hate)
  • Details of task: Sexism
  • Size of dataset: 3,189
  • Percentage abusive: 0.65
  • Language: Hindi-English
  • Level of annotation: Posts
  • Platform: Twitter
  • Medium: Text
  • Reference: Mathur, P., Sawhney, R., Ayyar, M. and Shah, R., 2018. Did you offend me? Classification of Offensive Tweets in Hinglish Language. In: Proceedings of the 2nd Workshop on Abusive Language Online (ALW2). Brussels, Belgium: Association for Computational Linguistics, pp.138-148.

A Dataset of Hindi-English Code-Mixed Social Media Text for Hate Speech Detection

  • Link to publication: https://www.aclweb.org/anthology/W18-1105
  • Link to data: https://github.com/deepanshu1995/HateSpeech-Hindi-English-Code-Mixed-Social-Media-Text
  • Task description: Binary (Hate, Not)
  • Details of task: Hate per se
  • Size of dataset: 4,575
  • Percentage abusive: 0.36
  • Language: Hindi-English
  • Level of annotation: Posts
  • Platform: Twitter
  • Medium: Text
  • Reference: Bohra, A., Vijay, D., Singh, V., Sarfaraz Akhtar, S. and Shrivastava, M., 2018. A Dataset of Hindi-English Code-Mixed Social Media Text for Hate Speech Detection. In: Proceedings of the Second Workshop on Computational Modeling of People’s Opinions, Personality, and Emotions in Social Media. New Orleans, Louisiana: Association for Computational Linguistics, pp.36-41.

Overview of the HASOC track at FIRE 2019: Hate Speech and Offensive Content Identification in Indo-European Languages

  • Link to publication: https://dl.acm.org/doi/pdf/10.1145/3368567.3368584?download=true
  • Link to data: https://hasocfire.github.io/hasoc/2019/dataset.html
  • Task description: A: Hate, Offensive or Neither, B: Hatespeech, Offensive, or Profane, C: Targeted or Untargeted
  • Details of task: Group-directed + Person-directed
  • Size of dataset: 5,983
  • Percentage abusive: 0.51
  • Language: Hindi
  • Level of annotation: Posts
  • Platform: Twitter and Facebook
  • Medium: Text
  • Reference: Mandl, T., Modha, S., Majumder, P., Patel, D., Dave, M., Mandlia, C. and Patel, A., 2019. Overview of the HASOC track at FIRE 2019. In: Proceedings of the 11th Forum for Information Retrieval Evaluation.

Indonesian

Hate Speech Detection in the Indonesian Language: A Dataset and Preliminary Study

  • Link to publication: https://ieeexplore.ieee.org/document/8355039
  • Link to data: https://github.com/ialfina/id-hatespeech-detection
  • Task description: Binary (Hate, Not)
  • Details of task: Hate per se
  • Size of dataset: 713
  • Percentage abusive: 0.36
  • Language: Indonesian
  • Level of annotation: Posts
  • Platform: Twitter
  • Medium: Text
  • Reference: Alfina, I., Mulia, R., Fanany, M. and Ekanata, Y., 2017. Hate Speech Detection in the Indonesian Language: A Dataset and Preliminary Study. In: International Conference on Advanced Computer Science and Information Systems. pp.233-238.

Multi-Label Hate Speech and Abusive Language Detection in Indonesian Twitter

  • Link to publication: https://www.aclweb.org/anthology/W19-3506
  • Link to data: https://github.com/okkyibrohim/id-multi-label-hate-speech-and-abusive-language-detection
  • Task description: (No hate speech, No hate speech but abusive, Hate speech but no abuse, Hate speech and abuse), within hate, category (Religion/creed, Race/ethnicity, Physical/disability, Gender/sexual orientation, Other invective/slander), within hate, strength (Weak, Moderate and Strong)
  • Details of task: Religion, Race, Disability, Gender
  • Size of dataset: 13,169
  • Percentage abusive: 0.42
  • Language: Indonesian
  • Level of annotation: Posts
  • Platform: Twitter
  • Medium: Text
  • Reference: Okky Ibrohim, M. and Budi, I., 2019. Multi-label Hate Speech and Abusive Language Detection in Indonesian Twitter. In: Proceedings of the Third Workshop on Abusive Language Online. Florence, Italy: Association for Computational Linguistics, pp.46-57.

A Dataset and Preliminaries Study for Abusive Language Detection in Indonesian Social Media

Italian

An Italian Twitter Corpus of Hate Speech against Immigrants

  • Link to publication: https://www.aclweb.org/anthology/L18-1443
  • Link to data: https://github.com/msang/hate-speech-corpus
  • Task description: Binary (Immigrants/Roma/Muslims, Not), additional categories. Within Hate, Intensity measurement (Aggressiveness: No, Weak, Strong, Offensiveness: No, Weak, Strong, Irony: No, Yes, Stereotype: No, Yes, Incitement degree: 0-4)
  • Details of task: Immigrants, Roma and Muslims + numerous sub-categorizations
  • Size of dataset: 1,827
  • Percentage abusive: 0.13
  • Language: Italian
  • Level of annotation: Posts
  • Platform: Twitter
  • Medium: Text
  • Reference: Sanguinetti, M., Poletto, F., Bosco, C., Patti, V. and Stranisci, M., 2018. An Italian Twitter Corpus of Hate Speech against Immigrants. In: Proceedings of the Eleventh International Conference on Language Resources and Evaluation (LREC 2018). Miyazaki, Japan: European Language Resources Association (ELRA).

Overview of the EVALITA 2018 Hate Speech Detection Task (Facebook)

  • Link to publication: http://ceur-ws.org/Vol-2263/paper010.pdf
  • Link to data: http://www.di.unito.it/~tutreeb/haspeede-evalita18/data.html
  • Task description: Binary (Hate, Not), Within hate for Facebook only, strength (No hate, Weak hate, Strong hate) and theme ((1) religion, (2) physical and/or mental handicap, (3) socio-economic status, (4) politics, (5) race, (6) sex and gender, (7) Other)
  • Details of task: Religion, physical and/or mental handicap, socio-economic status, politics, race, sex and gender
  • Size of dataset: 4,000
  • Percentage abusive: 0.51
  • Language: Italian
  • Level of annotation: Posts
  • Platform: Facebook
  • Medium: Text
  • Reference: Bosco, C., Dell'Orletta, F. and Poletto, F., 2018. Overview of the EVALITA 2018 Hate Speech Detection Task. In: EVALITA 2018-Sixth Evaluation Campaign of Natural Language Processing and Speech Tools for Italian. CEUR, pp.1-9.

Overview of the EVALITA 2018 Hate Speech Detection Task (Twitter)

  • Link to publication: http://ceur-ws.org/Vol-2263/paper010.pdf
  • Link to data: http://www.di.unito.it/~tutreeb/haspeede-evalita18/data.html
  • Task description: Binary (Hate, Not), Within hate for Twitter only, Intensity (1-4 rating), Aggressiveness (No, Weak, Strong), Offensiveness (No, Weak, Strong), Irony (Yes, No)
  • Details of task: Group-directed
  • Size of dataset: 4,000
  • Percentage abusive: 0.32
  • Language: Italian
  • Level of annotation: Posts
  • Platform: Twitter
  • Medium: Text
  • Reference: Bosco, C., Dell'Orletta, F. and Poletto, F., 2018. Overview of the EVALITA 2018 Hate Speech Detection Task. In: EVALITA 2018-Sixth Evaluation Campaign of Natural Language Processing and Speech Tools for Italian. CEUR, pp.1-9.

CONAN - COunter NArratives through Nichesourcing: a Multilingual Dataset of Responses to Fight Online Hate Speech (Italian)

  • Link to publication: https://www.aclweb.org/anthology/P19-1271.pdf
  • Link to data: https://github.com/marcoguerini/CONAN
  • Task description: Binary (Islamophobic, Not), Multi-topic (Culture, Economics, Crimes, Rapism, Terrorism, Women Oppression, History, Other/generic)
  • Details of task: Islamophobia
  • Size of dataset: 1,071
  • Percentage abusive: 1
  • Language: Italian
  • Level of annotation: Posts
  • Platform: Synthetic / Facebook
  • Medium: Text
  • Reference: Chung, Y., Kuzmenko, E., Tekiroglu, S. and Guerini, M., 2019. CONAN - COunter NArratives through Nichesourcing: a Multilingual Dataset of Responses to Fight Online Hate Speech. In: Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics. Florence, Italy: Association for Computational Linguistics, pp.2819-2829.

Creating a WhatsApp Dataset to Study Pre-teen Cyberbullying

  • Link to publication: https://www.aclweb.org/anthology/W18-5107
  • Link to data: https://github.com/dhfbk/WhatsApp-Dataset
  • Task description: Binary (Cyberbullying, Not)
  • Details of task: Person-directed
  • Size of dataset: 14,600
  • Percentage abusive: 0.08
  • Language: Italian
  • Level of annotation: Posts, structured into 10 chats, with token level information
  • Platform: Synthetic / WhatsApp
  • Medium: Text
  • Reference: Sprugnoli, R., Menini, S., Tonelli, S., Oncini, F. and Piras, E., 2018. Creating a WhatsApp Dataset to Study Pre-teen Cyberbullying. In: Proceedings of the 2nd Workshop on Abusive Language Online (ALW2). Brussels, Belgium: Association for Computational Linguistics, pp.51-59.

Polish

Results of the PolEval 2019 Shared Task 6: First Dataset and Open Shared Task for Automatic Cyberbullying Detection in Polish Twitter

  • Link to publication: http://poleval.pl/files/poleval2019.pdf
  • Link to data: http://poleval.pl/tasks/task6
  • Task description: Harmfulness score (three values), Multilabel from seven phenomena
  • Details of task: Person-directed
  • Size of dataset: 10,041
  • Percentage abusive: 0.09
  • Language: Polish
  • Level of annotation: Posts
  • Platform: Twitter
  • Medium: Text
  • Reference: Ogrodniczuk, M. and Kobyliński, L., 2019. Results of the PolEval 2019 Shared Task 6: First Dataset and Open Shared Task for Automatic Cyberbullying Detection in Polish Twitter. In: Proceedings of the PolEval 2019 Workshop. Warszawa: Institute of Computer Science, Polish Academy of Sciences.

Portuguese

A Hierarchically-Labeled Portuguese Hate Speech Dataset

  • Link to publication: https://www.aclweb.org/anthology/W19-3510
  • Link to data: https://b2share.eudat.eu/records/9005efe2d6be4293b63c3cffd4cf193e
  • Task description: Binary (Hate, Not), Multi-level (81 categories, identified inductively; categories have different granularities and content can be assigned to multiple categories at once)
  • Details of task: Multiple identities inductively categorized
  • Size of dataset: 3,059
  • Percentage abusive: 0.32
  • Language: Portuguese
  • Level of annotation: Posts
  • Platform: Twitter
  • Medium: Text
  • Reference: Fortuna, P., Rocha da Silva, J., Soler-Company, J., Warner, L. and Nunes, S., 2019. A Hierarchically-Labeled Portuguese Hate Speech Dataset. In: Proceedings of the Third Workshop on Abusive Language Online. Florence, Italy: Association for Computational Linguistics, pp.94-104.

Offensive Comments in the Brazilian Web: A Dataset and Baseline Results

  • Link to publication: http://www.each.usp.br/digiampietri/BraSNAM/2017/p04.pdf
  • Link to data: https://github.com/rogersdepelle/OffComBR
  • Task description: Binary (Offensive, Not), Target (Xenophobia, homophobia, sexism, racism, cursing, religious intolerance)
  • Details of task: Religion/creed, Race/ethnicity, Physical/disability, Gender/sexual orientation
  • Size of dataset: 1,250
  • Percentage abusive: 0.33
  • Language: Portuguese
  • Level of annotation: Posts
  • Platform: g1.globo.com
  • Medium: Text
  • Reference: de Pelle, R. and Moreira, V., 2017. Offensive Comments in the Brazilian Web: A Dataset and Baseline Results. In: VI Brazilian Workshop on Social Network Analysis and Mining. SBC.

Slovene

Datasets of Slovene and Croatian Moderated News Comments

  • Link to publication: https://www.aclweb.org/anthology/W18-5116
  • Link to data: http://hdl.handle.net/11356/1201
  • Task description: Binary (Deleted, Not)
  • Details of task: Flagged content
  • Size of dataset: 7,600,000
  • Percentage abusive: 0.08
  • Language: Slovene
  • Level of annotation: Posts
  • Platform: MMC RTV website
  • Medium: Text
  • Reference: Ljubešić, N., Erjavec, T. and Fišer, D., 2018. Datasets of Slovene and Croatian Moderated News Comments. In: Proceedings of the 2nd Workshop on Abusive Language Online (ALW2). Brussels, Belgium: Association for Computational Linguistics, pp.124-131.

Spanish

Overview of MEX-A3T at IberEval 2018: Authorship and Aggressiveness Analysis in Mexican Spanish Tweets

  • Link to publication: http://ceur-ws.org/Vol-2150/overview-mex-a3t.pdf
  • Link to data: https://mexa3t.wixsite.com/home/aggressive-detection-track
  • Task description: Binary (Aggressive, Not)
  • Details of task: Group-directed
  • Size of dataset: 11,000
  • Percentage abusive: 0.32
  • Language: Spanish
  • Level of annotation: Posts
  • Platform: Twitter
  • Medium: Text
  • Reference: Alvarez-Carmona, M., Guzman-Falcon, E., Montes-y-Gomez, M., Escalante, H., Villasenor-Pineda, L., Reyes-Meza, V. and Rico-Sulayes, A., 2018. Overview of MEX-A3T at IberEval 2018: Authorship and aggressiveness analysis in Mexican Spanish tweets. In: Proceedings of the Third Workshop on Evaluation of Human Language Technologies for Iberian Languages (IberEval 2018).

Overview of the Task on Automatic Misogyny Identification at IberEval 2018 (Spanish)

  • Link to publication: http://ceur-ws.org/Vol-2150/overview-AMI.pdf
  • Link to data: https://amiibereval2018.wordpress.com/important-dates/data/
  • Task description: Binary (Misogyny, Not), 5 categories (Stereotype, Dominance, Derailing, Sexual harassment, Discredit), Target of misogyny (Active or Passive)
  • Details of task: Sexism
  • Size of dataset: 4,138
  • Percentage abusive: 0.5
  • Language: Spanish
  • Level of annotation: Posts
  • Platform: Twitter
  • Medium: Text
  • Reference: Fersini, E., Rosso, P. and Anzovino, M., 2018. Overview of the Task on Automatic Misogyny Identification at IberEval 2018. In: Proceedings of the Third Workshop on Evaluation of Human Language Technologies for Iberian Languages (IberEval 2018).

hatEval, SemEval-2019 Task 5: Multilingual Detection of Hate Speech Against Immigrants and Women in Twitter (Spanish)

  • Link to publication: https://www.aclweb.org/anthology/S19-2007
  • Link to data: https://competitions.codalab.org/competitions/19935
  • Task description: Branching structure of tasks: Binary (Hate, Not), Within Hate (Group, Individual), Within Hate (Aggressive, Not)
  • Details of task: Group-directed + Person-directed
  • Size of dataset: 6,600
  • Percentage abusive: 0.4
  • Language: Spanish
  • Level of annotation: Posts
  • Platform: Twitter
  • Medium: Text
  • Reference: Basile, V., Bosco, C., Fersini, E., Nozza, D., Patti, V., Pardo, F., Rosso, P. and Sanguinetti, M., 2019. SemEval-2019 Task 5: Multilingual Detection of Hate Speech Against Immigrants and Women in Twitter. In: Proceedings of the 13th International Workshop on Semantic Evaluation. Minneapolis, Minnesota: Association for Computational Linguistics, pp.54-63.

Turkish

A Corpus of Turkish Offensive Language on Social Media

  • Link to publication: https://coltekin.github.io/offensive-turkish/troff.pdf
  • Link to data: https://sites.google.com/site/offensevalsharedtask/home
  • Task description: Branching structure of tasks: Binary (Hate, Not), Within Hate (Group, Individual), Within Hate (Aggressive, Not)
  • Details of task: Group-directed + Person-directed
  • Size of dataset: 36,232
  • Percentage abusive: 0.19
  • Language: Turkish
  • Level of annotation: Posts
  • Platform: Twitter
  • Medium: Text
  • Reference: Çöltekin, C., 2020. A Corpus of Turkish Offensive Language on Social Media. In: Proceedings of the 12th International Conference on Language Resources and Evaluation.

Lists of abusive keywords

  1. Hatebase

    • "Researchers are encouraged to take advantage of Hatebase's vocabulary dataset, which is a valuable lexicon for searching other data repositories such as public forums, as well as Hatebase's sightings dataset, which is useful for trending analysis"
    • Data link: hatebase.org/academia
  2. Hurtlex

  3. Gorrell et al.

  4. Wiegand et al.

  5. Chandrasekharan et al.


This page is http://hatespeechdata.com/.