/Harmful-LGBTQIA

Detecting Harmful Online Conversational Content towards LGBTQIA+ Individuals

Warning: Due to the overall purpose of the study, this repository and paper contain examples of stereotypes, profanity, vulgarity and other harmful language in figures and tables that may be triggering or disturbing to LGBTQIA+ individuals, activists and allies, and may be distressing for some readers.

Citation

Here is a BibTeX citation:

@inproceedings{dacon2022detecting,
title={Detecting Harmful Online Conversational Content towards LGBTQIA+ Individuals},
author={Jamell Dacon and Harry Shomer and Shaylynn Crum-Dacon and Jiliang Tang},
booktitle={Queer in AI Workshop at NAACL},
year={2022}
}

This paper was accepted to the Queer in AI Workshop, held as part of the 2022 Conference of the North American Chapter of the Association for Computational Linguistics (NAACL 2022).

Requirements:

scikit-learn
numpy
pandas
pytorch
transformers
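
The following is a minimal sketch, not part of the original repository, for checking that the packages listed above are importable before opening the notebooks. It assumes the usual import names (`sklearn` for scikit-learn and `torch` for pytorch); the helper name `missing_requirements` is hypothetical and used only for illustration.

```python
from importlib.util import find_spec

# Maps each requirement listed above to the module name it is imported under.
# The import names for scikit-learn and pytorch differ from the package names.
REQUIREMENTS = {
    "scikit-learn": "sklearn",
    "numpy": "numpy",
    "pandas": "pandas",
    "pytorch": "torch",
    "transformers": "transformers",
}


def missing_requirements(requirements=REQUIREMENTS):
    """Return the listed packages whose import module cannot be found."""
    return [
        package
        for package, module in requirements.items()
        if find_spec(module) is None
    ]


if __name__ == "__main__":
    missing = missing_requirements()
    if missing:
        print("Missing packages:", ", ".join(missing))
    else:
        print("All listed requirements are importable.")
```

The check only looks the modules up and does not import them, so it runs quickly even when pytorch is installed. No versions are pinned here because the repository does not state any.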

Creator(s)

Names: Jamell Dacon*, Harry Shomer, Shaylynn Crum-Dacon, and Jiliang Tang
Corresponding author (*): daconjam at msu dot edu (daconjam@msu.edu)
Email: {shomerha, crumshay, tangjili}@msu.edu

If you publish work based on material and/or information obtained from this repository, please note in your acknowledgements the assistance you received from using it, and cite our paper. Feel free to star and/or fork the repository so that academics (i.e., university researchers and faculty) and AI and NLP practitioners working on hate speech and abusive or offensive language may have quicker access to the dataset (or code). We hope this promotes inclusivity of the LGBTQIA+ community, with the objective of creating a safe and inclusive place that welcomes, supports, and values all LGBTQIA+ individuals, activists, and allies, both online and offline.

Our work solely calls for both AI and NLP practitioners to prioritize collecting a set of relevant labeled training data with several examples to detect and stop harmful online content by interpreting visual content in context. All authors declare that all data used is publicly available. We do not intend to increase feelings of marginalization, but aim to highlight the need for impactful speech and language technologies that readily identify and quickly remove harmful online content, to minimize further stigmatization of an already marginalized community.

Additional Information: Correspondence

Personal Page: Homepage (Portfolio)