♂️ Gender discrimination in Natural Language Processing ♀️

This repository contains a project developed for the Ethics in Artificial Intelligence course of the Master's degree in Artificial Intelligence at the University of Bologna.

Description

The aim of this project is to develop a proof of concept showing how to address gender discrimination in NLP. Two approaches have been investigated:

Hard-Debiasing, applied to pre-trained Italian word embeddings
GN-GloVe, which reduces the bias during the training of word embeddings

For a deeper understanding of the problem, take a look at the presentation of the project.
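
To make the first approach more concrete, here is a minimal sketch of the "neutralize" step of Hard-Debiasing. It is an illustration under stated assumptions, not the code in this repository: the function names, the toy 3-dimensional vectors, and the word pair are all made up, and the gender direction is approximated by the mean of difference vectors, whereas the original method derives it with PCA over several definitional pairs. The "equalize" step is omitted. Plain Python lists are used to keep the sketch dependency-free.

```python
import math


def normalize(vec):
    """Scale a vector to unit length."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def gender_direction(emb, pairs):
    """Approximate the gender direction as the normalized mean of
    (female - male) difference vectors. Simplification: the original
    method uses PCA over the difference vectors instead of a mean."""
    dim = len(next(iter(emb.values())))
    mean_diff = [0.0] * dim
    for male, female in pairs:
        for k in range(dim):
            mean_diff[k] += (emb[female][k] - emb[male][k]) / len(pairs)
    return normalize(mean_diff)


def neutralize(vec, g):
    """Remove the component of vec along the unit direction g,
    then re-normalize the result."""
    proj = dot(vec, g)
    return normalize([x - proj * gk for x, gk in zip(vec, g)])


# Toy embeddings (hypothetical values, for illustration only).
emb = {
    "uomo": [1.0, 0.2, 0.0],
    "donna": [-1.0, 0.2, 0.0],
    "ingegnere": [0.6, 0.5, 0.3],
}

g = gender_direction(emb, [("uomo", "donna")])
debiased = neutralize(emb["ingegnere"], g)

bias_before = dot(emb["ingegnere"], g)  # non-zero: the word leans toward one gender
bias_after = dot(debiased, g)           # approximately zero after neutralizing
```

After neutralizing, the vector for a gender-neutral word has no component along the estimated gender direction, which is the property Hard-Debiasing enforces for such words.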

Repository structure

.
├── data                             # Contains the files of words used for the experiments
├── debiaswe                         # Contains debiasing functions
│   ├── co_occurrence.py             # Functions to compute the co-occurrence matrix for GN-GloVe
│   ├── data.py                      # Functions to load data files
│   ├── debias_glove.py              # Actual implementation of GN-GloVe debiasing
│   ├── metrics.py                   # Functions to compute metrics for the experiments
│   └── we.py                        # Auxiliary functions to load and manage word embeddings
├── embeddings                       # Contains the word embeddings file for the hard-debiasing approach
├── scripts                          # Contains the scripts to convert the original Twitter word embeddings to a TSV file and filter them
├── gn-glove_we_visualization.ipynb  # Visualization of the word embeddings generated by GN-GloVe
├── hard_debias_italian_we.ipynb     # Visualization of the word embeddings generated by Hard-Debiasing
├── presentation.pdf                 # Slides about the project
├── LICENSE
└── README.md
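
GloVe-style models such as GN-GloVe are trained on a word co-occurrence matrix. The following is a minimal sketch of how such counts can be gathered, and it is not the implementation in `co_occurrence.py`: the function name, the window size, and the toy corpus are assumptions made for illustration. It follows the standard GloVe convention of weighting each co-occurrence by the inverse of the distance between the two words.

```python
from collections import defaultdict


def co_occurrence_counts(sentences, window=2):
    """Count symmetric, distance-weighted co-occurrences.

    Each pair of words at distance d <= window contributes 1/d,
    as in the standard GloVe setup. Returns a dict keyed by
    (word, context_word) tuples, i.e. a sparse matrix.
    """
    counts = defaultdict(float)
    for tokens in sentences:
        for i, word in enumerate(tokens):
            # Look only at the left context and add both directions,
            # so every pair is visited exactly once.
            for j in range(max(0, i - window), i):
                weight = 1.0 / (i - j)
                counts[(word, tokens[j])] += weight
                counts[(tokens[j], word)] += weight
    return dict(counts)


# Toy corpus (hypothetical, already tokenized).
corpus = [
    ["lei", "è", "un", "medico"],
    ["lui", "è", "un", "medico"],
]

counts = co_occurrence_counts(corpus, window=2)
```

With a window of 2, "lei" and "medico" are three positions apart and therefore never co-occur in this toy corpus, while adjacent words such as "un" and "medico" receive the full weight of 1 per sentence.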

Results

The results of both approaches are presented below:

Hard-Debiasing:
GN-GloVe:

Versioning

We use Git for versioning.

Group members

Name	Surname	Email	Username
Davide	Angelani	`davide.angelani@studio.unibo.it`	qnozo
Eric	Rossetto	`eric.rossetto@studio.unibo.it`	Erhtric
Giuseppe	Murro	`giuseppe.murro@studio.unibo.it`	gmurro
Salvatore	Pisciotta	`salvatore.pisciotta@studio.unibo.it`	SalvoPisciotta
Xiaowei	Wen	`xiaowei.wen@studio.unibo.it`	WenXiaowei

License

This project is licensed under the MIT License; see the LICENSE file for details.
