Evaluating Gender Bias in Machine Translation

This repository is an extension of the work presented in Evaluating Gender Bias in Machine Translation by Gabriel Stanovsky, Noah A. Smith, and Luke Zettlemoyer (ACL 2019), and Gender Coreference and Bias Evaluation at WMT 2020 by Tom Kocmi, Tomasz Limisiewicz, and Gabriel Stanovsky (WMT 2020).

Our project builds upon the foundational research by addressing additional biases and incorporating support for Portuguese, reflecting our commitment to enhancing fairness in machine translation across diverse languages.

Requirements

  • fast_align: install and point an environment variable called FAST_ALIGN_BASE to its root folder (the one containing the build folder).

Installation

  1. Create a Conda environment:

    conda create -n mypython3 python=3.8
    source activate mypython3
    conda install anaconda
  2. Clone the mt_gender and fast_align repositories:

    git clone https://github.com/gabrielStanovsky/mt_gender.git
    git clone https://github.com/clab/fast_align.git
    conda install cmake
  3. Compile fast_align:

    cd fast_align
    mkdir -p build
    cd build
    cmake ..
    make
  4. Verify that fast_align was built correctly; running it with no arguments should print a usage message:

    cd ../../ && fast_align/build/fast_align
  5. Set the environment variable FAST_ALIGN_BASE to the root folder of fast_align (a combined sanity check for steps 4 and 5 follows this list):

    export FAST_ALIGN_BASE=/path/to/fast_align
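
The snippet below is a minimal sanity check for steps 4 and 5. It is not part of the upstream instructions; it only assumes that FAST_ALIGN_BASE has been exported as shown above.

    # Confirm that FAST_ALIGN_BASE points at a compiled fast_align build.
    if [ -x "$FAST_ALIGN_BASE/build/fast_align" ]; then
        echo "fast_align binary found at $FAST_ALIGN_BASE/build/fast_align"
    else
        echo "FAST_ALIGN_BASE is unset or fast_align has not been built" >&2
    fi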

Project Changes

This updated version of the project includes the following enhancements:

  • Error Correction: Errors identified in the original project have been corrected, improving the stability and accuracy of the evaluations.
  • Language Support: Added support for Portuguese, enabling the assessment of gender bias in Portuguese translations and broadening the project's applicability.
  • Project unbIAs: These changes were made as part of the unbIAs project, an initiative aimed at reducing bias in artificial intelligence systems, underscoring our commitment to fairness in AI technologies.

How to Run

After completing the installation steps:

  1. Ensure all dependencies are installed by running:

    pip install -r requirements.txt
  2. Configure the necessary environment variables as described in the Installation section.

  3. For the overall gender accuracy numbers, run:

    cd /content/mt_gender/src && ../scripts/evaluate_all_languages.sh ../data/aggregates/en.txt ../../winomtout &> ../../winomtout/baseline
  4. For accuracy on the pro-stereotypical subset (en_pro.txt), run:

    cd /content/mt_gender/src && ../scripts/evaluate_all_languages.sh ../data/aggregates/en_pro.txt ../../winomtout &> ../../winomtout/pro
  5. For accuracy on the anti-stereotypical subset (en_anti.txt), run the command below; a sketch that chains steps 3-5 into a single script follows this list:

    cd /content/mt_gender/src && ../scripts/evaluate_all_languages.sh ../data/aggregates/en_anti.txt ../../winomtout &> ../../winomtout/anti
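
For convenience, here is a minimal sketch that chains steps 3-5 into a single script. It is not part of the original instructions: it assumes the same layout as the commands above (mt_gender cloned under /content and results written to a sibling winomtout directory, which it creates if missing); adjust the paths to match your setup.

    #!/bin/bash
    # Run the baseline, pro-stereotypical, and anti-stereotypical evaluations in sequence.
    set -e
    mkdir -p /content/winomtout   # output directory assumed by the commands above
    cd /content/mt_gender/src
    ../scripts/evaluate_all_languages.sh ../data/aggregates/en.txt ../../winomtout &> ../../winomtout/baseline
    ../scripts/evaluate_all_languages.sh ../data/aggregates/en_pro.txt ../../winomtout &> ../../winomtout/pro
    ../scripts/evaluate_all_languages.sh ../data/aggregates/en_anti.txt ../../winomtout &> ../../winomtout/anti

The captured output of each run ends up in winomtout/baseline, winomtout/pro, and winomtout/anti, respectively.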

For detailed step-by-step instructions, refer to the provided notebook (WinoMT_Scores_add_portuguese.ipynb), which includes specific configurations and examples.

License

This project is licensed under the MIT License.