Code and data associated with the paper "Building Bridges: A Dataset for Evaluating Gender-Fair Machine Translation into German."

Getting Started

  1. Create a new Python environment (Python >= 3.9 is recommended).
  2. Install the minimum requirements listed in requirements.txt.

Data Release

Data is released under the CC BY 4.0 license.

  • /data contains the seed nouns for our Gender-Fair Dictionary and the Multi-Sentence Multi-Domain mentions in Wikipedia and Europarl.
  • /results contains results for our evaluation of machine-translated passages.
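
To get a quick overview of what the two directories contain after cloning, a small inventory script can help. The following is a minimal sketch and not part of the released code: the helper name `summarize_dir` is hypothetical, and the script makes no assumption about file formats. It only counts files by extension under `data/` and `results/`, assuming it is run from the repository root.

```python
from collections import Counter
from pathlib import Path


def summarize_dir(root):
    """Count the files under `root` (recursively), grouped by file extension.

    Hypothetical helper for illustration; it is not part of the repository.
    """
    root = Path(root)
    counts = Counter()
    for path in root.rglob("*"):
        if path.is_file():
            # Files without an extension are grouped under "(none)".
            counts[path.suffix.lower() or "(none)"] += 1
    return counts


if __name__ == "__main__":
    # Directory names follow the list above, relative to the repository root.
    for name in ("data", "results"):
        if not Path(name).is_dir():
            print(f"{name}/: not found (run this from the repository root)")
            continue
        print(f"{name}/")
        for suffix, n in sorted(summarize_dir(name).items()):
            print(f"  {suffix}: {n} file(s)")
```

The script only reads directory listings, so it is safe to run before installing any of the requirements.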

Experiments

We release several Bash and Python scripts to replicate our experiments. See /bash and /src for the relevant details.

Script names and parameters are mostly self-explanatory, but do not hesitate to open an issue if you need more details (issues are preferred over email).