Masahiro Kaneko, Danushka Bollegala
Code and debiased word embeddings for the paper "Gender-preserving Debiasing for Pre-trained Word Embeddings" (ACL 2019). If you use any part of this work, please include the following citation:
@inproceedings{Kaneko:ACL:2019,
title={Gender-preserving Debiasing for Pre-trained Word Embeddings},
author={Masahiro Kaneko and Danushka Bollegala},
booktitle={Proc. of the 57th Annual Meeting of the Association for Computational Linguistics (ACL)},
year={2019}
}
- python==3.7.2
- gensim==3.7.1
- numpy==1.16.2
- pandas==0.24.2
- torch==1.1.0
First, download the necessary data listed below using the following command.
- Trained glove and gn-glove
- SemBias dataset
- Female and male word lists
./download.sh
Run debiasing on the word embeddings and evaluate the result. The debiased word embeddings are stored in debiased_embeddings in both bin and txt formats. Pass the name of the word embeddings to debias, either glove or gn, as the first argument. For example,
./run.sh gn
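The txt output can be inspected with plain Python. The following is a minimal sketch, assuming the txt files use the common layout of one word per line followed by space-separated floats, optionally preceded by a word2vec-style header line. That layout, the function name load_text_embeddings, and the in-memory sample are assumptions for illustration and are not taken from this repository.

```python
import io


def load_text_embeddings(handle):
    """Parse text-format embeddings into a dict mapping word -> list of floats.

    Assumed layout (hypothetical, check your own files): "word v1 v2 ... vn"
    per line, with an optional first line "<vocab_size> <dim>".
    """
    vectors = {}
    for line_no, line in enumerate(handle):
        parts = line.rstrip().split(" ")
        # Skip an optional word2vec-style header made of two integers.
        if line_no == 0 and len(parts) == 2 and all(p.isdigit() for p in parts):
            continue
        # Ignore blank or malformed lines.
        if len(parts) < 2:
            continue
        vectors[parts[0]] = [float(x) for x in parts[1:]]
    return vectors


# Tiny in-memory stand-in for a real file; the values are made up.
sample = io.StringIO("2 3\nhe 0.1 0.2 0.3\nshe 0.1 0.2 -0.3\n")
embeddings = load_text_embeddings(sample)
print(len(embeddings), len(embeddings["he"]))
```

To read a real file, replace the StringIO object with an open file handle, for example open(path, encoding="utf-8"), where path points at a txt file inside debiased_embeddings.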
You can also evaluate your own word embeddings on SemBias without training:
python eval_word_embeddings.py -i path/to/your/embeddings
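For intuition about what a SemBias-style evaluation measures, here is a minimal sketch. It assumes the usual formulation in which each instance offers several word pairs and the pair whose difference vector is most similar (by cosine) to the he - she direction is selected. The function names and the toy vectors are hypothetical and made up for illustration; this is not the code in eval_word_embeddings.py, and the numbers are not results.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def diff(u, v):
    """Element-wise difference u - v."""
    return [a - b for a, b in zip(u, v)]


def pick_most_gendered_pair(embeddings, pairs, base=("he", "she")):
    """Return the index of the pair (a, b) whose a - b is closest to he - she."""
    gender_direction = diff(embeddings[base[0]], embeddings[base[1]])
    scores = [
        cosine(gender_direction, diff(embeddings[a], embeddings[b]))
        for a, b in pairs
    ]
    return max(range(len(pairs)), key=scores.__getitem__)


# Toy 2-d vectors, invented for this example only.
toy = {
    "he": [1.0, 0.0], "she": [-1.0, 0.0],
    "king": [0.9, 0.5], "queen": [-0.9, 0.5],
    "doctor": [0.2, 0.8], "nurse": [-0.1, 0.7],
    "dog": [0.0, 1.0], "cat": [0.0, 0.9],
}
pairs = [("king", "queen"), ("doctor", "nurse"), ("dog", "cat")]
best = pick_most_gendered_pair(toy, pairs)
print(pairs[best])
```

Under this formulation, embeddings that preserve gender information while removing stereotypes should tend to select the gender-definition pair rather than the stereotype pair.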
You can also directly download our debiased embeddings, GP (glove) and GP (gn-glove).
See the LICENSE file.