VERB: Visualizing and Interpreting Bias Mitigation Techniques for Word Representations. Archit Rathore, Sunipa Dev, Jeff M. Phillips, Vivek Srikumar, Yan Zheng, Chin-Chia Michael Yeh, Junpeng Wang, Wei Zhang, Bei Wang. arXiv preprint arXiv:2104.02797, 2021.
Python 3.6+, pip
The web interface is supported only for Chrome and Firefox.
The following libraries are also required to run the code:
flask
scikit-learn
scipy
numpy
pandas
tqdm
To install these libraries using pip, use the following command in the terminal:
pip3 install flask scikit-learn scipy numpy pandas tqdm
To install these packages only for the current user (or if you do not have write access to the Python installation on the machine):
pip3 install flask scikit-learn scipy numpy pandas tqdm --user
Alternatively, you can use conda to install the packages:
conda install flask scikit-learn scipy numpy pandas tqdm
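Before starting the application, it can help to confirm that the interpreter and the packages listed above are actually available in the active environment. The following is a minimal sketch and not part of the repository; the names `REQUIRED_MODULES`, `find_missing`, and `python_is_supported` are hypothetical. Note that scikit-learn is imported as `sklearn`, so the check uses import names rather than pip package names.

```python
import importlib.util
import sys

# Import names, not pip names: scikit-learn is imported as "sklearn".
REQUIRED_MODULES = ["flask", "sklearn", "scipy", "numpy", "pandas", "tqdm"]


def find_missing(modules):
    """Return the module names that cannot be located in this environment."""
    return [name for name in modules if importlib.util.find_spec(name) is None]


def python_is_supported(version_info=sys.version_info):
    """Return True if the interpreter is Python 3.6 or newer."""
    return tuple(version_info[:2]) >= (3, 6)


if __name__ == "__main__":
    if not python_is_supported():
        print("Python 3.6+ is required; found", sys.version.split()[0])
    missing = find_missing(REQUIRED_MODULES)
    if missing:
        print("Missing modules:", ", ".join(missing))
    else:
        print("All required modules were found.")
```

Running this script with the same interpreter that will run Flask (for example `python3 check_env.py`, where the file name is arbitrary) avoids checking one environment and launching another.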
Clone this repository to your local machine and make sure the requirements are installed. Then navigate to the base directory of the cloned repository and start the application by running the following commands in the terminal:
git clone https://github.com/tdavislab/verb.git
cd verb
python3 -m flask run
Once the command above is running, open your web browser (Chrome and Firefox supported) and navigate to: http://127.0.0.1:5000/ (or equivalently to: http://localhost:5000)
The project defaults to using 50-dimensional GloVe embeddings trained on the Wikipedia 2014 + Gigaword 5 corpus.
We also provide preprocessed data for the GloVe embeddings trained on the Common Crawl corpus. To load this data instead, download the preprocessed file here, copy it to the data folder, and rename it to embedding.pkl.
You can also use your own dataset by changing the datapath variable in the __main__ block of vectors.py to point to your own trained vectors in the GloVe format.
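For reference, the GloVe text format stores one word per line, followed by its vector components separated by single spaces. The sketch below shows one way to parse such a file and sanity-check it before pointing datapath at it. It is an illustration only and is not the code in vectors.py; the function names `parse_glove_lines` and `load_glove_text` are hypothetical, and it assumes that tokens contain no spaces and that every vector has the same dimension.

```python
import numpy as np


def parse_glove_lines(lines):
    """Parse GloVe-format lines ("word v1 v2 ... vd") into {word: vector}."""
    vectors = {}
    dim = None
    for line_number, line in enumerate(lines, start=1):
        parts = line.rstrip().split(" ")
        if len(parts) < 2:
            continue  # skip blank or malformed lines
        word, values = parts[0], parts[1:]
        if dim is None:
            dim = len(values)  # the first vector fixes the expected dimension
        elif len(values) != dim:
            raise ValueError(
                f"line {line_number}: expected {dim} values, got {len(values)}"
            )
        vectors[word] = np.asarray(values, dtype=np.float32)
    return vectors


def load_glove_text(path):
    """Read a GloVe-format text file from disk (hypothetical helper)."""
    with open(path, encoding="utf-8") as handle:
        return parse_glove_lines(handle)
```

For example, `parse_glove_lines(["king 0.1 0.2 0.3"])` returns a dictionary with the single key `"king"` mapped to a three-component float32 array. A file that raises `ValueError` here is unlikely to be in the format described above.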
If you get the following error, it might indicate that you are not in the correct directory. Open a terminal in the base directory of the cloned repository.
Error: Could not locate a Flask application. You did not provide the
"FLASK_APP" environment variable, and a "wsgi.py" or "app.py" module
was not found in the current directory
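The error message itself names the files Flask looks for, so a quick way to confirm that a terminal is in the right place is to check for them. This is a minimal sketch with a hypothetical helper name, `has_flask_entry_point`; it only tests for the two file names quoted in the message and does not consider the FLASK_APP environment variable.

```python
from pathlib import Path


def has_flask_entry_point(directory="."):
    """Return True if the directory contains app.py or wsgi.py."""
    base = Path(directory)
    return any((base / name).is_file() for name in ("app.py", "wsgi.py"))


if __name__ == "__main__":
    here = Path.cwd()
    if has_flask_entry_point(here):
        print(f"Found a Flask entry point in {here}")
    else:
        print(f"No app.py or wsgi.py in {here}; change to the repository base directory")
```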
The tool is written for Python 3.6+ and may or may not work with earlier versions of Python 3.x. Python 2.x is not supported.
Make sure you install the requirements before running the application.
We have tested the tool on Firefox and Chrome. There are known issues with point labels not displaying correctly in Safari.
This error means that one of the words (denoted by 'xxxx' above) in your provided word set was not found in the vocabulary of the word vector embedding. Check the spelling, or use another common word instead.
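When preparing a word set, it can be convenient to check it against the embedding vocabulary ahead of time. The sketch below assumes the vocabulary is available as any collection of words, for example the keys of a word-to-vector dictionary; how the tool exposes its vocabulary internally is not described here, and `find_missing_words` is a hypothetical helper.

```python
def find_missing_words(words, vocabulary):
    """Return the words, in input order, that are absent from the vocabulary."""
    vocab = set(vocabulary)  # set lookup keeps the check fast for large vocabularies
    return [word for word in words if word not in vocab]


if __name__ == "__main__":
    # Toy vocabulary for illustration only; a real embedding has far more entries.
    toy_vocabulary = {"doctor", "nurse", "engineer"}
    word_set = ["doctor", "nurze", "engineer"]
    print(find_missing_words(word_set, toy_vocabulary))  # prints ['nurze']
```

The comparison is exact and case-sensitive, so a word that differs from the vocabulary entry only in capitalization is reported as missing.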