This is the public repository for my course project in Artificial Intelligence: Cognitive Systems.
Research on stereotypes propagated by LLMs at the interface between text and vision appears to be scarce, given the recent advances in the field of multimodal LLMs (Ruggeri & Nozza, 2023). This project aims to elicit stereotypical biases in multimodal LLMs using the VLStereoSet by Zhou et al. (2022), which consists of stereotypical and anti-stereotypical images along with candidate captions. I transform the dataset into an image-caption matching task (a sketch of this transformation is given below) and investigate how often a stereotypical caption is chosen for an anti-stereotypical image.
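The following is a minimal sketch of what that transformation could look like. The field names (`image_path`, `image_label`, and the three candidate captions) are assumptions about the data format rather than the official VLStereoSet schema, and `score_captions`-style model calls are left abstract.

```python
# Sketch: turning a VLStereoSet-style record into an image-caption matching
# instance. Field names are assumed, not the official schema.
from dataclasses import dataclass

@dataclass
class MatchingInstance:
    image_path: str        # path to the anti-stereotypical image
    captions: list[str]    # [stereotype, anti-stereotype, unrelated]
    labels: list[str]      # parallel labels for the captions

def to_matching_instance(record: dict) -> MatchingInstance:
    """Keep only anti-stereotypical images and offer all three candidate captions."""
    assert record["image_label"] == "anti-stereotype"
    captions = [record["stereotype"], record["anti_stereotype"], record["unrelated"]]
    labels = ["stereotype", "anti-stereotype", "unrelated"]
    return MatchingInstance(record["image_path"], captions, labels)

def stereotype_chosen(scores: list[float], labels: list[str]) -> bool:
    """True if the model's highest-scoring caption is the stereotypical one."""
    best = max(range(len(scores)), key=lambda i: scores[i])
    return labels[best] == "stereotype"
```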
Datasets such as the one used in my study are generated with the help of templates. Using templates for bias elicitation does not account for the richness of how such content can be phrased (Dev et al., 2022). To address this shortcoming, I intend to investigate different methods of paraphrasing the captions and to examine whether the elicited phenomena are robust under such perturbations (see the sketch below).
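A rough sketch of the robustness check under caption paraphrasing is given below. The paraphrasing functions and the `score_captions` callable are hypothetical placeholders for whichever paraphrasing methods and vision-language model are ultimately used.

```python
# Sketch: checking whether the model's caption choice is stable under
# paraphrases of the captions. Both `score_captions` (the model's
# image-caption scoring call) and the entries of `paraphrasers` are
# placeholders to be filled in with concrete methods.
from typing import Callable

def is_robust(image_path: str,
              captions: list[str],
              labels: list[str],
              score_captions: Callable[[str, list[str]], list[float]],
              paraphrasers: list[Callable[[str], str]]) -> bool:
    """Return True if the chosen caption label is identical across all paraphrases."""
    def chosen_label(caps: list[str]) -> str:
        scores = score_captions(image_path, caps)
        return labels[max(range(len(scores)), key=lambda i: scores[i])]

    original_choice = chosen_label(captions)
    for paraphrase in paraphrasers:
        perturbed = [paraphrase(c) for c in captions]
        if chosen_label(perturbed) != original_choice:
            return False
    return True
```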
On Measures of Biases and Harms in NLP (Dev et al., Findings 2022)
A Multi-dimensional study on Bias in Vision-Language models (Ruggeri & Nozza, Findings 2023)
VLStereoSet: A Study of Stereotypical Bias in Pre-trained Vision-Language Models (Zhou et al., AACL-IJCNLP 2022)