MartaMarchiori's Stars
daviddao/awful-ai
😈Awful AI is a curated list to track current scary usages of AI - hoping to raise awareness
parrt/dtreeviz
A Python library for decision tree visualization and model interpretation.
stanford-crfm/helm
Holistic Evaluation of Language Models (HELM), a framework to increase the transparency of language models (https://arxiv.org/abs/2211.09110). This framework is also used to evaluate text-to-image models in HEIM (https://arxiv.org/abs/2311.04287) and vision-language models in VHELM (https://arxiv.org/abs/2410.07112).
zhijing-jin/nlp-phd-global-equality
A repo of open resources & information to help people succeed in a PhD in CS & a career in AI / NLP
interpretml/interpret-community
Interpret Community extends the Interpret repository with additional interpretability techniques and utility functions to handle real-world datasets and workflows.
microsoft/adaptive-testing
Find and fix bugs in natural language machine learning models using adaptive testing.
i-gallegos/Fair-LLM-Benchmark
hate-alert/DE-LIMIT
DeEpLearning models for MultIlingual haTespeech (DELIMIT): Benchmarking multilingual models across 9 languages and 16 datasets.
tongshuangwu/polyjuice
ModelOriented/fairmodels
Flexible tool for bias detection, visualization, and mitigation
paul-rottger/hatecheck-data
Röttger et al. (ACL 2021): "HateCheck: Functional Tests for Hate Speech Detection Models" - Data
slanglab/twitteraae
Code for Blodgett et al. 2016, Demographic dialectal variation in social media
marti5ini/GENCDA
Boosting Synthetic Data Generation with Effective Nonlinear Causal Discovery
msetzu/glocalx
Generating global explanations from local ones
dhfbk/twitter-abusive-context-dataset
sakshiudeshi/Astraea
Code for "Astraea: Grammar-based Fairness Testing"
fani-lab/Adila
Fairness-Aware Team Formation
spaidataiga/RedditPoliticalBias
The data and code used in the production of the paper: Quantifying Gender Biases Towards Politicians on Reddit
arb7/adv-deb-multi
Adversarial debiasing implementation compatible with generic pandas DataFrames and multiclass classification.
dlab-projects/hate_measure
Measuring hate speech
dlab-projects/hate_target
Paper repository for "Targeted Identity Group Prediction in Hate Speech Corpora" by Sachdeva et al.
gesiscss/socialCAD
rmassidda/bisturi
Framework for neural models inspection.
aequa-tech/debunker-assistant
AequaTech/DebunkerAssistant
kinit-sk/bias-methodology
ART-Group-it/HateSpeechKermit
KERM-HATE pushes forward the research on how syntactic information can be used to de-bias hate speech recognizers and so contribute to solving problems of prejudice.
mawic/graph-based-method-annotator-bias
msetzu/fairbelief
Interpreting masked language models' beliefs and evaluating their fairness
sblbl/treemob_viz
Visualisation of the mobility tree of a user