Awesome Cultural NLP
A curated list of awesome cultural NLP resources, inspired by awesome-computer-vision.
Title
Conference / Journal
Paper
Code
Remarks
Culturally Aware and Adapted NLP: A Taxonomy and a Survey of the State of the Art
Arxiv 2024
2406.03930
Towards Measuring and Modeling “Culture” in LLMs: A Survey
Arxiv 2024
2403.15412
Github
Cool paper!
Challenges and Strategies in Cross-Cultural NLP
ACL 2022
2203.10020
Title
Conference / Journal
Paper
Code
Remarks
CultureBank: An Online Community-Driven Knowledge Base Towards Culturally Aware Language Technologies
Arxiv 2024
2404.15238
NORMAD: A Benchmark for Measuring the Cultural Adaptability of Large Language Models
Arxiv 2024
2404.12464
Data
Data
An image speaks a thousand words, but can everyone listen? On image transcreation for cultural relevance
Arxiv 2024
2404.01247
Code and Data
Data + Application
No Culture Left Behind: Massively Multi-Cultural Knowledge Acquisition & LM Benchmarking on 1000+ Sub-Country Regions and 2000+ Ethnolinguistic Groups
Arxiv 2024
2402.09369v1
Data
The PRISM Alignment Project: What Participatory, Representative and Individualised Human Feedback Reveals About the Subjective and Multicultural Alignment of Large Language Models
Arxiv 2024 (under review)
2404.16019
Repository
Code and Data
Exploring Cross-Cultural Differences in English Hate Speech Annotations: From Dataset Construction to Analysis
NAACL 2024
2308.16705
Data+Code
CLIcK: A Benchmark Dataset of Cultural and Linguistic Intelligence
LREC-COLING 2024
2403.06412
Data
Bridging Cultural Nuances in Dialogue Agents through Cultural Value Surveys
EACL Findings 2024
2401.10352
Dataset
Culturally Aware Natural Language Inference
EMNLP 2023 (Findings)
2023.findings-emnlp.509
Data
Global Voices, Local Biases: Socio-Cultural Prejudices across Languages
EMNLP 2023
2310.17586
Data
Data+Analysis
NORMSAGE: Multi-Lingual Multi-Cultural Norm Discovery from Conversations On-the-Fly
EMNLP 2023
2210.08604
Code and Data
NormsKB
GeoDE: a Geographically Diverse Evaluation Dataset for Object Recognition
NeurIPS 2023
2301.02560
Code and Data
SeeGULL: A Stereotype Benchmark with Broad Geo-Cultural Coverage Leveraging Generative Models
ACL 2023
2305.11840
Code
FORK: A Bite-Sized Test Set for Probing Culinary Cultural Biases in Commonsense Reasoning Models
ACL Findings 2023
2023.findings-acl.631
Dataset
Multi-lingual and Multi-cultural Figurative Language Understanding
ACL Findings 2023
2305.16171
Code
EnCBP: A New Benchmark Dataset for Finer-Grained Cultural Background Prediction in English
ACL Findings 2022
2203.14498
Re-contextualizing Fairness in NLP: The Case of India
AACL 2022
2209.12226
Data
Data+Analysis
Visually Grounded Reasoning across Languages and Cultures
EMNLP 2021
2109.13238
Website
EMNLP 2021 Best Paper
Would you Rather? A New Benchmark for Learning Machine Alignment with Cultural Values and Social Preferences
ACL 2020
2020.acl-main.477
Title
Conference / Journal
Paper
Code
Remarks
CIC: A framework for Culturally-aware Image Captioning
IJCAI 2024
2402.05374
Webpage
Title
Conference / Journal
Paper
Code
Remarks
GIVL: Improving Geographical Inclusivity of Vision-Language Models With Pre-Training Methods
CVPR 2023
2301.01893
Code (not released yet)
Title
Conference / Journal
Paper
Code
Remarks
Cultural Conditioning or Placebo? On the Effectiveness of Socio-Demographic Prompting
Arxiv 2024
2406.11661
Extrinsic Evaluation of Cultural Competence in Large Language Models
Arxiv 2024
2406.11565
CulturalTeaming: AI-Assisted Interactive Red-Teaming for Challenging LLMs’ (Lack of) Multicultural Knowledge
Arxiv 2024
2404.06664
Having Beer after Prayer? Measuring Cultural Bias in Large Language Models
ACL 2024
2305.14456
Code
Title
Conference / Journal
Paper
Code
Remarks
The Factuality Tax of Diversity-Intervened Text-to-Image Generation: Benchmark and Fact-Augmented Intervention
Arxiv 2024
2407.00377v1
On the Cultural Gap in Text-to-Image Generation
Arxiv 2023
2307.02971
Code
Title
Conference / Journal
Paper
Code
Remarks
From Local Concepts to Universals: Evaluating the Multicultural Understanding of Vision-Language Models
Arxiv 2024
2407.00263
Title
Conference / Journal
Paper
Code
Remarks
ViSAGe: A Global-Scale Analysis of Visual Stereotypes in Text-to-Image Generation
ACL 2024
2401.06310
DIG In: Evaluating Disparities in Image Generations with Indicators for Geographic Diversity
ICLR 2024
2308.06198
Code
Exploiting Cultural Biases via Homoglyphs in Text-to-Image Synthesis
JAIR 2023
2209.08891
Code
Navigating Cultural Chasms: Exploring and Unlocking the Cultural POV of Text-To-Image Models
Arxiv 2023
2310.01929
Code (not released yet)
Inspecting the Geographical Representativeness of Images from Text-to-Image Models
ICCV 2023
2305.11080
Easily Accessible Text-to-Image Generation Amplifies Demographic Stereotypes at Large Scale
FAccT 2023
2211.03759
Multilingual Conceptual Coverage in Text-to-Image Models
ACL 2023
2306.01735
Code
Title
Conference / Journal
Paper
Code
Remarks
Exploring Changes in Nation Perception with Nationality-Assigned Personas in LLMs
Arxiv 2024
2406.13993
CULTURE-GEN: Revealing Global Cultural Perception in Language Models through Natural Language Prompting
Arxiv 2024
2404.10199v1
Code
Knowledge of cultural moral norms in large language models
ACL 2023
2306.01857
Multilingual Language Models are not Multicultural: A Case Study in Emotion
WASSA: ACL 2023
2307.01370
Social Commonsense for Explanation and Cultural Bias Discovery
EACL 2023
2023.eacl-main.271
DOSA: A Dataset of Social Artifacts from Different Indian Geographical Subcultures
LREC-COLING 2024
2403.14651
Code
Title
Conference / Journal
Paper
Code
Remarks
Multilingual Diversity Improves Vision-Language Representations
Arxiv 2024
2405.16915
No Filter: Cultural and Socioeconomic Diversity in Contrastive Vision–Language Models
Arxiv 2024
2405.13777
Computer Vision Datasets and Models Exhibit Cultural and Linguistic Diversity in Perception
Arxiv 2024
2310.14356
Exploring Visual Culture Awareness in GPT-4V: A Comprehensive Probing
Arxiv 2024
2402.06015
‘Person’ == Light-skinned, Western Man, and Sexualization of Women of Color: Stereotypes in Stable Diffusion
EMNLP 2023 Findings
2310.19981
Cross-cultural Variations
Title
Conference / Journal
Paper
Code
Remarks
Cross-Cultural Analysis of Human Values, Morals, and Biases in Folk Tales
EMNLP 2023
2023.emnlp-main.311
Social Commonsense for Explanation and Cultural Bias Discovery
EACL 2023
2023.eacl-main.271
Cross-cultural variation of speech-accompanying gesture: A review
Language and Cognitive Processes: Volume 24, Issue 2, 2009
10.1080/01690960802586188
Title
Conference / Journal
Paper
Code
Remarks
Investigating Cultural Alignment of Large Language Models
Arxiv 2024
2402.13231
Unintended Impacts of LLM Alignment on Global Representation
Arxiv 2024
2402.15018
Assessing Cross-Cultural Alignment between ChatGPT and Human Societies: An Empirical Study
C3NLP: EACL 2023
2303.17466
Analysis
Probing Pre-Trained Language Models for Cross-Cultural Differences in Values
C3NLP: EACL 2023
2203.13722
Analysis
Title
Conference / Journal
Paper
Code
Remarks
NLPositionality: Characterizing Design Biases of Datasets and Models
ACL 2023 (Outstanding Paper)
2023.acl-long.505
Website
Title
Conference / Journal
Paper
Code
Remarks
Cultural Concept Adaptation on Multimodal Reasoning
EMNLP 2023
EMNLP Main 18
Title
Conference / Journal
Paper
Code
Remarks
Cross-Cultural Similarity Features for Cross-Lingual Transfer Learning of Pragmatically Motivated Tasks
EACL 2021
2006.09336
Sentiment Analysis
Please feel free to send me pull requests or email (khanuja.simran7@gmail.com) to add links.
License
To the extent possible under law, Simran Khanuja has waived all copyright and related or neighboring rights to this work.