Primary language: Jupyter Notebook. License: MIT.

QARV (Question and Answers with Regional Variance)

EleutherAI Community Project

The QARV (Question and Answers with Regional Variance) project aims to curate a collection of questions whose answers vary from nation to nation.

Goals

  1. Curating a dataset of questions and answers exhibiting regional variations across nations. (Coverage to be discussed.)
  2. Observing biases in language models regarding regional contexts.
  3. Exploring whether ICL, prompting, SFT, or RLHF can be used to steer language models towards generating culturally-aware responses.
  4. Testing whether merged models can also answer questions that only the monolingual model can answer. (by atsu)

Progress

  • 2024.04.04 We have now collected 1k questions and are starting to annotate answers.
  • 2024.03.22 The first slice of the QARV dataset is now available.

How to reach out

If you are interested in joining this project, contact us in the #multilingual channel of the EleutherAI Discord.

References

  1. Multilingual Language Models are not Multicultural: A Case Study in Emotion

  2. KorNAT: LLM Alignment Benchmark for Korean Social Values and Common Knowledge

  3. Having Beer after Prayer? Measuring Cultural Bias in Large Language Models