Re-run with Gemini's safety settings off
Closed this issue · 0 comments
neubig commented
Gemini has safety filters on by default, but this may hurt downstream accuracy. We should try to re-run the gemini filters with safety settings off (reference: BerriAI/litellm#1190).
We should:
- Re-run the Gemini evals for tasks with safety settings off (we will name this model
gemini-pro
and the model with safety settings ongemini-pro-filtered
) - Update the numbers, figures, and discussion in each task section of the paper with the new
gemini-pro
numbers - Where appropriate (maybe in MMLU and Translation?) have a limited discussion of the effect of safety filtering
- Update the Zeno report to match the paper content
Here is a checklist for the tasks, check off each task when this is done please!
- Knowledge-based QA
- Reasoning
- Mathematics
- Code Generation
- Translation
- Web Instruction Following