/gpqa-eval

GPQA evals for a blog post on statistical power in LLM evals

Primary Language: Jupyter Notebook
