A curated collection of challenging statements on sensitive topics for LLM benchmarking. It is designed to distinguish an LLM's actual abilities from the stochastic variation in its outputs.