URS Benchmark: Evaluating LLMs on User Reported Scenarios

Primary Language: Python
