/bug_in_the_code_stack

A new benchmark for measuring LLMs' ability to detect bugs in large codebases.

Primary Language: Jupyter Notebook

Bug In The Code Stack Repository