princeton-nlp/SWE-bench

[ICLR 2024] SWE-bench: Can Language Models Resolve Real-world GitHub Issues?

Python · MIT

Issues

nonexistent PASS_TO_PASS test in dataset for astropy__astropy-7606
#223 opened 2 months ago by kjslag
5
Evaluation hangs on "Building environment images"
#245 opened 2 months ago by thetonywu
0
PR of `sympy__sympy-17655` did not resolve the problem
#244 opened 2 months ago by SmartManoj
0
Jinja2 is not pinned for sphinx
#241 opened 2 months ago by SmartManoj
2
Want apptainer support
#242 opened 2 months ago by HotineXie
1
Errors building Matplotlib env instances
#239 opened 2 months ago by martinbel
3
All Sphinx tests are incorrectly reported as failing
#228 opened 3 months ago by eschluntz
4
Want a docker image tar file
#211 opened 4 months ago by WentaoTan
4
Issues with data collection - supported?
#188 opened 2 months ago by kwanUm
3
astropy__astropy-14182 - `test_patch` includes Unnecessary `read` Method Modifications
#222 opened 2 months ago by SmartManoj
1
docker evaluation gets stuck
#157 opened 2 months ago by crhf
5
Excessive memory usage in conda 23.11.0
#231 opened 2 months ago by beornf
1
Questions about RAG baselines
#230 opened 2 months ago by dlibk
1
Dependencies version in constants.py
#229 opened 2 months ago by SZU-ZJW
1
Running Evaluation on Runpod.
#220 opened 2 months ago by saadan1234
1
allow for subscription to top of leaderboard change on the website (email).
#227 opened 2 months ago by RealmX1
1
Incorrect Issue Description for Instance 'astropy__astropy-14182'
#219 opened 2 months ago by SmartManoj
3
Clarification on Identifying fail2pass and pass2pass Test Cases for an Instance
#195 opened 2 months ago by gnohgnailoug
2
Error in `astropy__astropy-14539` when using numpy==1.25.2
#225 opened 2 months ago by SmartManoj
3
failure to build env image for astropy__astropy-7606
#224 opened 2 months ago by kjslag
3
Yanked package `types-pkg-resources` causes failures when evaluating on `sqlfluff`
#199 opened 3 months ago by klieret
8
matplotlib__matplotlib-23476 failed at pre-install
#210 opened 3 months ago by HejiaZ2023
1
scikit-learn__scikit-learn images built error
#218 opened 3 months ago by JiyangZhang
2
ValueError: Could not find requirements.txt at paths ['tests/requirements/py3.txt'] for repo django/django
#198 opened 5 months ago by hyyp1
1
The expected result of post-installation testing output.
#192 opened 3 months ago by sh0416
2
Issue in building environment for SWE-Bench train instances
#185 opened 3 months ago by jatinganhotra
2
Test for human falsehoods
#208 opened 3 months ago by MovGP0
7
How to run inference without Docker?
#194 opened 4 months ago by WentaoTan
3
404 for tutorial links in the README
#209 opened 4 months ago by fhfonsecaa
1
`UnicodeDecodeError` when running gold patch for `django__django-14011` in the dockerized harness
#215 opened 4 months ago by blahblahasdf
0
Compatibility Issue with Updated Pandas Version in Xarray
#187 opened 4 months ago by SmartManoj
1
base_commit & patch how to use?
#201 opened 4 months ago by xlisp
4
image build fail in issue pallets__flask-4045
#196 opened 4 months ago by tangken333
0
Is it fair to inform the final test command to the Agent?
#203 opened 4 months ago by bin123apple
2
Benchmarks and leaderboards published on your website are out of date. Please update them.
#191 opened 5 months ago by Emasoft
1
Failing benchmark instances
#167 opened 6 months ago by aorwall
6
It seems that current evaluation does not handle the apply failure case?
#154 opened 5 months ago by Hodge931
4
`exec_run_with_timeout` does not actually kill long-running thread
#160 opened 5 months ago by klieret
1
Why does SWE-Bench Train not contain "test_patch" data? I could not find it.
#182 opened 5 months ago by BoxiYu
1
Where can I find training set to train swe-llama?
#180 opened 5 months ago by Hodge931
2
Query about how to trace the github issue corresponding to the data
#179 opened 5 months ago by tangken333
2
Why does SWE-Bench Train not contain "test_patch" data?
#181 opened 5 months ago by BoxiYu
0
Confused about the usage of fields `test_patch`, `PASS_TO_PASS` and `FAIL_TO_PASS`
#174 opened 6 months ago by DavdGao
2
Which Python version to use?
#156 opened 6 months ago by anupamme
3
matplotlib__matplotlib-18869 can't pass (?) due to test_tmpconfigdir_warning
#172 opened 6 months ago by waterson
1
sphinx-doc instances create an unnecessary tox virtualenv during eval
#170 opened 6 months ago by waterson
0
Missing `validation.ipynb`?
#163 opened 6 months ago by xingyaoww
1
Passed test case count as failure?
#165 opened 6 months ago by xingyaoww
0
Cannot load dataset from JSON file
#150 opened 6 months ago by klieret
0
swe-bench can get badly stuck in `future.result()`
#158 opened 6 months ago by klieret
2