Test organization improvements
Opened this issue · 2 comments
Part of #2238
While working on my first PR to the repo, I've noticed that interacting with the test cases is not easy.
This is because of how much is in each file, with the longest being ~1200 lines long. Because it's a test that uses # output
, the input and output for a specific case are ~600 lines apart. This is made worse by the fact that formatting changes line counts, so getting back to the source isn't as easy as dividing the line number by 2. Running the test case also doesn't help, as it only reports the end failure, and nothing about the original input.
With the current framework, the way I can see to solve this would be splitting cases into their own files, inside folders. For example, instead of one big preview_long_strings__regression.py
file, there would be a preview_long_strings__regression
folder with contents named preview_long_strings__regression_1.py
, preview_long_strings__regression_2.py
, etc.
This would be really nice for making all the tests more local, as if a set of tests fails, each one would have it's input directly next to it's output, and each test case would be individually re-runnable with pytest.
I'm not sure how this would affect the performance, since I don't know the cost of running a bunch of tiny files vs one large file.
I'm not sure if the current test runner supports running cases inside folders, but given the current structure I assume so.
If for whatever reason a test ever needed to be removed, this would make it a different type of difficult, as instead of having to find the input in a long file, now all test files after the one being removed would have to be renamed.
If this is a thing that is wanted, I'd be happy to start work on the PR.
Splitting up large files would be good. We shouldn't use pure numbering, because renumbering files would be needlessly annoying. Ideally the names would reflect what each file is testing. No need for subdirectories.
We could also consider an alternative structure, e.g. with section within larger test files (somewhat like the mypy test suite: https://github.com/python/mypy/blob/master/test-data/unit/check-abstract.test ). I don't know if that's worth it.
The main reason I'd go for numbers is it would be the least work, at least up front. From what I saw looking through the preview string tests, they are currently very haphazard. If a piece of code ever caused an issue, it looks to have been mostly put into the test file as-is, without worrying about reducing it to an MVP. That makes giving each part of a test a meaningful name hard, since whatever the test is testing for has been long fixed, the only way to get context on why a test exists is to go through the commit history to the original PR that added it.
As for sections within a file, I could see that, but it could also be a lot more work, since I have no clue what's going on with the testing code. Obviously something custom with the # output
s, but I'd have to look into it more. It would solve the locality and naming issues though.