JOSS review: Automated tests
Closed this issue · 12 comments
@miniufo - well done on your submission to JOSS.
As it stands, your package doesn't meet the following requirement on the reviewer checklist:
Automated tests: Are there automated tests or manual steps described so that the functionality of the software can be verified?
(See openjournals/joss-reviews#5510 for details.)
You've got some test code available at tests/
and associated data files at Data/
, but there is no guidance provided on how to run those tests manually or better still there is no automated testing.
In addition, it looks like your test code culminates in the plotting of figures. While that can be useful for visual verification, common practice would also include a suite of tests that have the following components:
- a fixture, which is the thing being tested (e.g., an input data array);
- an actual result, which is what the xinvert package produces when given the fixture; and
- an expected result that the actual result is compared to.
The comparison between the actual result and expected result would take the form of an assertion, so that the test can be said to pass or fail. You can then use a test runner like pytest in conjunction with GitHub actions to setup automated testing.
You may already be familiar with these testing concepts, but if not let me know and I can point you to some resources that explain how to setup automated testing.
You are right, I know this concept, probably using assert
keyword in the program, but not every detail. Could you please send me some concrete examples that I can follow?
Since it is a numerical solver, I cannot expect the repeated runs of the codes reproduce the same results. I only visualize the plots to see if they are close to those plots in the related papers.
This chapter in Research Software Engineering with Python gives an overview of general testing concepts:
https://merely-useful.tech/py-rse/testing.html
This page gives more information on using pytest:
https://realpython.com/pytest-python-testing/
In terms of a concrete example, here's a test file I wrote recently:
https://github.com/climate-innovation-hub/qqscale/blob/master/test_qdm.py
While you can't expect repeated runs to reproduce exactly the same results, you should hopefully be able to create a fixture that produces an expected result within some tolerance (i.e. assert np.allclose(expected_result, actual_result)
).
Once you've got your tests setup, you can use GitHub Actions to run the tests every time you make a commit to your repository. That involves creating a tests.yml
file in a new .github/workflows/
directory. It will look something like:
https://github.com/climate-innovation-hub/qqscale/blob/master/.github/workflows/tests.yml
Hope that helps 😄
Thanks very much for your kind help. I will follow these to make auto-tests for xinvert soon.
Hi @DamienIrving, I just followed your guide and made a action for tests. Everything on pytest
works fine locally on my laptop. But the action build raised an error: did not find a match in any of xarray's currently installed IO backends ['netcdf4', 'scipy']
I made both netcdf4
and scipy
in the requirement.txt
file, so I don't know how to resolve this. Could you help me find why this is happening?
PS: I use git lfs
this time because some of the sample netcdf files are larger than github limits.
If you add an extra step in tests.yml
after the "install dependencies" step to list the installed packages you'll see that netCDF4 is successfully installed, so that isn't the problem.
- name: List installed packages
shell: bash -l {0}
run: pip list
It might be that the files are too large... you could try a test with a very small data file?
Hi @DamienIrving, thanks for your tries and suggestion. I guess I found the reason. All netcdf files in Data
has been tracked by LFS, which are then not recognized by xr.open_dataset
. I modified a small nc file but leave it outside of LFS, then it is passed (you may check the log).
So this is really embarrassing, because I need some global dataset to make a test, which exceeds the limit by github. LFS seems to upload big file transparently but not openable online. Do you have any suggestion?
It sounds like you need to refactor the tests you want GitHub Actions to run so they don't require large input datasets.
For automated testing, it's most common to either have the test code create it's own fake data (e.g. https://github.com/climate-innovation-hub/qqscale/blob/master/test_qdm.py#L15) so that you don't need to include any data files in your repo, or if using "real" data is unavoidable then it's common to use a netCDF file containing only a single grid point or small spatial area.
You can do your own testing offline where you process and plot large global datasets, but those offline tests wouldn't be included in the tests run by GitHub Actions.
Does that make sense?
Well, that's right. I should skip those tests requiring large datasets (>100MB), and do them locally by hands.
Hi @DamienIrving, all the pytests has been passed. So please see if this issue is fixed now. Thank you very much for you kind help with this.
Thanks for implementing automated testing.
Just so I understand, what are the tests in test_Poisson.py checking?
I would expect that each test would have three basic steps:
- Read in vorticity and divergence data from a file (that's the test "fixture")
- Calculate the streamfunction and velocity potential (that's the "actual result")
- Compare the actual result to an "expected result" to see if the test passes (i.e. using an assertion)
Do you have an expected result? If not, is there some vorticity and divergence data that you can create (i.e. simple example data) where you know what the streamfunction and velocity potential values should be?
Hi, @DamienIrving thanks for your careful check on this. You are right that one could find some analytical results to do the tests. But here I adopted an indirect way here. Basically, we are not checking the streamfunction itself, but its gradient (or velocity components).
According to Helmholtz decomposition theory, a vector field, say,
That is why I put assert
s to see if the divergence of
One of the potential problem with analytical expression is that, we use finite difference scheme here. If we use larger grid size, there would be grid-scale truncation error relative to analytical results, which may be above the tolerance. But the current asserts do not depend on grid size, because the inversion for streamfunction, as well we the calculation of its gradient or velocity components, all use the consistent finite difference scheme, which turns out to be quite good that the expected
Also, there will be an arbitary constant to Poisson solution. So any streamfunction, after adding a constant, is still a solution to Poisson equation. I guess this could be a problem with windspharm
to determine the arbitary constant. That is why I do really prefer the indirect way of assertion here, as it avoid the arbitary constant.
Thanks for the explanation - that's a nice way to write the tests!