test_get_corsika_telescope_list takes too long
Closed this issue · 8 comments
The test_get_corsika_telescope_list test takes roughly 20 seconds to run. Profiling points to excessive time spent initializing the array model, perhaps related to reading from the DB too many times.
```
ncalls  tottime  percall  cumtime  percall  filename:lineno(function)
     1    0.000    0.000   21.427   21.427  /workdir/external/simtools/simtools/model/array_model.py:41(__init__)
     1    0.000    0.000   20.855   20.855  /workdir/external/simtools/simtools/model/array_model.py:69(_initialize)
    60    0.004    0.000   20.773    0.346  /workdir/external/simtools/simtools/model/model_parameter.py:49(__init__)
    58    0.002    0.000   19.823    0.342  /workdir/external/simtools/simtools/model/telescope_model.py:38(__init__)
     1    0.000    0.000    0.994    0.994  /workdir/external/simtools/simtools/model/site_model.py:27(__init__)
    61    0.001    0.000    0.574    0.009  /workdir/external/simtools/simtools/db/db_handler.py:57(__init__)
```
Need to look into it further (in the middle of something else, opening this issue so it is not forgotten).
The array_model consists of the site model, the models of all telescopes, and those of all calibration devices, so it is natural that all DB calls are initiated from the array_model class. The number of calls to the DB is reduced by the DatabaseHandler.*cached dicts, where the idea is to never query the same parameter twice.
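To illustrate the caching idea mentioned above, here is a minimal sketch; the class and method names are hypothetical stand-ins, not the actual DatabaseHandler API:

```python
# Minimal sketch of the per-parameter caching idea: each
# (site, telescope, version) key is fetched from the DB at most once.
# CachedDBHandler and its methods are illustrative names only.


class CachedDBHandler:
    def __init__(self, db_query):
        self._db_query = db_query  # callable performing the real DB read
        self._cache = {}  # (site, telescope, version) -> parameter dict
        self.n_db_calls = 0  # exposed for checking cache effectiveness

    def get_parameters(self, site, telescope, version):
        key = (site, telescope, version)
        if key not in self._cache:
            self.n_db_calls += 1
            self._cache[key] = self._db_query(site, telescope, version)
        return self._cache[key]
```

Repeated lookups of the same key then cost one DB call in total, which is exactly what one would verify when checking that the caching still works.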
Obviously it is always good to check that this is still working.
The main difference to the old style is that we now call the DB for each telescope separately. In the past we read all of the parameters of all of the telescopes in one query (which is very quick). I suggest extending the caching functionality to do that with the new style as well. Usually the model parameter version does not change from one telescope to the next (at least for now), so when the first telescope is read, we can fill the cache with the parameters of all telescopes for that same version. This should reduce the DB calls to one in most cases.
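A sketch of the proposed extension, under the assumption that a bulk query returning all telescopes for one model version exists (names are illustrative, not the real simtools API):

```python
# Sketch of the proposed prefetching cache: the first read for a given model
# version triggers one bulk query that fills the cache for *all* telescopes
# of that version; later telescopes are served from the cache.
# PrefetchingDBHandler and bulk_query are assumed names for illustration.


class PrefetchingDBHandler:
    def __init__(self, bulk_query):
        self._bulk_query = bulk_query  # version -> {telescope: params}
        self._cache = {}  # (version, telescope) -> params
        self._prefetched = set()  # versions already fetched in bulk
        self.n_db_calls = 0

    def get_parameters(self, telescope, version):
        if version not in self._prefetched:
            self.n_db_calls += 1  # the single bulk query for this version
            for tel, params in self._bulk_query(version).items():
                self._cache[(version, tel)] = params
            self._prefetched.add(version)
        return self._cache[(version, telescope)]
```

With this, reading 58 telescope models of the same version would cost one DB round trip instead of 58.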
I can confirm that when running at DESY this test takes 1 second, while running it at home takes 15-20 seconds.
On GitHub this test is also the slowest at ~50 seconds (see e.g., this job). I think that adding another cache level is therefore worth it since it will shorten the running time of the GitHub tests. It's far from urgent though, considering it is just a minute in the worst case scenario.
Started to have a look:
test_get_corsika_telescope_list is the only test for corsika_config.py which actually accesses the database (all other tests use the corsika_config_no_db fixture; at some point I also introduced a @pytest.mark.uses_model_database() marker), so it is no wonder that this is the slowest function in these tests...
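For reference, the marker mentioned above can be used along these lines (the marker name is taken from the comment; registering it in pytest.ini or pyproject.toml is assumed but not shown):

```python
# Sketch of tagging DB-dependent tests so they can be deselected in fast
# runs, e.g. with `pytest -m "not uses_model_database"`.
import pytest


@pytest.mark.uses_model_database()
def test_get_corsika_telescope_list():
    ...  # test body that really connects to the model database
```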
Given that we want to test the code and not the database connection, my suggestion is to change this function as well and either use the corsika_config_no_db fixture or some mocking (aligned with the suggestion in issue #1054). OK?
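The mocking variant could look roughly like this; get_corsika_telescope_list here is a simplified stand-in for the code under test, not the actual simtools function signature:

```python
# Sketch of replacing the DB read with a mock so the unit test exercises the
# corsika_config logic without any network round trip. The function below is
# a simplified stand-in, not the real simtools API.
from unittest import mock


def get_corsika_telescope_list(db_read):
    # stand-in for the code under test: builds a list from DB-provided names
    return [name for name in db_read()]


def test_get_corsika_telescope_list_no_db():
    fake_read = mock.Mock(return_value=["LSTN-01", "MSTN-02"])
    telescopes = get_corsika_telescope_list(fake_read)
    assert telescopes == ["LSTN-01", "MSTN-02"]
    fake_read.assert_called_once()  # the "DB" was consulted exactly once
```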
Additionally:
- we should think about dedicated benchmark tests on accessing the DB for different usage scenarios
- suggest that this would be an integration test and not a unit test, as it tests several modules
I find the comments/suggestions a little bit contradictory. I will write comments below. They are out of order, but I think they still follow some logic.
> suggest that this would be an integration test and not a unit test, as it tests several modules
If we want to change it to be an integration test, then I wouldn't use mocking but actually connect to the DB as expected in integration tests and as we do in others.
> we should think about dedicated benchmark tests on accessing the DB for different usage scenarios
This test actually accomplished that in a way. It showed that we are accessing the DB for each telescope separately, which is something we can probably easily avoid. I agree that having a dedicated test for accessing the DB is good though, let's try to come up with it after we decide on the structure of the DB.
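A dedicated benchmark test could be as simple as the following sketch (the timed stand-in and the threshold are placeholders; a real version would time an actual DB access scenario):

```python
# Sketch of a dedicated DB-access benchmark: time a repeated operation and
# assert a generous upper bound, so regressions such as one-query-per-telescope
# access patterns show up early. The timed callable here is a placeholder.
import time


def mean_runtime(fn, repeat=5):
    """Return the mean wall-clock time of `fn` over `repeat` calls, in seconds."""
    start = time.perf_counter()
    for _ in range(repeat):
        fn()
    return (time.perf_counter() - start) / repeat


# usage with a cheap stand-in for the real DB read:
mean_s = mean_runtime(lambda: sum(range(1000)))
assert mean_s < 1.0  # generous bound; a real test would tune this per scenario
```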
> Given that we want to test the code and not the database connection, my suggestion is to change also this function and either use the corsika_config_no_db or some mocking (aligned with the suggestion in issue #1054). OK?
It's fine to change the unit test to use mocking, but then I would add a separate integration test that does the same, actually using the DB. I think it's important to test this in a real scenario, considering the chain of "events" this test triggers (even though it is just one command).
Oh and I would wait before implementing any changes until we implement the DB restructuring. This problem might resolve itself then because we will probably access the DB once, getting the entries based on the table of production version/parameter versions.
Yes and no:
- independent of any restructuring, we should avoid DB calls in unit tests
- we need a test for the DB connections (but not a unit test)
I think both points are not affected by restructuring.
Yes, sorry, I didn't mean that those points would not be valid anymore, just that this specific issue of many unnecessary calls would (hopefully) be resolved. I don't think we should add another cache at the moment (like I proposed above) because it will probably be superfluous once we implement the restructuring.