astropy/halotools

Tests are flaky when run in parallel


Hi,

While packaging Halotools for the Guix package index, I noticed that tests fail randomly
when --numprocesses is passed to run them in parallel via the pytest-xdist plugin; this does
not happen with a single job.

Round 1
FAILED halotools/sim_manager/tests/test_halo_table_cache.py::TestHaloTableCache::test_determine_log_entry_from_fname
FAILED halotools/sim_manager/tests/test_ptcl_table_cache_log_entry.py::TestPtclTableCacheLogEntry::test_scenario4
FAILED halotools/sim_manager/tests/test_ptcl_table_cache.py::TestPtclTableCache::test_add_entry_to_cache_log1
FAILED halotools/sim_manager/tests/test_halo_table_cache.py::TestHaloTableCache::test_remove_entry_from_cache_log
FAILED halotools/sim_manager/tests/test_ptcl_table_cache.py::TestPtclTableCache::test_add_entry_to_cache_log3
FAILED halotools/sim_manager/tests/test_ptcl_table_cache.py::TestPtclTableCache::test_determine_log_entry_from_fname1
FAILED halotools/sim_manager/tests/test_halo_table_cache.py::TestHaloTableCache::test_add_entry_to_cache_log
FAILED halotools/sim_manager/tests/test_halo_table_cache.py::TestHaloTableCache::test_update_cached_file_location
FAILED halotools/sim_manager/tests/test_ptcl_table_cache.py::TestPtclTableCache::test_determine_log_entry_from_fname2
FAILED halotools/sim_manager/tests/test_ptcl_table_cache.py::TestPtclTableCache::test_determine_log_entry_from_fname3
FAILED halotools/sim_manager/tests/test_user_supplied_ptcl_catalog.py::TestUserSuppliedPtclCatalog::test_add_ptclcat_to_cache4
FAILED halotools/sim_manager/tests/test_ptcl_table_cache_log_entry.py::TestPtclTableCacheLogEntry::test_passing_scenario

Round 2
FAILED halotools/sim_manager/tests/test_ptcl_table_cache_log_entry.py::TestPtclTableCacheLogEntry::test_scenario2a
FAILED halotools/sim_manager/tests/test_ptcl_table_cache_log_entry.py::TestPtclTableCacheLogEntry::test_scenario2c
FAILED halotools/sim_manager/tests/test_user_supplied_ptcl_catalog.py::TestUserSuppliedPtclCatalog::test_add_ptclcat_to_cache6
FAILED halotools/sim_manager/tests/test_user_supplied_halo_catalog.py::TestUserSuppliedHaloCatalog::test_add_halocat_to_cache1

Inputs:

  • python-halotools@0.9.1
  • python-cython-next@3.0.8
  • python-extension-helpers@1.1.1
  • python-pytest@7.1.3
  • python-pytest-astropy@0.11.0
  • python-setuptools@67.6.1
  • python-setuptools-scm@7.1.0
  • python-wheel@0.40.0
  • python-astropy@6.1.4
  • python-beautifulsoup4@4.11.1
  • python-cython@0.29.32
  • python-h5py@3.8.0
  • python-numpy@1.23.2
  • python-requests@2.28.1
  • python-scipy@1.12.0

Thanks for reporting this issue. I've never run the halotools test suite in parallel, so I had not noticed this before. All of the failing tests you show appear to involve the (ad hoc) system the library uses to create and store a persistent memory of simulation data. In the test suite, the code creates some tiny simulation data, creates a log entry for the fake sims, and then runs tests on the logging mechanisms. The fact that these tests fail when run in parallel makes me think that some workers may be running tests on fake sim data that has not been created yet, or something like that. This would be harmless as far as the behavior of the source code is concerned, although I realize it's annoying for parallel testing. Do you have a workaround?
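
To make the suspected failure mode concrete, here is a minimal sketch, assuming the setup and test steps share one log file on disk. The function names and the log format are hypothetical and are not the halotools API; the sketch only shows that a reader running before the writer sees a missing file.

```python
import tempfile
from pathlib import Path


def create_fake_log(log_path: Path, entries):
    """Hypothetical setup step: write one cache-log line per fake simulation."""
    log_path.write_text("\n".join(entries) + "\n")


def read_log(log_path: Path):
    """Hypothetical test step: read the log back; raises if setup has not run."""
    return log_path.read_text().splitlines()


def reader_before_writer(shared_dir: Path) -> bool:
    """Return True if reading the shared log before it is created fails.

    This mimics one worker running a test before another worker has
    finished creating the fake data (an assumption about the cause).
    """
    log_path = shared_dir / "cache_log.txt"
    try:
        read_log(log_path)
    except FileNotFoundError:
        raced = True
    else:
        raced = False
    # The "other worker" only creates the log afterwards.
    create_fake_log(log_path, ["fake_sim halo_finder z0"])
    return raced
```

If the real cause is of this kind, the failures would depend on how tests are distributed across workers, which would be consistent with different tests failing in each round.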

Hi,

Thank you for the detailed reply.

I have not investigated possible solutions in depth yet; I packaged it
without pytest-xdist enabled. In my experience, some related packages
(astropy, asdf) have quite thread-safe test suites, which pays off in CI.

One potential solution would be to consolidate each create/test pair into a
single unit test.
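
As a rough sketch of that idea (the helper names and log format below are hypothetical, not taken from halotools), a single test can create the fake log and check it in its own private directory, so no other worker can observe it half-built:

```python
import tempfile
from pathlib import Path


def add_entry(log_path: Path, entry: str) -> None:
    """Hypothetical helper: append one entry to the cache log."""
    with open(log_path, "a") as f:
        f.write(entry + "\n")


def remove_entry(log_path: Path, entry: str) -> None:
    """Hypothetical helper: rewrite the log without the given entry."""
    lines = log_path.read_text().splitlines()
    log_path.write_text("".join(line + "\n" for line in lines if line != entry))


def test_add_then_remove_entry():
    """Create the fake log and test it in one unit, in a private directory."""
    with tempfile.TemporaryDirectory() as tmp:
        log_path = Path(tmp) / "cache_log.txt"
        # Setup and assertions live in the same test, so ordering is fixed.
        add_entry(log_path, "fake_sim_a")
        add_entry(log_path, "fake_sim_b")
        assert log_path.read_text().splitlines() == ["fake_sim_a", "fake_sim_b"]
        remove_entry(log_path, "fake_sim_a")
        assert log_path.read_text().splitlines() == ["fake_sim_b"]
```

Under pytest, the built-in tmp_path fixture could replace the explicit TemporaryDirectory, since it provides a separate directory for each test.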

Thanks,
Oleg