Bug: test suite failure in MACS 3.0.0
tillea opened this issue · 18 comments
Describe the bug
When trying to update the Debian package for MACS, I get some test suite errors in test step 16.18:
```
...
16.17.1 checking callvar PEsample.vcf ...
... success!
16.18.1 checking hmmratac hmmratac_yeast500k_accessible_regions.gappedPeak ...
... failed! Difference:
1,283c1,919
< chrI 0 700 peak_1 0 . 30 560 0 1 530 30 0 0 0
< chrI 113420 114870 peak_5 0 . 113440 114700 0 1 1260 20 0 0 0
< chrI 140480 142080 peak_6 0 . 140760 141740 0 1 980 280 0 0 0
< chrI 165050 165900 peak_7 0 . 165330 165830 0 1 500 280 0 0 0
< chrI 181160 181650 peak_8 0 . 181200 181610 0 1 410 40 0 0 0
< chrI 198460 199140 peak_9 0 . 198660 199090 0 1 430 200 0 0 0
< chrI 60130 62600 peak_2 0 . 60410 62150 0 1 1740 280 0 0 0
< chrI 67960 69780 peak_3 0 . 68160 69260 0 1 1100 200 0 0 0
< chrI 71790 73230 peak_4 0 . 72120 73000 0 1 880 330 0 0 0
16.18.2 checking hmmratac hmmratac_yeast500k_bedpe_accessible_regions.gappedPeak ...
... failed! Difference:
1,283c1,919
< chrI 0 700 peak_1 0 . 30 560 0 1 530 30 0 0 0
< chrI 113420 114870 peak_5 0 . 113440 114700 0 1 1260 20 0 0 0
< chrI 140480 142080 peak_6 0 . 140760 141740 0 1 980 280 0 0 0
< chrI 165050 165900 peak_7 0 . 165330 165830 0 1 500 280 0 0 0
< chrI 181160 181650 peak_8 0 . 181200 181610 0 1 410 40 0 0 0
< chrI 198460 199140 peak_9 0 . 198660 199090 0 1 430 200 0 0 0
< chrI 60130 62600 peak_2 0 . 60410 62150 0 1 1740 280 0 0 0
< chrI 67960 69780 peak_3 0 . 68160 69260 0 1 1100 200 0 0 0
< chrI 71790 73230 peak_4 0 . 72120 73000 0 1 880 330 0 0 0
16.18.3 checking hmmratac hmmratac_yeast500k_load_hmm_model_accessible_regions.gappedPeak ...
... failed! Difference:
1,283c1,919
< chrI 0 700 peak_1 0 . 30 560 0 1 530 30 0 0 0
< chrI 113420 114870 peak_5 0 . 113440 114700 0 1 1260 20 0 0 0
< chrI 140480 142080 peak_6 0 . 140760 141740 0 1 980 280 0 0 0
< chrI 165050 165900 peak_7 0 . 165330 165830 0 1 500 280 0 0 0
< chrI 181160 181650 peak_8 0 . 181200 181610 0 1 410 40 0 0 0
< chrI 198460 199140 peak_9 0 . 198660 199090 0 1 430 200 0 0 0
< chrI 60130 62600 peak_2 0 . 60410 62150 0 1 1740 280 0 0 0
< chrI 67960 69780 peak_3 0 . 68160 69260 0 1 1100 200 0 0 0
< chrI 71790 73230 peak_4 0 . 72120 73000 0 1 880 330 0 0 0
16.18.4 checking hmmratac hmmratac_yeast500k_load_training_regions_accessible_regions.gappedPeak ...
... failed! Difference:
1,283c1,919
< chrI 0 700 peak_1 0 . 30 560 0 1 530 30 0 0 0
< chrI 113420 114870 peak_5 0 . 113440 114700 0 1 1260 20 0 0 0
< chrI 140480 142080 peak_6 0 . 140760 141740 0 1 980 280 0 0 0
< chrI 165050 165900 peak_7 0 . 165330 165830 0 1 500 280 0 0 0
< chrI 181160 181650 peak_8 0 . 181200 181610 0 1 410 40 0 0 0
< chrI 198460 199140 peak_9 0 . 198660 199090 0 1 430 200 0 0 0
< chrI 60130 62600 peak_2 0 . 60410 62150 0 1 1740 280 0 0 0
< chrI 67960 69780 peak_3 0 . 68160 69260 0 1 1100 200 0 0 0
< chrI 71790 73230 peak_4 0 . 72120 73000 0 1 880 330 0 0 0
```
To Reproduce
You might like to run

```
cd test
./cmdlinetest macs3.0.0-1-3.12
```

(likewise with `macs3.0.0-1-3.11`, which is macs3.0.0 built against Python 3.11) on a Debian unstable system (container).
Expected behavior
I can confirm that `python3.12 -m pytest test` (or with `python3.11`) works, but `cmdlinetest` shows the failures above. I wonder whether some of the test data just needs to be adjusted to account for rounding errors?
System (please complete the following information):
- OS: Debian unstable
- Python version: 3.11 or 3.12
- Numpy version: 1.24.2
- MACS Version: 3.0.0
Hi @tillea, we had a similar issue due to the NumPy version. The same MACS code will generate slightly different results with NumPy < 1.25 versus NumPy >= 1.25. The current standard output for testing (`cmdlinetest`) is from NumPy 1.25. Did you see such an issue in other Python-based tools that depend on NumPy?
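For packagers hitting this, a quick way to check which side of that cutoff a build environment falls on is to compare the installed NumPy version against 1.25 (a sketch; the `version_at_least` helper is made up here, and the cutoff is simply the one stated in this comment):

```python
# Sketch: check whether an installed NumPy is at or above the 1.25 cutoff
# that separates the two result sets described above.
# version_at_least is a hypothetical helper, not part of MACS.

def version_at_least(version_string, major, minor):
    """Return True if an 'X.Y.Z' version string is >= (major, minor)."""
    parts = version_string.split(".")
    return (int(parts[0]), int(parts[1])) >= (major, minor)

# Debian unstable's NumPy 1.24.2 falls below the cutoff, so its
# hmmratac output would not match the shipped reference files.
print(version_at_least("1.24.2", 1, 25))  # False
print(version_at_least("1.25.0", 1, 25))  # True
```

In practice one would pass `numpy.__version__` as the first argument.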
@tillea we tested using a GitHub Action (https://github.com/macs3-project/MACS/actions/runs/7732237695/job/21083012078) with Python 3.11 and NumPy 1.24.2, and the test passed, so it may relate to other dependencies as well. I think the only way to solve this is to relax the test -- we have relaxed the precision of tests before. But this time the differences in the results are at the peak coordinates -- usually a 1 bp difference. It's not trivial to go through each peak coordinate and allow a 1 bp difference. Perhaps we can use other criteria to test that the results are 'similar' to the standard. Let me think...
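One possible relaxation along those lines -- purely a sketch, not MACS's actual `cmdlinetest` logic; the function names and the 1 bp tolerance are illustrative -- would be to parse both gappedPeak files and accept peaks whose coordinates differ by at most a small tolerance:

```python
# Sketch of a fuzzy gappedPeak comparison: peaks count as matching when
# chrom and name agree and every compared coordinate differs by at most
# `tol` base pairs. Hypothetical helpers, not MACS's actual test code.

def parse_gapped_peak(text):
    """Parse gappedPeak lines into (chrom, name, coordinate-tuple) records.

    Compares start, end, thickStart, thickEnd (fields 1, 2, 6, 7)."""
    peaks = []
    for line in text.strip().splitlines():
        fields = line.split()
        chrom, name = fields[0], fields[3]
        coords = (int(fields[1]), int(fields[2]),
                  int(fields[6]), int(fields[7]))
        peaks.append((chrom, name, coords))
    return peaks

def peaks_similar(a, b, tol=1):
    """True if both peak lists pair up and no coordinate differs by > tol bp."""
    if len(a) != len(b):
        return False
    for (chrom_a, name_a, ca), (chrom_b, name_b, cb) in zip(a, b):
        if chrom_a != chrom_b or name_a != name_b:
            return False
        if any(abs(x - y) > tol for x, y in zip(ca, cb)):
            return False
    return True
```

For example, a result file whose peaks are all shifted by 1 bp relative to the standard would pass with `tol=1` but fail with `tol=0`.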