issue under mac?
gf777 opened this issue · 6 comments
Hi @greg7mdp
You helped me a few months ago implementing parallel joining of maps:
#234
This was working fine, but recently I started having trouble under Mac. As a matter of fact the process seems to be working fine, but I get some warning and test failures so I am wondering if this could be highlighting something deeper.
Basically on my mac (Sonoma 14.2) I pass the test but in github actions it fails: https://github.com/vgl-hub/kreeq/actions/runs/9065660799/job/24906743274#step:4:62
If I revert to macos-13 in github action it passes the test. Locally I get this warning when the test is run though:
gformenti@macbook-pro-25:~/Documents/GitHub/kreeq$ kreeq union -d testFiles/test1.kreeq testFiles/test2.kreeq
submap count(0) != N(8)
submap count(0) != N(8)
DBG Summary statistics:
Total kmers: 1572
Unique kmers: 13
Distinct kmers: 115
Missing kmers: 4398046510989
Total edges: 196
Do you have any clue as what might be causing this? Maybe it is just a bug in my implementation, which is here: https://github.com/vgl-hub/kreeq/blob/int32/src/graph-builder.cpp#L860
Many thanks and all the best
Giulio
Hi @gf777 , I'm away from home and I can't have a good look at your code right now. I believe there should not be any problem accessing map1
and map2
like you do. I wonder about map32
though.
Just from a quick look, this looks strange, shouldn't it be (*maps32)[m]
?
I'll look at your code in more detail when I get home.
Hi Gregory
Thank you so much for looking into this!
So, initially I also thought it had to do with new commits, particularly the new map32
that you are referring to (I spent quite some time trying to debug it!). For context, map32 is a larger version of the standard map used to store counts that exceed 255. However, I realized that the issue was affecting the main branch as well, which wasn't updated in the past two months. When I ran a new check there it started to give me the same issue, suggesting me that a change in the macos version used for github actions was the culprit. See https://github.com/vgl-hub/kreeq/commits/main/ commits 16c61db to 1b3369d
This seems to overlap with github actions transitioning from macos 12 to 14 in the last two months: actions/runner-images#9255
ps. dereferencing the individual map in the maps vector should be ok because this is how the vector is defined https://github.com/vgl-hub/kreeq/blob/int32/include/kreeq.h#L68
ps. dereferencing the individual map in the maps vector should be ok because this is how the vector is defined https://github.com/vgl-hub/kreeq/blob/int32/include/kreeq.h#L68
Makes sense. I'll do a more careful code review when I'm back from vacation.
Don't you access in read/write the same parallelMap32
(maps32[m]
) from multiple threads without a mutex protection?
Hi @greg7mdp
That shouldn't be the case. Do you have a particular line in mind?
The logic is the following:
I create a vector of m
maps (maps32
). Then I have a job scheduler that runs m
jobs (potentially concurrently, one per thread). So maps are always updated by different jobs.
Afaik counts are always correct (so there appears to be no data race). This is true even with the submap count(0) != N(8)
warning under MacOS. The warning is only under Mac 14 (Mac13, linux and win are fine).