Figure out why `test (256)` keeps failing
Closed this issue · 5 comments
Sometimes (only sometimes!) it gets stuck in an infinite loop of printing out address sanitizer errors. Very concerning, but only seems to happen for the 256-avx case so far. This one is currently not allowed to go into the Python wheels due to a previous mysterious crash, so this isn't urgent, but it sure would be nice to know why these things are occurring.
Based on #718, this is a bug in gtest rather than a bug in stim. Reported it in google/googletest#4491.
I have the same bug (AddressSanitizer:DEADLYSIGNAL) in my primecount project, in the code below, which only uses the math functions from the C++ standard library (my project does not use googletest):
    for (int i = 0; i < 100; i++)
    {
        T term = (Li(t) - x) * std::log(t);
        // Not converging anymore
        if (std::abs(term) >= std::abs(old_term))
            break;
        t -= term;
        old_term = term;
    }
My bug only occurs on Ubuntu 22.04 & 23.10 (x64) when running in a virtual machine with the GCC/Clang sanitizers enabled. When I switched my CI test to ubuntu-20.04 the bug disappeared. (When I tested Ubuntu 22.04 with the GCC sanitizers on a real server, with no VM or Docker container, it also worked without any issues.)
After more than 2 hours of debugging I couldn't figure out the exact cause of the issue, but it looks like the issue is caused by an Ubuntu >= 22.04 bug or a compiler/sanitizer bug.
UPDATE 18/03/2024: Today I also tested on a Fedora 36 x64 VM using GCC and the same compiler options but I was not able to reproduce the issue. Hence the issue seems to only occur on Ubuntu x64 VMs (and possibly also on Debian VMs).
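For anyone who wants to run the quoted loop in isolation (for example, built with `-fsanitize=address` on an affected runner), here is a minimal self-contained sketch. It makes several assumptions: the `Li()` below is a hypothetical stand-in that integrates 1/ln(u) from 2 to t with Simpson's rule, since primecount's own `Li()` is not shown in this thread; the starting guess and the name `Li_inverse` are illustrative choices as well. It is not claimed to reproduce the crash, only to exercise the same standard library math calls.

```cpp
#include <cassert>
#include <cmath>
#include <limits>

// Hypothetical stand-in for Li(t): the integral of 1/ln(u) from 2 to t,
// approximated with composite Simpson's rule. This is an assumption for
// illustration, not primecount's actual implementation.
double Li(double t)
{
    assert(t >= 2.0);
    const int n = 20000; // number of subintervals, must be even
    const double h = (t - 2.0) / n;
    double sum = 1.0 / std::log(2.0) + 1.0 / std::log(t);
    for (int i = 1; i < n; i++)
        sum += (i % 2 ? 4.0 : 2.0) / std::log(2.0 + i * h);
    return sum * h / 3.0;
}

// Newton's method for the t that satisfies Li(t) == x.
// Because Li'(t) = 1 / ln(t), the Newton step is (Li(t) - x) * ln(t),
// which is the update used in the loop quoted above.
double Li_inverse(double x)
{
    assert(x > 0.0);
    double t = x * std::log(x + 2.0) + 2.0; // assumed starting guess
    double old_term = std::numeric_limits<double>::infinity();

    for (int i = 0; i < 100; i++)
    {
        double term = (Li(t) - x) * std::log(t);
        // Not converging anymore
        if (std::abs(term) >= std::abs(old_term))
            break;
        t -= term;
        old_term = term;
    }
    return t;
}
```

Calling `Li_inverse(1000.0)` and feeding the result back into `Li()` should return a value very close to 1000, which makes for a quick sanity check before turning the sanitizers on.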
@kimwalisch Thanks, that's very helpful to know that I can work around it by pinning the version of ubuntu used by CI.
This seems to have been resolved externally. Hasn't happened in a PR for about a week now, whereas before it was happening multiple times per PR.