[regression] 0.3.29 build/tests fail on Sandy Bridge x86_64 machine
Closed this issue · 2 comments
Building and running the tests fails on an older Sandy Bridge x86_64 machine under FreeBSD (built through the FreeBSD ports system):
OMP_NUM_THREADS=1 OMP_NUM_THREADS=1 ./sblat2 < ./sblat2.dat
rm -f ?BLAT3.SUMM
OMP_NUM_THREADS=1 OMP_NUM_THREADS=1 ./sblat3 < ./sblat3.dat
Note: The following floating-point exceptions are signalling: IEEE_DIVIDE_BY_ZERO
OMP_NUM_THREADS=1 OMP_NUM_THREADS=1 ./dblat3 < ./dblat3.dat
OMP_NUM_THREADS=1 OMP_NUM_THREADS=1 ./dblat2 < ./dblat2.dat
OMP_NUM_THREADS=1 OMP_NUM_THREADS=1 ./cblat3 < ./cblat3.dat
OMP_NUM_THREADS=1 OMP_NUM_THREADS=1 ./cblat2 < ./cblat2.dat
OMP_NUM_THREADS=1 OMP_NUM_THREADS=1 ./zblat3 < ./zblat3.dat
OMP_NUM_THREADS=1 OMP_NUM_THREADS=1 ./zblat2 < ./zblat2.dat
rm -f ?BLAT3.SUMM
OMP_NUM_THREADS=2 ./sblat3 < ./sblat3.dat
Note: The following floating-point exceptions are signalling: IEEE_DIVIDE_BY_ZERO
OMP_NUM_THREADS=2 ./dblat3 < ./dblat3.dat
OMP_NUM_THREADS=2 ./cblat3 < ./cblat3.dat
rm -f ?BLAT2.SUMM
OMP_NUM_THREADS=2 ./sblat2 < ./sblat2.dat
Program received signal SIGBUS: Access to an undefined portion of a memory object.
Backtrace for this error:
#0 0x824e20339 in ???
#1 0x824e1f465 in ???
#2 0x8220e746f in ???
#3 0x8220e6a3a in ???
#4 0x82157f2d2 in ???
#5 0x82f6f24ea in _Unwind_ForcedUnwind
at /usr/ports/lang/gcc13/work/gcc-13.3.0/libgcc/unwind.inc:215
#6 0x8220de21b in ???
#7 0x8220de191 in ???
#8 0x8220de03a in ???
#9 0x8220ddb29 in ???
#10 0xffffffffffffffff in ???
./sblat2 < ./sblat2.dat fails with OMP_NUM_THREADS=2 (or any non-1 value), but passes with OMP_NUM_THREADS=1.
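For anyone trying to reproduce this, the comparison between thread counts can be scripted. The following is only a minimal sketch: the helper name `run_with_threads` is made up for illustration, and it assumes the test binary and its `.dat` input sit in the current directory, as in the log above. It also assumes the crashing process ends up terminated by the signal (rather than exiting with a regular error code), which I have not verified for this build.

```python
import os
import signal
import subprocess


def run_with_threads(cmd, stdin_path, num_threads):
    """Run cmd with OMP_NUM_THREADS set and return a short status string.

    Hypothetical helper, not part of the OpenBLAS test suite.
    """
    # Copy the current environment and override only OMP_NUM_THREADS.
    env = dict(os.environ, OMP_NUM_THREADS=str(num_threads))
    with open(stdin_path, "rb") as stdin:
        proc = subprocess.run(
            cmd,
            stdin=stdin,
            env=env,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )
    if proc.returncode < 0:
        # On POSIX, a negative return code means the child died from a signal.
        try:
            return "killed by " + signal.Signals(-proc.returncode).name
        except ValueError:
            return "killed by signal %d" % -proc.returncode
    return "exit code %d" % proc.returncode
```

Calling `run_with_threads(["./sblat2"], "./sblat2.dat", n)` for `n` in `(1, 2, 4)` from the test directory should then show at a glance which thread counts crash.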
The processor has AVX, but not AVX2 or anything newer:
CPU: Intel(R) Core(TM) i5-2410M CPU @ 2.30GHz (2294.90-MHz K8-class CPU)
Origin="GenuineIntel" Id=0x206a7 Family=0x6 Model=0x2a Stepping=7
Features=0xbfebfbff<FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CLFLUSH,DTS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE>
Features2=0x1dbae3bf<SSE3,PCLMULQDQ,DTES64,MON,DS_CPL,VMX,EST,TM2,SSSE3,CX16,xTPR,PDCM,PCID,SSE4.1,SSE4.2,x2APIC,POPCNT,TSCDLT,XSAVE,OSXSAVE,AVX>
AMD Features=0x28100800<SYSCALL,NX,RDTSCP,LM>
AMD Features2=0x1<LAHF>
XSAVE Features=0x1<XSAVEOPT>
VT-x: PAT,HLT,MTF,PAUSE,EPT,UG,VPID
TSC: P-state invariant, performance statistics
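To double-check the AVX/AVX2 claim against the boot-log text rather than by eye, the feature lines can be parsed mechanically. This is a rough sketch: `collect_cpu_flags` is a hypothetical helper, and the regular expression only assumes the `Name=0x...<FLAG,FLAG,...>` layout visible in the lines above. As far as I know, FreeBSD would list AVX2 on a separate "Structured Extended Features" line, which does not appear in this output at all.

```python
import re

# Matches lines of the form: Name=0x<hex><FLAG,FLAG,...>
FEATURE_LINE = re.compile(
    r"^[ \t]*([^=\n]+)=0x[0-9a-fA-F]+<([^>\n]*)>", re.MULTILINE
)


def collect_cpu_flags(boot_text):
    """Return {line name: set of flags} for every feature line found."""
    return {
        match.group(1): set(match.group(2).split(","))
        for match in FEATURE_LINE.finditer(boot_text)
    }


# Two of the lines from the report above, copied verbatim.
sample = """\
Features2=0x1dbae3bf<SSE3,PCLMULQDQ,DTES64,MON,DS_CPL,VMX,EST,TM2,SSSE3,CX16,xTPR,PDCM,PCID,SSE4.1,SSE4.2,x2APIC,POPCNT,TSCDLT,XSAVE,OSXSAVE,AVX>
AMD Features=0x28100800<SYSCALL,NX,RDTSCP,LM>
"""

features = collect_cpu_flags(sample)
all_flags = set().union(*features.values())
print("AVX" in all_flags, "AVX2" in all_flags)  # prints: True False
```

Feeding the whole CPU block from the boot log into `collect_cpu_flags` works the same way; lines that do not follow the `Name=0x...<...>` layout (such as the `VT-x:` and `TSC:` lines) are simply skipped.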
The compiler is gcc13, and the Makefile.rules options are:
NO_AVX2=1
NO_AVX512=1
USE_OPENMP=1
BINARY=64
However, I checked that it fails regardless of whether any of USE_OPENMP, INTERFACE64 or NO_AVX is set. It also fails the same way regardless of the -O level, and the failure happens in both the 0.3.29 release and HEAD at 1533fe49bef51ff49e4358a2687f1e475801f9fd, while everything builds fine with the older 0.3.27.
Not reproducible with gcc14 and TARGET=SANDYBRIDGE on Zen5 hardware, and also not reproducible with gcc4 or gcc9 on actual Sandy Bridge hardware under Linux. However, I still need to build gcc-14 on that old system.
I've tried it with gcc14 instead, and it builds and passes the tests successfully, so this may be a platform-specific miscompilation by gcc13 that does not happen with gcc14. The issue can thus be closed, but if you feel it is worth checking, I suggest trying to reproduce it with gcc13 on that Linux machine and, if it fails, mentioning it in the release notes: gcc13 is the default on Ubuntu 24.04 (the latest LTS release), as well as on FreeBSD stable for Fortran-coded packages.