shankar1729/jdftx

Tests fail with Intel compiler

Closed this issue · 2 comments

I am trying to compile the code using Intel oneAPI 2023.1 (with Intel MPI). The compilation finishes without any problem, but when I run the tests they all fail, and the following appears in the jdftx-stacktrace:

/softw/qsoftw/jdftx/1.7.0-commits-29f622b-mkl_sec-oneapi-2023.1/lib/libjdftx.so(_Z10printStackb+0x20) [0x15220afbcb00]
/softw/qsoftw/jdftx/1.7.0-commits-29f622b-mkl_sec-oneapi-2023.1/lib/libjdftx.so(_Z14stackTraceExiti+0xf) [0x15220afbceff]
/lib/x86_64-linux-gnu/libc.so.6(+0x42520) [0x1521f79ee520]
/softw/qsoftw/jdftx/1.7.0-commits-29f622b-mkl_sec-oneapi-2023.1/lib/libjdftx.so(_ZNK14PeriodicLookupI7vector3IdEE4findIdSt8equal_toIdEEEmS1_T_PKSt6vectorIS6_SaIS6_EET0_+0x180) [0x15220b0d3fe0]
/softw/qsoftw/jdftx/1.7.0-commits-29f622b-mkl_sec-oneapi-2023.1/lib/libjdftx.so(_ZNK10Symmetries14findSpaceGroupERKSt6vectorI7matrix3IiESaIS2_EE+0x67f) [0x15220b237b7f]
/softw/qsoftw/jdftx/1.7.0-commits-29f622b-mkl_sec-oneapi-2023.1/lib/libjdftx.so(_ZN10Symmetries14calcSymmetriesEv+0x839) [0x15220b2369c9]
/softw/qsoftw/jdftx/1.7.0-commits-29f622b-mkl_sec-oneapi-2023.1/lib/libjdftx.so(_ZN10Symmetries5setupERK10Everything+0x44) [0x15220b235e94]
/softw/qsoftw/jdftx/1.7.0-commits-29f622b-mkl_sec-oneapi-2023.1/lib/libjdftx.so(_ZN10Everything5setupEv+0x63) [0x15220b0d5e93]
/softw/qsoftw/jdftx/1.7.0-commits-29f622b-mkl_sec-oneapi-2023.1/bin/jdftx() [0x40b124]
/lib/x86_64-linux-gnu/libc.so.6(+0x29d90) [0x1521f79d5d90]
/lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0x80) [0x1521f79d5e40]
/softw/qsoftw/jdftx/1.7.0-commits-29f622b-mkl_sec-oneapi-2023.1/bin/jdftx() [0x40a805]

The cmake configure options are:

CC=mpiicc CXX=mpiicpc cmake .. \
  -DCMAKE_INSTALL_PREFIX=/softw/qsoftw/jdftx/1.7.0-commits-29f622b-mkl_sec-oneapi-2023.1 \
  -D EnableMPI=yes -D EnableMKL=yes -D EnableLibXC=yes \
  -D GSL_PATH=/softw/libs/intel/gsl/2.7.1-oneapi-2023.1 -D MKL_PATH=${MKLROOT} \
  -D LIBXC_PATH=/softw/libs/intel/libxc/6.2.2-oneapi-2023.1 \
  -DCMAKE_CXX_FLAGS="-xHOST " \
  -DCMAKE_C_FLAGS="-xHOST "

I was able to compile the code with the AMD AOCC compiler, the AOCL library, and OpenMPI, and it works correctly, but I don't know why it doesn't work with the Intel compilers.

We've always had unpredictable issues with Intel compilers and MPI, and never with a clear cause. Our recommendation here has been to use GNU compilers, but link with Intel MKL, for Intel CPUs. This yields the same performance as using the Intel compilers, so there is no point in fighting an uphill battle in getting those to work.
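As a rough sketch of that recommendation, the configure line from the report could be adapted as follows. The option names are the ones used in the original command. The install prefix and the GSL and LibXC locations are hypothetical placeholders. It is also an assumption that CMake locates the MPI installation on its own once the plain GNU compilers are selected.

```shell
# Sketch only: GNU compilers for the build, Intel MKL for BLAS/LAPACK/FFT.
# All three paths below are hypothetical placeholders; adjust to your system.
PREFIX="$HOME/jdftx-gnu-mkl"
GSL_DIR="/path/to/gsl"
LIBXC_DIR="/path/to/libxc"

# MKLROOT is normally set by the oneAPI environment script.
CC=gcc CXX=g++ cmake .. \
  -DCMAKE_INSTALL_PREFIX="$PREFIX" \
  -D EnableMPI=yes -D EnableMKL=yes -D EnableLibXC=yes \
  -D GSL_PATH="$GSL_DIR" -D MKL_PATH="${MKLROOT}" \
  -D LIBXC_PATH="$LIBXC_DIR"
```

The Intel-specific `-xHOST` flags are dropped here because they are not GNU compiler options.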

As for the MPI, both OpenMPI and MPICH variants work fine. I suspect that the issue is not with Intel MPI (which is an MPICH variant), but with the Intel C++ compiler, which the mpiicc/mpiicpc wrappers in your configure line invoke. In particular, the stack trace you posted shows the error occurring in a serial initialization segment of the code.
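To make the frames in the trace easier to read, the mangled names can be passed through `c++filt`. This is a minimal sketch that assumes GNU binutils is installed. The three symbols are copied from the stack trace above.

```shell
# Mangled names copied from the stack trace, outermost call first.
symbols="_ZN10Everything5setupEv
_ZN10Symmetries5setupERK10Everything
_ZN10Symmetries14calcSymmetriesEv"

# c++filt reads mangled names on stdin, one per line, and prints them demangled.
demangled=$(printf '%s\n' "$symbols" | c++filt)
printf '%s\n' "$demangled"
```

This should print `Everything::setup()`, `Symmetries::setup(Everything const&)` and `Symmetries::calcSymmetries()`, which places the crash in the symmetry setup at startup.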

Best,
Shankar

Dear Shankar

Thanks for the response. I was able to compile both the CPU and GPU versions successfully with the GCC compiler.

Best,

Wilver