astojanov/Clover

About some bugs in Clover/include/CloverVector32.h

Closed this issue · 4 comments

Hi,
I tried to run the using-clover-library example shown in README.md. Here is my compile command:
g++ -std=c++11 -mavx2 -I/workspace/Clover/include/ example_clover.c -o exe

Unfortunately, compilation failed at CloverVector32.h:437, 438, 439, and 440 because mul1-4 were not declared. (Lines 500-503 have the same problem.)

I fixed the declaration problem, but then ran into another error:

In file included from /usr/lib/gcc/x86_64-linux-gnu/5/include/x86intrin.h:46:0,
from /usr/include/x86_64-linux-gnu/c++/5/bits/opt_random.h:33,
from /usr/include/c++/5/random:50,
from /usr/include/c++/5/bits/stl_algo.h:66,
from /usr/include/c++/5/algorithm:62,
from /workspace/Clover/include/CloverVector32.h:30,
from example_clover.c:1:
/usr/lib/gcc/x86_64-linux-gnu/5/include/immintrin.h: In static member function 'static uint64_t CloverRandom::get_random_uint64()':
/usr/lib/gcc/x86_64-linux-gnu/5/include/immintrin.h:184:1: error: inlining failed in call to always_inline 'int _rdrand64_step(long long unsigned int*)': target specific option mismatch
_rdrand64_step (unsigned long long *__P)
^
In file included from /workspace/Clover/include/CloverVector.h:38:0,
from /workspace/Clover/include/CloverVector32.h:34,
from example_clover.c:1:
/workspace/Clover/include/CloverRandom.h:55:40: error: called from here
ret = _rdrand64_step(&rnd1);
^
Makefile:5: recipe for target 'all' failed
make: *** [all] Error 1

I am running Clover on an x86-64 machine with AVX2, and the clover executable passes validation.
Thank you for your contribution of a low-precision linear algebra library!
If you could help me with this problem, I would really appreciate it!

Hi @CornerSluggish, the problem here is that you are compiling with only the -mavx2 flag, which causes both errors.

Line CloverVector32.h:437 fails because FMA is not enabled, so the compiler takes the non-FMA code path (which had a minor bug: mul1-4 were not defined, as you already noticed). To enable FMA, pass the -mfma flag, assuming your architecture supports it.
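To illustrate the FMA versus non-FMA split, here is a minimal sketch of how such a compile-time switch can look. It is not the actual Clover code: `mul_add8` is a hypothetical helper, and the only assumption about the toolchain is that it defines the `__AVX__` and `__FMA__` macros when the matching flags are passed (GCC, Clang, and ICC do).

```cpp
#include <cassert>
#include <cstddef>
#if defined(__AVX__)
#include <immintrin.h>
#endif

// Hypothetical helper, not part of Clover:
// computes out[i] = a[i] * b[i] + c[i] for 8 floats.
void mul_add8(const float* a, const float* b, const float* c, float* out) {
    assert(a && b && c && out);
#if defined(__AVX__) && defined(__FMA__)
    // FMA path: a single fused multiply-add, no temporary needed.
    const __m256 r = _mm256_fmadd_ps(_mm256_loadu_ps(a),
                                     _mm256_loadu_ps(b),
                                     _mm256_loadu_ps(c));
    _mm256_storeu_ps(out, r);
#elif defined(__AVX__)
    // Non-FMA path: the product temporary has to be declared in this
    // branch as well, otherwise it only compiles when FMA is enabled.
    const __m256 mul = _mm256_mul_ps(_mm256_loadu_ps(a), _mm256_loadu_ps(b));
    _mm256_storeu_ps(out, _mm256_add_ps(mul, _mm256_loadu_ps(c)));
#else
    // Scalar fallback so the sketch builds without any SIMD flags.
    for (std::size_t i = 0; i < 8; ++i) out[i] = a[i] * b[i] + c[i];
#endif
}
```

A branch that is rarely compiled (here the non-FMA one) is easy to leave broken, which is why building once without -mfma is a useful check.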

CloverRandom.h:55:40 fails because the RDRAND instruction is required. To enable it, use the -mrdrnd flag, if available on your architecture.
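For reference, the "target specific option mismatch" error appears when `_rdrand64_step` is called in a translation unit built without RDRAND enabled. The sketch below shows one way to guard such a call. It is an illustration only, not what CloverRandom.h does: `random_u64` is a hypothetical name, and the `std::random_device` fallback is my own assumption about what a portable alternative could look like.

```cpp
#include <cassert>
#include <cstdint>
#include <random>
#if defined(__RDRND__) && defined(__x86_64__)
#include <immintrin.h>
#endif

// Hypothetical helper, not Clover's CloverRandom.
// Writes a random 64-bit value to *out and returns true on success.
bool random_u64(std::uint64_t* out) {
    assert(out != nullptr);
#if defined(__RDRND__) && defined(__x86_64__)
    // Built with -mrdrnd (or a -march that implies it): use the hardware
    // generator. RDRAND can transiently fail, so retry a few times.
    unsigned long long value = 0;
    for (int attempt = 0; attempt < 10; ++attempt) {
        if (_rdrand64_step(&value)) {
            *out = value;
            return true;
        }
    }
    return false;
#else
    // Built without RDRAND support: fall back to the standard library.
    std::random_device rd;
    const std::uint64_t hi = rd();
    const std::uint64_t lo = rd();
    *out = (hi << 32) | (lo & 0xFFFFFFFFu);
    return true;
#endif
}
```

With a guard like this the file compiles with or without -mrdrnd; without the guard, the flag is mandatory.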

Note that instead of enabling specific instruction sets individually, a better approach is to use the -march argument and specify your architecture. I did most of the tests on the Haswell architecture, so CMakeLists.txt:62 compiles the executable with -O3 -std=c++11 -march=haswell -fopenmp.
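A quick way to see what a given set of flags actually enables is to inspect the compiler's predefined macros. The following is a small sketch for that purpose; `enabled_isa_extensions` is a hypothetical helper name, and it only checks the three extensions relevant to this issue.

```cpp
#include <string>

// Hypothetical helper: lists which of the relevant instruction-set
// extensions were enabled when this file was compiled.
std::string enabled_isa_extensions() {
    std::string s;
#if defined(__AVX2__)
    s += "avx2 ";
#endif
#if defined(__FMA__)
    s += "fma ";
#endif
#if defined(__RDRND__)
    s += "rdrnd ";
#endif
    return s.empty() ? std::string("none") : s;
}
```

Built with -mavx2 alone, this should report only avx2; built with -march=haswell under GCC, it should report all three, which matches the two errors above.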

If your validation passed, it probably means that you are running the code on a Haswell machine or later, and thus compilation was successful. Therefore, please use the appropriate flags for your architecture. Note that the code depends on the Intel MKL libraries, so you might want to try compiling everything with Intel ICC.

Hi astojanov,
Thank you for your quick response and assistance!
I am not very familiar with how to compile SIMD code.
I will try this soon.
Thanks again! m(_ _)m

Hi @astojanov ,
I have compiled and linked successfully using Intel icpc with the following flags:
-march=core-avx2 -fopenmp -O3 -mkl -ipp
Thanks for your suggestion~~

Glad to hear that.