Cannot compile popcnt-avx2-harley-seal.cpp using MSVC 2015
kimwalisch opened this issue · 6 comments
Hi Wojciech,
I use your popcnt-avx2-harley-seal algorithm in my libpopcnt.h. Unfortunately, it fails to compile on Windows using a recent MSVC 2015 compiler version:
C:\Users\kim\Desktop\libpopcnt-master>nmake -f Makefile.msvc
Microsoft (R) Program Maintenance Utility Version 14.00.24210.0
Copyright (C) Microsoft Corporation. All rights reserved.
cl /nologo /W3 /O2 /EHsc /D HAVE_POPCNT /arch:AVX2 /D HAVE_AVX2 test.cpp /Fotest.obj /Fetest.exe
test.cpp
c:\users\kim\desktop\libpopcnt-master\libpopcnt.h(356): error C2676: binary '&': '__m256i' does not define this operator or a conversion to a type acceptable to the predefined operator
c:\users\kim\desktop\libpopcnt-master\libpopcnt.h(356): error C2660: '_mm256_sub_epi8': function does not take 1 arguments
c:\users\kim\desktop\libpopcnt-master\libpopcnt.h(357): error C2676: binary '&': '__m256i' does not define this operator or a conversion to a type acceptable to the predefined operator
c:\users\kim\desktop\libpopcnt-master\libpopcnt.h(357): error C2088: '&': illegal for union
c:\users\kim\desktop\libpopcnt-master\libpopcnt.h(357): error C2660: '_mm256_add_epi8': function does not take 1 arguments
c:\users\kim\desktop\libpopcnt-master\libpopcnt.h(358): error C2676: binary '&': '__m256i' does not define this operator or a conversion to a type acceptable to the predefined operator
c:\users\kim\desktop\libpopcnt-master\libpopcnt.h(369): error C2676: binary '^': '__m256i' does not define this operator or a conversion to a type acceptable to the predefined operator
c:\users\kim\desktop\libpopcnt-master\libpopcnt.h(370): error C2676: binary '&': '__m256i' does not define this operator or a conversion to a type acceptable to the predefined operator
c:\users\kim\desktop\libpopcnt-master\libpopcnt.h(370): error C2088: '&': illegal for union
c:\users\kim\desktop\libpopcnt-master\libpopcnt.h(371): error C2676: binary '^': '__m256i' does not define this operator or a conversion to a type acceptable to the predefined operator
c:\users\kim\desktop\libpopcnt-master\libpopcnt.h(423): error C3861: '_mm256_extract_epi64': identifier not found
c:\users\kim\desktop\libpopcnt-master\libpopcnt.h(424): error C3861: '_mm256_extract_epi64': identifier not found
c:\users\kim\desktop\libpopcnt-master\libpopcnt.h(425): error C3861: '_mm256_extract_epi64': identifier not found
c:\users\kim\desktop\libpopcnt-master\libpopcnt.h(426): error C3861: '_mm256_extract_epi64': identifier not found
NMAKE : fatal error U1077: '"C:\Program Files (x86)\Microsoft Visual Studio 14.0\VC\BIN\cl.EXE"' : return code '0x2'
Stop.
It seems like the MSVC compiler has poor AVX2 support (e.g. the _mm256_extract_epi64 intrinsic seems to be missing). Have you ever tried compiling your sse-popcount project using MSVC?
Here is a link to my libpopcnt.h
Thanks,
Kim
Kim, although I have never compiled this code with MSVC, I have written AVX2 code at work, where we use MSVC.
It seems that MSVC has no predefined operators &, ^ and | for AVX2 types. Try replacing them with the intrinsics _mm256_{and,xor,or}_si256; that should help.
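For illustration, here is a minimal sketch of what that replacement could look like for a carry-save adder step, the kind of expression that triggers error C2676 above. The function name csa and its parameter names are assumptions made for this example and are not copied from libpopcnt.h. The __AVX2__ guard only keeps the sketch compiling when AVX2 is not enabled.

```cpp
#include <cstdint>

#if defined(__AVX2__)
#include <immintrin.h>

// Hypothetical carry-save adder step written only with intrinsics,
// so it does not depend on the GCC/Clang vector operator extensions.
// For every bit position: a + b + c == 2 * h + l.
inline void csa(__m256i& h, __m256i& l, __m256i a, __m256i b, __m256i c)
{
    const __m256i u = _mm256_xor_si256(a, b);     // u = a ^ b
    h = _mm256_or_si256(_mm256_and_si256(a, b),   // h = (a & b) |
                        _mm256_and_si256(u, c));  //     (u & c)
    l = _mm256_xor_si256(u, c);                   // l = u ^ c
}
#endif
```

Written this way, the same source should build with GCC, Clang and MSVC, at the cost of being harder to read than the operator form.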
Thanks, do you also know a workaround for the missing _mm256_extract_epi64 intrinsic on MSVC?
Best regards,
Kim
Kim, unfortunately I don't know, but I will check it for you on Monday, at work.
For now, I would try with two _mm256_extractf128_si256 (if it's supported...) followed by _mm_extract_epi64. There is a chance that the impact on performance will be negligible.
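A rough sketch of that idea, untested on MSVC 2015: the helper name sum_lanes_extract is made up for this example, and the code assumes a 64-bit x86 target, since _mm_extract_epi64 is only available there.

```cpp
#include <cstdint>

#if defined(__AVX2__) && (defined(__x86_64__) || defined(_M_X64))
#include <immintrin.h>

// Hypothetical helper: sums the four 64-bit lanes of a 256-bit register
// by splitting it into two 128-bit halves first.
inline uint64_t sum_lanes_extract(__m256i total)
{
    const __m128i lo = _mm256_extractf128_si256(total, 0);  // lanes 0 and 1
    const __m128i hi = _mm256_extractf128_si256(total, 1);  // lanes 2 and 3

    return static_cast<uint64_t>(_mm_extract_epi64(lo, 0)) +
           static_cast<uint64_t>(_mm_extract_epi64(lo, 1)) +
           static_cast<uint64_t>(_mm_extract_epi64(hi, 0)) +
           static_cast<uint64_t>(_mm_extract_epi64(hi, 1));
}
#endif
```

Since this runs once at the end of the counting loop, the extra extracts are unlikely to matter for large inputs.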
I have found a pure C++ solution:
    uint64_t* total64 = (uint64_t*) &total;
    return total64[0] +
           total64[1] +
           total64[2] +
           total64[3];
The performance should be the same as your original code.
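If the pointer cast is a concern, one possible variant is to copy the register into a plain array with an unaligned store and sum that instead. This is only a sketch; the helper name sum_lanes_store is an assumption for this example and does not come from libpopcnt.h.

```cpp
#include <cstdint>

#if defined(__AVX2__)
#include <immintrin.h>

// Hypothetical helper: stores the 256-bit register into four uint64_t
// values, then adds them up with ordinary scalar code.
inline uint64_t sum_lanes_store(__m256i total)
{
    uint64_t lanes[4];
    // Unaligned store, so the array needs no special alignment.
    _mm256_storeu_si256(reinterpret_cast<__m256i*>(lanes), total);
    return lanes[0] + lanes[1] + lanes[2] + lanes[3];
}
#endif
```

Compilers will typically turn this into much the same code as the cast version, but that is worth checking in the generated assembly rather than assuming.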
Yes. Actually, the extract intrinsic is not all that useful. I think it should only be used when you seek to extract one value from the register.
@WojciechMula I was able to compile your popcnt-avx2-harley-seal algorithm using MSVC by adding the following code:
#if defined(_MSC_VER)

/// Define missing & operator overload for __m256i type on MSVC compiler
inline __m256i operator&(const __m256i a, const __m256i b)
{
    return _mm256_and_si256(a, b);
}

/// Define missing | operator overload for __m256i type on MSVC compiler
inline __m256i operator|(const __m256i a, const __m256i b)
{
    return _mm256_or_si256(a, b);
}

/// Define missing ^ operator overload for __m256i type on MSVC compiler
inline __m256i operator^(const __m256i a, const __m256i b)
{
    return _mm256_xor_si256(a, b);
}

#endif /* _MSC_VER */