sz_capabilities might be incorrect for AVX512
Closed this issue ยท 4 comments
The sz_capabilities
function seems to return incorrect values for certain AVX512 instruction sets.
lscpu gives:
Architecture: x86_64
CPU op-mode(s): 32-bit, 64-bit
Byte Order: Little Endian
Address sizes: 36 bits physical, 48 bits virtual
CPU(s): 4
On-line CPU(s) list: 0-3
Thread(s) per core: 1
Core(s) per socket: 4
Socket(s): 1
Vendor ID: GenuineIntel
CPU family: 6
Model: 94
Model name: Intel(R) Core(TM) i5-6500T CPU @ 2.50GHz
Stepping: 3
CPU MHz: 2496.000
CPU max MHz: 2496.0000
BogoMIPS: 4992.00
Hypervisor vendor: Windows Subsystem for Linux
Virtualization type: container
Flags: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse
sse2 ss ht tm pbe syscall nx pdpe1gb rdtscp lm pni pclmulqdq dtes64 est tm2 ssse3 fma cx16 xtpr pdcm
pcid sse4_1 sse4_2 movbe popcnt aes xsave osxsave avx f16c rdrand hypervisor lahf_lm abm
3dnowprefetch fsgsbase tsc_adjust bmi1 hle avx2 smep bmi2 erms invpcid rtm mpx rdseed adx smap
clflushopt ibrs ibpb stibp ssbd
But the python stringzilla.__capabilities__
gives:
'serial,avx2,avx512vbmi,gfni,'
If I understand cpuid
correctly, info1
refers to calling cpuid
with eax==1
(From https://en.wikipedia.org/wiki/CPUID).
Then the following code in sz_capabilites
is incorrect.
Lines 78 to 83 in 1a4b05d
So the info1
s should instead be info7
s.
As currently:
supports_avx512vbmi
: Actually representspclmulqdq
(Supported).supports_avx512vbmi2
: Actually representssmx
(Not Supported).supports_gfni
: Actually representstm2
(Supported).
Yes, @ashbob999! You may be right. If you have a patch, will be happy to test on my end as well ๐ค
I haven't made a patch yet, but it should be simply enough to do.
Just replace the 3 info1
s with info7
s.
I can create a PR with the patch, and test it myself, but it won't be till the weekend.
Yes, that fixes the issue! It would be great if we had similar CPU-capability checks for Arm.
๐ This issue has been resolved in version 3.8.2 ๐
The release is available on GitHub release
Your semantic-release bot ๐ฆ๐