facebook/zstd

[clang][windows] x64 assembly in i686 code.

Nemirtingas opened this issue · 3 comments

Here, the code uses x64 registries while the define works on both x64 and i686, should add another else to build 32bits code.

zstd/lib/common/cpu.h

Lines 55 to 87 in 2dddf09

/* Clang compiler has a bug (fixed in https://reviews.llvm.org/D101338) in
* which the `__cpuid` intrinsic does not save and restore `rbx` as it needs
* to due to being a reserved register. So in that case, do the `cpuid`
* ourselves. Clang supports inline assembly anyway.
*/
U32 n;
__asm__(
"pushq %%rbx\n\t"
"cpuid\n\t"
"popq %%rbx\n\t"
: "=a"(n)
: "a"(0)
: "rcx", "rdx");
if (n >= 1) {
U32 f1a;
__asm__(
"pushq %%rbx\n\t"
"cpuid\n\t"
"popq %%rbx\n\t"
: "=a"(f1a), "=c"(f1c), "=d"(f1d)
: "a"(1)
:);
}
if (n >= 7) {
__asm__(
"pushq %%rbx\n\t"
"cpuid\n\t"
"movq %%rbx, %%rax\n\t"
"popq %%rbx"
: "=a"(f7b), "=c"(f7c)
: "a"(7), "c"(0)
: "rdx");
}

diff --git a/lib/common/cpu.h b/lib/common/cpu.h 
index 5d514ccb..dd79efb7 100644 
--- a/lib/common/cpu.h 
+++ b/lib/common/cpu.h 
@@ -60,2 +60,3 @@ MEM_STATIC ZSTD_cpuid_t ZSTD_cpuid(void) 
     U32 n; 
+#if defined(_M_X64) 
     __asm__( 
@@ -87,2 +88,31 @@ MEM_STATIC ZSTD_cpuid_t ZSTD_cpuid(void) 
     } 
+#else 
+    __asm__( 
+        "push %%ebx\n\t" 
+        "cpuid\n\t" 
+        "pop %%ebx\n\t" 
+        : "=a"(n) 
+        : "a"(0) 
+        : "ecx", "edx"); 
+    if (n >= 1) { 
+      U32 f1a; 
+      __asm__( 
+          "push %%ebx\n\t" 
+          "cpuid\n\t" 
+          "pop %%ebx\n\t" 
+          : "=a"(f1a), "=c"(f1c), "=d"(f1d) 
+          : "a"(1) 
+          :); 
+    } 
+    if (n >= 7) { 
+      __asm__( 
+          "push %%ebx\n\t" 
+          "cpuid\n\t" 
+          "mov %%ebx, %%eax\n\t" 
+          "pop %%ebx" 
+          : "=a"(f7b), "=c"(f7c) 
+          : "a"(7), "c"(0) 
+          : "edx"); 
+    } 
+#endif 
 #endif 

Is that related to #4118 ?

Yes, it is. I'm using clang 17, should update zstd to not even bother with this assembly issue :)

Would the 32 bit code work on 64 bit cpu? It is compatible with with 32 bit code so I guess it would. It would also reduce a complexity of conditional compilation. Cpuid is not some speed critical instruction so it wouldn't make much difference, if any.