FMA4 used even if unsupported
olifre opened this issue · 6 comments
With current git HEAD, the FMA4 code path is used (unconditionally?), and on my not-so-very-old Haswell machine I get:
Program received signal SIGILL, Illegal instruction.
[Switching to Thread 0x7ffea7b64700 (LWP 17272)]
0x00007ffea8423378 in nnedi3_e0_m16_FMA4 () from /usr/lib64/libnnedi3.so
If you need further info (/proc/cpuinfo etc.) just ask.
Please paste the output of cpuid --one-cpu --raw (http://www.etallen.com/cpuid.html). The only thing I can think of is that I'm doing the CPU feature detection slightly wrong. I need that output to verify.
Here you go:
CPU:
0x00000000 0x00: eax=0x0000000d ebx=0x756e6547 ecx=0x6c65746e edx=0x49656e69
0x00000001 0x00: eax=0x000306c3 ebx=0x07100800 ecx=0x7ffafbff edx=0xbfebfbff
0x00000002 0x00: eax=0x76036301 ebx=0x00f0b5ff ecx=0x00000000 edx=0x00c10000
0x00000003 0x00: eax=0x00000000 ebx=0x00000000 ecx=0x00000000 edx=0x00000000
0x00000004 0x00: eax=0x1c004121 ebx=0x01c0003f ecx=0x0000003f edx=0x00000000
0x00000004 0x01: eax=0x1c004122 ebx=0x01c0003f ecx=0x0000003f edx=0x00000000
0x00000004 0x02: eax=0x1c004143 ebx=0x01c0003f ecx=0x000001ff edx=0x00000000
0x00000004 0x03: eax=0x1c03c163 ebx=0x03c0003f ecx=0x00001fff edx=0x00000006
0x00000005 0x00: eax=0x00000040 ebx=0x00000040 ecx=0x00000003 edx=0x00042120
0x00000006 0x00: eax=0x00000077 ebx=0x00000002 ecx=0x00000009 edx=0x00000000
0x00000007 0x00: eax=0x00000000 ebx=0x000027ab ecx=0x00000000 edx=0x00000000
0x00000008 0x00: eax=0x00000000 ebx=0x00000000 ecx=0x00000000 edx=0x00000000
0x00000009 0x00: eax=0x00000000 ebx=0x00000000 ecx=0x00000000 edx=0x00000000
0x0000000a 0x00: eax=0x07300403 ebx=0x00000000 ecx=0x00000000 edx=0x00000603
0x0000000b 0x00: eax=0x00000001 ebx=0x00000002 ecx=0x00000100 edx=0x00000007
0x0000000b 0x01: eax=0x00000004 ebx=0x00000008 ecx=0x00000201 edx=0x00000007
0x0000000c 0x00: eax=0x00000000 ebx=0x00000000 ecx=0x00000000 edx=0x00000000
0x0000000d 0x00: eax=0x00000007 ebx=0x00000340 ecx=0x00000340 edx=0x00000000
0x0000000d 0x02: eax=0x00000100 ebx=0x00000240 ecx=0x00000000 edx=0x00000000
0x0000000d 0x3e: eax=0x00000000 ebx=0x00000000 ecx=0x00000000 edx=0x00000000
0x80000000 0x00: eax=0x80000008 ebx=0x00000000 ecx=0x00000000 edx=0x00000000
0x80000001 0x00: eax=0x00000000 ebx=0x00000000 ecx=0x00000021 edx=0x2c100800
0x80000002 0x00: eax=0x65746e49 ebx=0x2952286c ecx=0x726f4320 edx=0x4d542865
0x80000003 0x00: eax=0x37692029 ebx=0x3139342d ecx=0x20514d30 edx=0x20555043
0x80000004 0x00: eax=0x2e322040 ebx=0x48473039 ecx=0x0000007a edx=0x00000000
0x80000005 0x00: eax=0x00000000 ebx=0x00000000 ecx=0x00000000 edx=0x00000000
0x80000006 0x00: eax=0x00000000 ebx=0x00000000 ecx=0x01006040 edx=0x00000000
0x80000007 0x00: eax=0x00000000 ebx=0x00000000 ecx=0x00000000 edx=0x00000100
0x80000008 0x00: eax=0x00003027 ebx=0x00000000 ecx=0x00000000 edx=0x00000000
0x80860000 0x00: eax=0x00000007 ebx=0x00000340 ecx=0x00000340 edx=0x00000000
0xc0000000 0x00: eax=0x00000007 ebx=0x00000340 ecx=0x00000340 edx=0x00000000
It's an Intel(R) Core(TM) i7-4910MQ CPU @ 2.90GHz, by the way (which turbos up to 4.1 GHz).
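For reference, the vendor and model name can be read straight out of the raw dump above. The following is a minimal Python sketch that only decodes the register values as posted (leaf 0 for the vendor, leaves 0x80000002 to 0x80000004 for the brand string). The helper names are made up for illustration and are not part of cpuid or nnedi3.

```python
import struct

# Register values copied from the raw cpuid dump in this thread.
# Leaf 0: the vendor string is stored in the order EBX, EDX, ECX.
LEAF0_EBX, LEAF0_ECX, LEAF0_EDX = 0x756E6547, 0x6C65746E, 0x49656E69

# Leaves 0x80000002..0x80000004: brand string, 16 bytes per leaf,
# stored in the order EAX, EBX, ECX, EDX.
BRAND_LEAVES = [
    (0x65746E49, 0x2952286C, 0x726F4320, 0x4D542865),
    (0x37692029, 0x3139342D, 0x20514D30, 0x20555043),
    (0x2E322040, 0x48473039, 0x0000007A, 0x00000000),
]


def decode_vendor(ebx, ecx, edx):
    """Hypothetical helper: rebuild the 12-byte vendor string."""
    return struct.pack("<3I", ebx, edx, ecx).decode("ascii")


def decode_brand(leaves):
    """Hypothetical helper: rebuild the NUL-terminated brand string."""
    raw = b"".join(struct.pack("<4I", *regs) for regs in leaves)
    return raw.split(b"\x00", 1)[0].decode("ascii").strip()


vendor = decode_vendor(LEAF0_EBX, LEAF0_ECX, LEAF0_EDX)
brand = decode_brand(BRAND_LEAVES)
print(vendor)
print(brand)
```

Each 32-bit register holds four ASCII characters in little-endian order, which is why the values are packed with "<" before decoding.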
Okay. The FMA4 bit is not set, so that's not it. I suppose you'll just have to use a debugger to find out what happens, because I don't see anything wrong in the code. Do you need detailed instructions for using gdb?
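The check behind that statement can be reproduced from the dump with a few lines of Python. This is a sketch that only decodes the posted register values, not the plugin's actual detection code. The bit positions are the documented ones: FMA4 is CPUID leaf 0x80000001, ECX bit 16, and FMA3 is CPUID leaf 0x00000001, ECX bit 12. The function name is made up for illustration.

```python
# ECX values copied from the raw cpuid dump in this thread.
LEAF_00000001_ECX = 0x7FFAFBFF
LEAF_80000001_ECX = 0x00000021

FMA3_BIT = 12  # CPUID.01H:ECX[12]
FMA4_BIT = 16  # CPUID.80000001H:ECX[16]


def has_bit(value, bit):
    """Hypothetical helper: True if the given bit is set in a register value."""
    return (value >> bit) & 1 == 1


has_fma3 = has_bit(LEAF_00000001_ECX, FMA3_BIT)
has_fma4 = has_bit(LEAF_80000001_ECX, FMA4_BIT)

print("FMA3 reported:", has_fma3)
print("FMA4 reported:", has_fma4)
```

With the values above, 0x00000021 has only bits 0 and 5 set, so bit 16 is clear and the CPU does not advertise FMA4, while bit 12 of leaf 1 ECX is set, so FMA3 is advertised, as expected for Haswell.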
Well, this is extremely strange... I just tried it again on the same machine, with the same video and the same vapoursynth script, inside mpv, and I no longer see the SIGILL from nnedi3 running FMA4.
Microcode did not / should not have changed. I did not recompile either vapoursynth or the nnedi3 plugin.
With valgrind, I also see only tons of python3-invalid-reads (probably some false positives which need a suppression file, I am not that experienced with python itself) and nothing from the nnedi3 library.
I am out of explanations, but this was reproducible just yesterday on the same machine...
Let's assume the problem fixed itself. :)
You're right, let's assume that for now - if it ever pops back up again, I'll for sure gdb it directly ;-).