NVIDIAGameWorks/RayTracingDenoiser

SSE instruction throws illegal instruction.

Hearwindsaying opened this issue · 9 comments

Are there any CPU requirements? My machine has two E5-2665 CPUs, which should support SSE instructions.
For any sample scene, the program always crashes at these SSE instructions:
[Screenshot: Capture]

It turns out that my CPU does not support AVX2, so an illegal instruction exception is thrown. Is there a fallback layer for SSE?
Thanks!
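For anyone hitting the same crash, one way to confirm whether the CPU is the limiting factor is to query it at runtime. The following is a minimal sketch, not part of the NRD code base: the helper name `cpu_has_avx2` is hypothetical, and the MSVC branch only reads the raw CPUID feature bit without checking whether the OS has enabled the AVX register state.

```cpp
#if defined(_MSC_VER)
#include <intrin.h>  // __cpuid, __cpuidex
#endif

// Hypothetical helper (not from the NRD sources): returns true if the CPU
// running this program reports AVX2 support, false otherwise.
bool cpu_has_avx2()
{
#if defined(_MSC_VER) && (defined(_M_X64) || defined(_M_IX86))
    int regs[4] = {0, 0, 0, 0};

    // Leaf 0: EAX holds the highest supported standard CPUID leaf.
    __cpuid(regs, 0);
    if (regs[0] < 7)
        return false;

    // Leaf 7, sub-leaf 0: EBX bit 5 is the AVX2 feature flag.
    // Assumption: OS support for the YMM state is not verified here.
    __cpuidex(regs, 7, 0);
    return (regs[1] & (1 << 5)) != 0;
#elif (defined(__GNUC__) || defined(__clang__)) && \
      (defined(__x86_64__) || defined(__i386__))
    // GCC and Clang provide built-ins for x86 feature detection.
    __builtin_cpu_init();
    return __builtin_cpu_supports("avx2") != 0;
#else
    // Unknown compiler or non-x86 target: report "not supported".
    return false;
#endif
}
```

Calling `cpu_has_avx2()` at startup and printing the result shows quickly whether an AVX2 build can run on a given machine. On a Sandy Bridge generation part such as the E5-2665 it should return false.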

#define PLATFORM_INTRINSIC_SSE3 0 // NOTE: +SSSE3
#define PLATFORM_INTRINSIC_SSE4 1
#define PLATFORM_INTRINSIC_AVX1 2 // NOTE: +FP16C
#define PLATFORM_INTRINSIC_AVX2 3 // NOTE: +FMA3

#if (defined(_MSC_VER) && (_MSC_VER >= 1920)) || defined(__clang__) || defined(__GNUC__)
// TODO: disable __m256d emulation if VS2019 is used
#define PLATFORM_INTRINSIC PLATFORM_INTRINSIC_AVX2
#else
#define PLATFORM_INTRINSIC PLATFORM_INTRINSIC_SSE4
#endif

You can change PLATFORM_INTRINSIC to something that matches your CPU.

It's in "platform.h"

Thanks for your advice.
Actually I have done this before:
#define PLATFORM_INTRINSIC PLATFORM_INTRINSIC_SSE3
I also tried PLATFORM_INTRINSIC_SSE4 and PLATFORM_INTRINSIC_AVX1.

However, I got compiler errors along the lines of "cannot convert v4d to __m256d". (Sorry, I have forgotten the details and do not have access to my project at the moment, but as far as I remember all of the errors were conversion failures of this kind.)
But the original code:
#define PLATFORM_INTRINSIC PLATFORM_INTRINSIC_AVX2
does compile in VS 2019 without warnings.

Any insights?
Thanks!

Apparently I am facing the same issue as @Hearwindsaying.

If you simply set PLATFORM_INTRINSIC to PLATFORM_INTRINSIC_SSE4 (or lower) when building with MSVC, a bunch of type conversion errors like this occur:

'__m256d _mm256_sin_pd(__m256d)': cannot convert argument 1 from 'const emu__m256d' to '__m256d'

It turns out that the fallback implementations for the _mm256_XXX_pd functions don't get pulled in, because they're wrapped in the PLATFORM_HAS_SVML_INTRISICS compile-time condition (see MathLib_d.h).

Disabling the SVML guard altogether breaks the single-precision routines, like so:

'_mm_tan_ps': ambiguous call to overloaded function

A somewhat working workaround is to disable the check just in MathLib_d.h for non-AVX builds.

However, this doesn't solve the initial issue: the example code still crashes with an illegal instruction, with the stack pointing to

>	09_RayTracing_NRD.exe!Zbuffer::`dynamic initializer for 'DepthNear''() Line 131	C++

in my case. Any ideas?

[EDIT]

After some more poking around I've found that the AVX instruction set is also enabled in the MSVC project settings by default.
Removing the /arch switch solves the issue. This has to be done for the Sample projects.

Thanks for digging into this. I hope it will be fixed in the next submit:

  • platform.h takes ARCH from generated VCXPROJ files
  • default ARCH changed from AVX to SSE4.1
  • resolved "__m256" vs "emu__m256" mystery
  • I tried SSE4.1 / AVX / AVX2 in VS2017 / VS2019, it compiled \O/

Great news, thank you 👍

Updated. Please check and close if it works as expected.

Works fine for me.

Works for me! Thanks to all folks!