google/clspv

Ternary operator on AMD vk drivers.

stolk opened this issue · 3 comments

So, I am not sure if this is a bug in AMD drivers, or in clspv, but when I use this OpenCL construct:

const int8_t face = fac[rindex];
....
const int8_t bounced = face >= 0 ? 1 : 0;
...
const half ox = bounced ? hitx[rindex] : li[12];

then the values of ox can become 0 when they should not be, but only on AMD.

On NVIDIA and Intel GPUs, the code is fine.

The code is also fine when executed as OpenCL kernel.

But it goes awry when going through clspv on AMD GPU as vulkan spirv kernel.

I made sure to enable the relevant features (listed here by structure type):

VK_STRUCTURE_TYPE_PHYSICAL_DEVICE_SHADER_FLOAT16_INT8_FEATURES,

VK_STRUCTURE_TYPE_PHYSICAL_DEVICE_16BIT_STORAGE_FEATURES,

VK_STRUCTURE_TYPE_PHYSICAL_DEVICE_8BIT_STORAGE_FEATURES,

VK_STRUCTURE_TYPE_PHYSICAL_DEVICE_VARIABLE_POINTERS_FEATURES,

The code also works if I replace the ternary operator with a mix instruction:

const value_t ox = mix(li[12], hitx[rindex], bounced);
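For reference, OpenCL defines mix(x, y, a) as x + (y - x) * a, so when bounced is restricted to 0 or 1 the mix form should select the same operand as the ternary form. A minimal plain-C sketch of that equivalence follows. It assumes float in place of half (plain C has no half type), and the helper names mix_f, select_ternary and forms_agree are hypothetical, not taken from the original kernel:

```c
/* Plain-C stand-in for OpenCL's mix(x, y, a) = x + (y - x) * a.
   Assumption: float is used here instead of half. */
static float mix_f(float x, float y, float a) {
    return x + (y - x) * a;
}

/* The ternary form from the report: pick `hit` when bounced is non-zero. */
static float select_ternary(int bounced, float hit, float fallback) {
    return bounced ? hit : fallback;
}

/* Returns 1 if both forms agree for bounced = 0 and bounced = 1. */
static int forms_agree(float hit, float fallback) {
    for (int bounced = 0; bounced <= 1; ++bounced) {
        float t = select_ternary(bounced, hit, fallback);
        float m = mix_f(fallback, hit, (float)bounced);
        if (t != m) {
            return 0;
        }
    }
    return 1;
}
```

Note the argument order: the fallback value (li[12] in the report) comes first and the selected value (hitx[rindex]) second, matching the mix call above. The two forms are only interchangeable while bounced stays exactly 0 or 1, and the check uses values for which the float arithmetic is exact.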

I tried diffing the spirv disassembly output, but the deltas are somehow very large.

I tried different transpiler options, to no avail. Last used:

CLSPVFLAGS = -g -O0 --fp16 --int8 --spv-version=1.5 --cl-native-math -DWGSZ=$(WGSZ)

It happens at other optimization levels too, and without native-math as well. It also happens with different SPIR-V versions.

Observed on 3 different Radeon RX models.

Vulkan validation layer shows no issues.

OS: Ubuntu 23.04

Can you try with --decorate-nonuniform and see if that fixes your problem? You'll need to enable the descriptor indexing features on the Vulkan side too (specifically shaderStorageBufferArrayNonUniformIndexing assuming these end up in storage buffers).

Yep! With that flag, the issue goes away.

Do you know if this is caused by the amd driver, or by clspv?

It is a clspv issue, but since there might be a performance cost (and most cases don't need this) we don't enable it by default. Vulkan requires that non-uniform descriptor selection be annotated as such.