GameTechDev/ISPCTextureCompressor

ISPC 1.14 issues and suggested fixes/workarounds

aras-p opened this issue · 3 comments

At Unity we're finding that just updating the underlying ISPC compiler to 1.14 version gives a small compression speed increase (3-5% for BC7 & BC6H). That's cool! However the source code needs some fixes:

Integer type defs

ISPC now defines sized integer types, so ispc_texcomp/kernel.ispc needs removal of:

typedef unsigned int8 uint8;
typedef unsigned int32 uint32;
typedef unsigned int64 uint64;

(a similar change is done in #27)

ASTC dual plane bool

Not sure if due to new ISPC, or due to more recent C++ compiler, but the ASTC compressor dual plane flag was producing wrong results. Looks like ISPC produces 0 and 255 values for "bool", but some Clang optimizations assume it will only contain 0 and 1 values. Then the C++ code that does int D = !!block->dual_plane; and expects to produce 0 or 1 ends up producing 0 or 255 too, which when going into the ASTC block bitfields leads to much hilarity. Changing bool dual_plane; to uint8_t dual_plane; in ispc_texcomp_astc.cpp; and uniform bool dual_plane; to uniform uint8_t dual_plane; in kernel_astc.ispc fixes the issue.

ASTC solid color blocks

Solid color blocks in ASTC formats are encoded wrongly. In kernel_astc.ispc, ls_refine_scale() function for solid color blocks, sum_w and sum_ww can be zeroes, which makes sgesv2 return NaNs in xx[0] and xx[1], resulting in NaN scale too. Changing this:

if (scale > 0.9999) scale = 0.9999;
if (scale < 0) scale = 0;

to use clamp function fixes the issue:

scale = clamp(scale, 0.0f, 0.9999f); // note: clamp also takes care of possible NaNs

The clamp() fix does not work on aarch64 (I tried it with ISPC 1.13, 1.14 and 1.15), it behaves differently for NaNs than on x64. Workaround:

if ( !isnan( scale ) )
{
    scale = clamp(scale, 0.0f, 0.9999f); // note: clamp also takes care of possible NaNs
}
else
{
    scale = 0.0f;
}

On a side note, the change from uniform bool dual_plane; to uniform uint8_t dual_plane; in ispc_texcomp_astc.cpp, line 1301 has not been integrated.

thanks for catching this. I've added a check for divide by zero to avoid. i'll follow up with ispc team to make the behavior more consistent in the future.