MrUnbelievable92/MaxMath

maxmath.cbrt returns NaN for an input value of 0

chadefranklin opened this issue · 2 comments

As the title says, maxmath.cbrt() returns NaN for an input of 0. This happens both when the method is Burst-compiled and when it is not.

I also wanted your opinion on an issue with the explicit fnmadd_ps instructions. Is it wise to use these instructions directly rather than let Burst handle using fused instructions automatically when using FloatMode.Fast (which it does do, though differently it seems)? For my use case, I need to use FloatMode.Strict and the explicit use of fused instructions sort of goes against that. Maybe it would be worth having a separate fastcbrt() method.

Edit:
Looking further into the fused instruction issue, it seems that using the fused instructions directly leads to less optimal code than leaving them out? Could you take a closer look at this?

'cbrt(0f)' returning NaN (and 'rcbrt(0f/d)' not returning NaN) is indeed an issue - fixed in the upcoming release.

We looked at the generated code. Just to recap:

  • Regarding explicit 'fmadd' and friends: It's always advantageous to use them when possible. Unfortunately, Unity.Burst still does not support deterministic compilation, so this library will not either.
  • Indeed, LLVM sometimes chooses to generate more RISC-like code (explicit load and broadcast instead of memory operands). This generates more instructions, but the performance remains the same, except for possible instruction decode bottlenecks. Additionally, 'Xse.fdadd', i.e. "fake" fused-divide-add, uses the rcpps instruction plus some Newton-Raphson refinement, replacing a divps instruction with 5 instructions. The latency is a little higher on the most recent CPUs, but the throughput is much higher, which is relevant for tight loops, which is the only place where performance matters anyways.
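The reciprocal-plus-Newton-Raphson idea from the last point can be sketched in scalar C. This is an illustration under stated assumptions, not the library's implementation: `approx_rcp` is a hypothetical stand-in for rcpps that emulates a roughly 12-bit estimate, and `fdadd_sketch` is an invented name for computing a / b + c without a division.

```c
/* Hypothetical stand-in for rcpps: a reciprocal estimate with a relative
   error of about 2^-12, emulated here by perturbing the exact result. */
static float approx_rcp(float d) {
    return (1.0f / d) * (1.0f + 1.0f / 4096.0f);
}

/* One Newton-Raphson step for f(x) = 1/x - d:
   x1 = x0 * (2 - d * x0). A relative error e in x0 becomes about e^2. */
static float refine_rcp(float d, float x0) {
    return x0 * (2.0f - d * x0);
}

/* "Fake" divide-add: a / b + c using the refined reciprocal
   instead of a division instruction. */
static float fdadd_sketch(float a, float b, float c) {
    float r = refine_rcp(b, approx_rcp(b));
    return a * r + c;
}
```

Calling `fdadd_sketch(1.0f, 3.0f, 2.0f)` gives a value very close to `1.0f / 3.0f + 2.0f`, but not necessarily bit-identical to it, which is one more reason this style of code does not suit strict floating-point requirements.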