XboxDev/nxdk-pdclib

math.h is not optimized for Xbox

Closed this issue · 2 comments

I was surprised to find that trunc is done using x87 by modifying the control word.

Xbox should have SSE, so we can use faster methods like these: http://wurstcaptures.untergrund.net/assembler_tricks.html

This isn't necessarily much faster, but it should hopefully give shorter code that is easier to read (and step through).

We should also review other math.h functions like round. This possibly also affects #7 (future floor implementation).

I think we can safely ignore any precision issues beyond 32 / 64 bit IEEE float.

I looked into those a bit (especially the SSE-based trunc because it's so much simpler), trying them out on VS2017. Unfortunately, they don't really deliver on their promise - they only compute the correct integer if that integer is representable by a 32-bit integer type (and even that's not entirely true). This could be fine for something like MSVC's _ftol, but as a replacement for truncf, this is a pretty serious flaw imho: trunc_SSE(INT_MAX) produces a negative number and trunc_SSE(FLT_MAX) produces a completely bogus result, while truncf() is correct for both on VS2017. It's even worse for INFINITY.
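For illustration, here is a minimal sketch of how the fast int32 round trip could be kept while avoiding the failure cases described above. It assumes float is IEEE 754 binary32. The name trunc_guarded is hypothetical and not part of nxdk-pdclib, and whether the cast actually compiles to an SSE conversion instruction depends on the compiler and its flags.

```c
#include <stdint.h>

/* Hypothetical helper (not part of nxdk-pdclib): truncate toward zero using
 * the float -> int32 -> float round trip, but only where that is safe.
 * Assumes float is IEEE 754 binary32. */
static float trunc_guarded(float x)
{
    /* 2^23: at or above this magnitude a binary32 value has no fractional
     * bits, so it is already an integer. The negated comparison also
     * catches NaN (every comparison with NaN is false) and +/-infinity. */
    const float no_fraction = 8388608.0f;

    if (!(x < no_fraction && x > -no_fraction)) {
        return x;
    }

    /* Safe here: |x| < 2^23 always fits in int32_t.
     * The float -> integer cast truncates toward zero. */
    float t = (float)(int32_t)x;

    /* The integer 0 carries no sign, so rebuild a signed zero:
     * x * 0.0f is a zero with the sign of x (e.g. -0.5f gives -0.0f). */
    if (t == 0.0f) {
        return x * 0.0f;
    }
    return t;
}
```

With the guard in place, large inputs such as 2147483648.0f (INT_MAX rounded to float), FLT_MAX and INFINITY are returned unchanged instead of going through the out-of-range conversion. The extra comparison and branch eat into whatever speed advantage the trick had, so this would need measuring before it could be called a win.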

Looking around, the control word based approach (while ugly and probably slow af) seems to be pretty common - I really wish Intel would've done a better job with their FPU, other platforms seem to have much nicer instructions for this.

@thrimbor is right. I don't think we need this as a reminder. Everyone working on homebrew should know better than to fall back to libc for high-performance code (and even then it's probably rarely an issue).

Closing.