LaihoE/SIMD-itertools

Further performance "improvements" using intrinsics

PaulDotSH opened this issue · 3 comments

Hi, would a PR using intrinsics be allowed? Since this crate already requires nightly, I was thinking that a very small performance improvement might come from using intrinsics. However, my hardware isn't really suited to detecting small performance differences: I ran the benchmarks twice and got different results each time. Any improvement from intrinsics should be very small anyway.

Before

    if arr.is_empty() {
        return true;
    }

After

    if std::intrinsics::unlikely(arr.is_empty()) {
        return true;
    }

I checked on Godbolt, and apparently LLVM is smart enough to already assume that the code usually won't be called on an empty array: both examples generate the same assembly.
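For anyone who wants to repeat the comparison without the nightly-only intrinsic, here is a minimal sketch, assuming the common `#[cold]` trick as a stable stand-in for `std::intrinsics::unlikely`. The names `cold_path`, `unlikely`, and `all_equal_sketch` are hypothetical and not part of this crate, and the function body is a plain scalar check rather than the crate's SIMD code.

```rust
/// Hypothetical marker function: a call to a `#[cold]` function hints to the
/// optimizer that the branch containing the call is rarely taken.
#[cold]
#[inline(never)]
fn cold_path() {}

/// Hypothetical stable stand-in for `std::intrinsics::unlikely`.
/// Returns `b` unchanged; only the branch hint differs.
#[inline(always)]
fn unlikely(b: bool) -> bool {
    if b {
        cold_path();
    }
    b
}

/// Hypothetical scalar example using the same early return as the snippets above.
/// This is an illustration only, not the crate's implementation.
fn all_equal_sketch(arr: &[u32]) -> bool {
    if unlikely(arr.is_empty()) {
        return true;
    }
    let first = arr[0];
    arr.iter().all(|&x| x == first)
}
```

Calling `all_equal_sketch(&[7, 7, 7])` behaves exactly like the version without the hint. Whether the hint changes the generated assembly at all would have to be checked per function, since in the case above it apparently does not.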

I would be down to include intrinsics if you find anything that speeds it up.

At the moment the generated assembly is the same, so unless there are compiler regressions (there shouldn't be, and if there are, they will probably be fixed), it shouldn't make any difference.