Use better default sum routines
pietern opened this issue · 1 comment
pietern commented
See https://github.com/pytorch/benchmark/blob/master/timing/cpp/benchmarks/avx_sum.cpp#L94 for a survey of different SIMD sum routines. Benchmarks indicate that sum_simple_128 is one of the fastest if AVX is available, per @cpuhrsch.
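For readers unfamiliar with the idea, here is a rough sketch of what a sum over 128-bit SIMD lanes can look like. This is not the benchmark's `sum_simple_128`; the function name `sum_128_sketch` and the `HAVE_SSE_SKETCH` macro are hypothetical, and it assumes `float` input and uses SSE intrinsics, falling back to a plain scalar loop when SSE is not available.

```cpp
#include <cstddef>

#if defined(__SSE__) || defined(_M_X64)
#include <xmmintrin.h>
#define HAVE_SSE_SKETCH 1  // hypothetical macro, for this sketch only
#endif

// Hypothetical helper: sums n floats, four lanes at a time when SSE is present.
float sum_128_sketch(const float* data, std::size_t n) {
  std::size_t i = 0;
  float total = 0.0f;
#ifdef HAVE_SSE_SKETCH
  // Accumulate four partial sums in one 128-bit register.
  __m128 acc = _mm_setzero_ps();
  for (; i + 4 <= n; i += 4) {
    acc = _mm_add_ps(acc, _mm_loadu_ps(data + i));  // unaligned load
  }
  // Spill the four lanes and reduce them to a scalar.
  alignas(16) float lanes[4];
  _mm_store_ps(lanes, acc);
  total = lanes[0] + lanes[1] + lanes[2] + lanes[3];
#endif
  // Scalar tail (or the whole input when SSE is unavailable).
  for (; i < n; ++i) {
    total += data[i];
  }
  return total;
}
```

Note that summing per lane changes the order of floating-point additions, so results can differ slightly from a strictly sequential sum on inputs that are not exactly representable.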
pietern commented
No need for this.