Triple-Z/AVX-AVX2-Example-Code

Compiler optimization will discard SIMD extension flags if CPU doesn't support them

Closed this issue · 1 comments

rilysh commented

Bringing #1

The makefile specifies the -O flag. On an x86 CPU that doesn't support the AVX and AVX2 instruction sets, with GCC's optimization flag enabled, the compiler falls back and does not use these instructions, so at runtime the program does not raise an "illegal instruction" error.

Both GCC and Clang do the same.
Example

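#include <immintrin.h>
#include <stdio.h>

int main(void)
{
    /* Two 256-bit vectors, every byte set to 10. */
    __m256i v0 = _mm256_set1_epi8(10);
    __m256i v1 = _mm256_set1_epi8(10);
    /* Byte-wise add (an AVX2 instruction): every byte of r becomes 20. */
    __m256i r = _mm256_add_epi8(v0, v1);

    /* Print the first byte of the result, not its address. */
    printf("%d\n", ((char *)&r)[0]);
    return 0;
}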

For example, with AVX2:
Compiling with gcc main.c -mavx2 produces an unoptimized binary. If the CPU doesn't support AVX2, the program fails at runtime with an illegal instruction error. However, every program here is compiled with the optimization flag (-O), in which case GCC checks whether the CPU supports AVX2 and, if it does not, generates a binary that does not use AVX2.

Removing the -O flag from the makefiles would give real results. Otherwise, the output might not be what a user expects.
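
Independently of the optimization level, the illegal instruction error described above can be avoided by checking for AVX2 at runtime before any AVX2 code runs. The following is a minimal sketch, not code from this repository. It assumes GCC or Clang on x86 and relies on the `__builtin_cpu_supports` builtin and the `target("avx2")` function attribute, so it should build even without -mavx2. The names `add_bytes_avx2`, `add_bytes_scalar` and `add_bytes` are hypothetical.

```c
#include <immintrin.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: AVX2 byte-wise add of two 32-byte buffers.
   The target attribute enables AVX2 for this function only. */
__attribute__((target("avx2")))
static void add_bytes_avx2(const uint8_t *a, const uint8_t *b, uint8_t *out)
{
    __m256i va = _mm256_loadu_si256((const __m256i *)a);
    __m256i vb = _mm256_loadu_si256((const __m256i *)b);
    _mm256_storeu_si256((__m256i *)out, _mm256_add_epi8(va, vb));
}

/* Hypothetical scalar fallback with the same wrap-around behaviour. */
static void add_bytes_scalar(const uint8_t *a, const uint8_t *b, uint8_t *out)
{
    for (size_t i = 0; i < 32; i++)
        out[i] = (uint8_t)(a[i] + b[i]);
}

/* Pick a path at runtime. Returns 1 if the AVX2 path ran, 0 otherwise. */
static int add_bytes(const uint8_t *a, const uint8_t *b, uint8_t *out)
{
    if (__builtin_cpu_supports("avx2")) {
        add_bytes_avx2(a, b, out);
        return 1;
    }
    add_bytes_scalar(a, b, out);
    return 0;
}
```

Calling `add_bytes` with two buffers of 32 bytes, each byte set to 10, should fill the output with 20 on either path; the return value tells which path was actually taken on the machine running the binary.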

Thanks for your advice, maybe I will file a PR later.