Consider host-architecture aware compiler optimization

Question

stv0g opened this issue 2 years ago · 4 comments

This could be done by passing the correct -march flag, especially -march=native:

This selects the CPU to generate code for at compilation time by determining the processor type of the compiling machine. Using -march=native enables all instruction subsets supported by the local machine (hence the result might not run on different machines). Using -mtune=native produces code optimized for the local machine under the constraints of the selected instruction set.
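
To see what -march=native actually resolves to on a given machine, GCC can print its effective target options. This is only a minimal sketch: it assumes GCC is installed as gcc, and the function name show_native_target is purely illustrative.

```shell
#!/bin/sh
# Illustrative helper (name is hypothetical): print the concrete -march and
# -mtune values that GCC selects when asked for "native" on this host.
show_native_target() {
    if ! command -v gcc >/dev/null 2>&1; then
        echo "gcc not found, nothing to show" >&2
        return 0
    fi
    # -Q --help=target lists the target-specific options and their current
    # values; we keep only the -march= and -mtune= lines.
    gcc -march=native -Q --help=target 2>/dev/null \
        | grep -E -- '-(march|mtune)=' || true
}

show_native_target
```

Comparing this output between the build machine and the deployment machine is a quick way to check whether a binary built with -march=native can be expected to run on both.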

The default is usually x86-64:

A generic CPU with 64-bit extensions.

This means that no newer instruction subsets are enabled.

Also interesting are x86-64-v2, x86-64-v3, x86-64-v4:

These choices for cpu-type select the corresponding micro-architecture level from the x86-64 psABI. On ABIs other than the x86-64 psABI they select the same CPU features as the x86-64 psABI documents for the particular micro-architecture level.
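
On Linux, the highest level a host supports can be estimated from the flags line in /proc/cpuinfo. The sketch below is a rough approximation, not an authoritative check: the function names are hypothetical, the flag lists reflect my reading of the psABI levels using the kernel's flag names (pni for SSE3, abm for LZCNT), and it assumes the CPU is x86-64 in the first place.

```shell
#!/bin/sh
# Hypothetical helper: succeed only if every required flag ($2..) appears
# in the space-separated flag list given as $1.
has_all() {
    _list=" $1 "
    shift
    for _f in "$@"; do
        case "$_list" in
            *" $_f "*) ;;
            *) return 1 ;;
        esac
    done
    return 0
}

# Hypothetical helper: map a CPU flag list to the highest x86-64 level whose
# (assumed) required flags are all present. Baseline x86-64 is taken for granted.
x86_64_level() {
    cpu_flags=$1
    level="x86-64"
    if has_all "$cpu_flags" cx16 lahf_lm popcnt pni sse4_1 sse4_2 ssse3; then
        level="x86-64-v2"
        if has_all "$cpu_flags" avx avx2 bmi1 bmi2 f16c fma abm movbe xsave; then
            level="x86-64-v3"
            if has_all "$cpu_flags" avx512f avx512bw avx512cd avx512dq avx512vl; then
                level="x86-64-v4"
            fi
        fi
    fi
    echo "$level"
}

# Only query the host when it is an x86-64 Linux machine.
if [ "$(uname -m)" = "x86_64" ] && [ -r /proc/cpuinfo ]; then
    x86_64_level "$(grep -m1 '^flags' /proc/cpuinfo | cut -d: -f2)"
fi
```

The printed name can be passed directly as -march=<level>, which gives a portable middle ground between the generic default and -march=native.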

Answer 1 · 2023-01-26T08:50:23.000Z

I just saw the compiler flags which OPAL-RT is using to compile their model in RT-LAB:

gcc -c  -O3 -ffast-math -mtune=native -march=native -falign-loops=2 -falign-jumps=2 -falign-functions=2 -m64

Answer 2 · 2023-01-26T08:55:10.000Z

-ffast-math is interesting. I also learned something new here:

Answer 3 · 2023-01-26T08:56:14.000Z

@dinkelbachjan @m-mirz I think these compiler flags might have quite a high impact on DPsim's performance.

Do we have a profiling/benchmark script which I could use to verify this?
Or maybe a student who could run some comparisons using these flags?

Answer 4 · 2023-01-26T08:58:12.000Z

I am assigning myself and @fwege, as this is part of the real-time optimization task in SEGuRo.