flame/blis

Regarding Default Behaviour for CPU Affinity

mert-kurttutan opened this issue · 4 comments

I have a script that runs simple BLIS sgemm code (allocate vectors for a, b, c and run sgemm) in sequential mode (OpenMP multithreading is enabled, but with OMP_NUM_THREADS=1, so it uses only 1 thread).
It gives consistently different runtimes when the affinity is set to different CPU cores via the taskset command on Linux.

taskset -c 0 ./blis    # time 1.96 sec
taskset -c 1 ./blis   # time 1.70 sec

I did not change anything in BLIS regarding affinity, and compiled it as instructed in the wiki.

If I do not use taskset to restrict the available CPU cores, BLIS seems to end up on the faster one (e.g. core 1).

./blis   # time 1.70 sec
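
One way to tell whether anything has pinned the process is to look at the affinity mask it actually runs with. The sketch below is only illustrative and assumes Linux; it relies on Python's `os.sched_getaffinity`, and the helper names `allowed_cpus` and `is_pinned` are made up for this example (they are not part of BLIS). If the mask covers every core when the program is started without taskset, core placement is being left to the kernel scheduler rather than to an explicit affinity setting.

```python
import os


def allowed_cpus(pid=0):
    """Return the sorted CPU ids the process may run on (pid 0 = this process)."""
    return sorted(os.sched_getaffinity(pid))


def is_pinned(pid=0):
    """True if the affinity mask is narrower than the number of logical CPUs."""
    total = os.cpu_count() or 1
    return len(os.sched_getaffinity(pid)) < total


if __name__ == "__main__":
    print("allowed CPUs:", allowed_cpus())
    print("restricted mask:", is_pinned())
```

Launching this script under `taskset -c 1` should report a mask containing only CPU 1, since the mask set by taskset is inherited by the program it starts.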

I read the wiki. The documentation talks about how affinity can be changed and used to solve affinity-related problems, but I could not find anything about the default affinity behaviour.

So, how does BLIS handle affinity by default? I just want to see if you have any insight without requiring other info from me (e.g. hardware). If you want, I can provide hardware info and other specs.

Thanks, I just checked the CPU clock speeds using htop. Each core runs at a different frequency, and that is consistent with the BLIS timings.
But it is still very curious that BLIS consistently ends up on the core that gives the fastest result.
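
The per-core frequencies read from htop above can also be collected programmatically. This is a minimal sketch that assumes a Linux system exposing the cpufreq interface under `/sys/devices/system/cpu`; on machines or containers without it, the function simply returns an empty result. The function name `core_frequencies_khz` is hypothetical.

```python
from pathlib import Path


def core_frequencies_khz(base="/sys/devices/system/cpu"):
    """Map CPU id -> current frequency in kHz, for cores that expose cpufreq."""
    freqs = {}
    for f in Path(base).glob("cpu[0-9]*/cpufreq/scaling_cur_freq"):
        cpu = int(f.parent.parent.name[3:])  # directory name "cpu7" -> 7
        try:
            freqs[cpu] = int(f.read_text().strip())
        except (OSError, ValueError):
            continue  # unreadable or malformed entry: skip this core
    return freqs


if __name__ == "__main__":
    for cpu, khz in sorted(core_frequencies_khz().items()):
        print(f"cpu{cpu}: {khz / 1000:.0f} MHz")
```

Comparing these values for core 0 and core 1 while the benchmark runs would show whether the timing gap tracks the clock difference.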

When BLIS is compiled with OpenMP multithreading, is it possible that affinity is somehow handled by OpenMP, or is it done in the C code?
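
As a first check on the OpenMP side, it may help to see whether any affinity-related environment variables are set in the shell that launches the program. The sketch below is an assumption-laden illustration: `OMP_PROC_BIND` and `OMP_PLACES` are defined by the OpenMP specification, while `GOMP_CPU_AFFINITY` (GNU libgomp) and `KMP_AFFINITY` (Intel/LLVM runtimes) are runtime-specific. The helper name `openmp_affinity_settings` is made up here. When none of them is set, thread binding falls back to the OpenMP runtime's own default, which the specification leaves implementation-defined.

```python
import os

# The first two names come from the OpenMP specification; the last two are
# specific to GNU libgomp and to the Intel/LLVM OpenMP runtimes.
AFFINITY_VARS = ("OMP_PROC_BIND", "OMP_PLACES", "GOMP_CPU_AFFINITY", "KMP_AFFINITY")


def openmp_affinity_settings(env=None):
    """Return the affinity-related variables that are set in the environment."""
    env = os.environ if env is None else env
    return {name: env[name] for name in AFFINITY_VARS if name in env}


if __name__ == "__main__":
    settings = openmp_affinity_settings()
    if settings:
        for name, value in settings.items():
            print(f"{name}={value}")
    else:
        print("no OpenMP affinity variables set; binding is left to the runtime default")
```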