- Custom multi-threading code to create parallel operations.
- Same number of threads launched, different number of jobs submitted
- Kernels run on any number of threads
- Each kernel stresses the CPU in a slightly different way
- Timing and plotting are done with a quick Python script.
- Varies the number of threads from 1 to 8
- Plots speedup vs. number of threads for each kernel
- mkdir build && cd build
- cmake -DCMAKE_BUILD_TYPE=Release ..
- make
- Uncomment IACA_BEGIN and IACA_END in the kernel code
- cmake --build build
- iaca/iaca -arch HSW build/libkernels.a
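
The thread sweep and speedup calculation described in the timing bullets can be sketched roughly as follows. This is not the repository's actual script: the names `time_call`, `speedups`, and `sweep` are hypothetical, and `run` stands in for whatever launches a kernel with a given thread count. Speedup is taken here as the single-thread time divided by the N-thread time, which is an assumption about how the plot is defined.

```python
import time


def time_call(run, n_threads, repeats=3):
    """Return the best wall-clock time of run(n_threads) over several repeats."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        run(n_threads)  # hypothetical callable that runs one kernel
        best = min(best, time.perf_counter() - start)
    return best


def speedups(times):
    """Map thread count -> speedup relative to the single-thread time."""
    base = times[1]  # assumes a 1-thread measurement is present
    return {n: base / t for n, t in sorted(times.items())}


def sweep(run, max_threads=8):
    """Time run() for 1..max_threads threads and return the speedups."""
    times = {n: time_call(run, n) for n in range(1, max_threads + 1)}
    return speedups(times)
```

Calling `sweep(run)` once per kernel gives one speedup curve per kernel, which can then be handed to a plotting library. Taking the best of several repeats is one common way to reduce timing noise; averaging would be an equally reasonable choice.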