hyperthreading

Micro-benchmarks of hyperthreading

Primary language: C++

Method

Parallel for loop
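
The repository's own loop is not shown here, so the following is only a minimal sketch of one way a parallel for loop can be written with std::thread. The name parallel_for, its signature, and the chunking scheme are assumptions made for illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <functional>
#include <thread>
#include <vector>

// Hypothetical helper (not taken from the repository): splits [0, n) into
// contiguous chunks, one per thread, and calls body(begin, end) on each chunk.
void parallel_for(std::size_t n, unsigned num_threads,
                  const std::function<void(std::size_t, std::size_t)>& body) {
    assert(num_threads > 0);
    std::vector<std::thread> workers;
    workers.reserve(num_threads);
    // Ceiling division so that every index is covered.
    const std::size_t chunk = (n + num_threads - 1) / num_threads;
    for (unsigned t = 0; t < num_threads; ++t) {
        const std::size_t begin = std::min(n, t * chunk);
        const std::size_t end = std::min(n, begin + chunk);
        if (begin == end) continue;  // no work left for this thread
        workers.emplace_back(body, begin, end);
    }
    // Wait for all chunks to finish before returning.
    for (auto& w : workers) w.join();
}
```

A kernel would then be passed in as the body, for example a lambda that processes indices begin to end - 1 of a shared array. Spawning threads on every call adds overhead, so a real benchmark may prefer a persistent thread pool.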

Multi-threaded kernel functions

  • Kernels run on any number of threads
  • Each stresses the CPU in a slightly different way
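
To illustrate what "stressing the CPU in different ways" can mean, here is a hedged sketch of three kernels named after the add, div, and mem figures below. The function bodies are assumptions, not the repository's actual kernels: one is dominated by integer additions, one by a chain of dependent floating-point divisions, and one by memory loads.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical "add" kernel: four independent integer accumulators.
// Note: an optimising compiler may simplify this loop, so a real benchmark
// needs some way to keep the work from being optimised away.
std::uint64_t add_kernel(std::size_t iters) {
    std::uint64_t a = 0, b = 1, c = 2, d = 3;
    for (std::size_t i = 0; i < iters; ++i) {
        a += i;
        b += i;
        c += i;
        d += i;
    }
    return a + b + c + d;
}

// Hypothetical "div" kernel: each division depends on the previous result,
// so the loop is bound by divider latency.
double div_kernel(std::size_t iters) {
    double x = 1.0;
    for (std::size_t i = 0; i < iters; ++i) {
        x = 1.0 / (x + 1.0);
    }
    return x;
}

// Hypothetical "mem" kernel: strided reads over a buffer. With a large buffer
// and a stride of a cache line or more, most loads miss the cache.
std::uint64_t mem_kernel(const std::vector<std::uint64_t>& data, std::size_t stride) {
    assert(stride > 0);
    std::uint64_t sum = 0;
    for (std::size_t i = 0; i < data.size(); i += stride) {
        sum += data[i];
    }
    return sum;
}
```

Each function returns its result so that the caller can consume it, which is one simple way to stop the compiler from discarding the loop entirely.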

Timing code

  • Timing and plotting are done with a quick Python script.
  • The script varies the number of threads from 1 to 8.
  • It plots speedup against the number of threads for each kernel.
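
The repository does its timing in Python; to stay in one language with the other examples, the same measurement idea is sketched here in C++. Speedup is taken to be the single-thread run time divided by the run time with n threads, which is an assumption about how the plots are computed. The names time_seconds, speedups, and measure_speedups are hypothetical.

```cpp
#include <cassert>
#include <chrono>
#include <functional>
#include <vector>

// Wall-clock time of one call to fn, in seconds.
double time_seconds(const std::function<void()>& fn) {
    const auto start = std::chrono::steady_clock::now();
    fn();
    const auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration<double>(stop - start).count();
}

// times[i] is the run time with (i + 1) threads.
// Returns times[0] / times[i], i.e. speedup relative to a single thread.
std::vector<double> speedups(const std::vector<double>& times) {
    assert(!times.empty());
    std::vector<double> result;
    result.reserve(times.size());
    for (double t : times) {
        result.push_back(times.front() / t);
    }
    return result;
}

// Hypothetical driver: run(n) is assumed to execute one kernel on n threads.
// Times it for n = 1 .. max_threads and converts the times to speedups.
std::vector<double> measure_speedups(const std::function<void(unsigned)>& run,
                                     unsigned max_threads = 8) {
    std::vector<double> times;
    for (unsigned n = 1; n <= max_threads; ++n) {
        times.push_back(time_seconds([&] { run(n); }));
    }
    return speedups(times);
}
```

A single timed run per thread count is noisy; repeating each measurement and taking the minimum or median is a common refinement.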

Benchmark results

Results

./figs/add.png

./figs/div.png

./figs/mem.png

Why add is different

./figs/skylake_scheduler.png

Multiple kernel benchmarks

./figs/div_add.png

./figs/add_mem.png

./figs/div_mem.png

More complex kernels

./figs/nbody.png

./figs/nbody_add.png

./figs/nbody_mem.png

Building and running IACA

Building

  • mkdir build && cd build
  • cmake -DCMAKE_BUILD_TYPE=Release ..
  • make

Running the Intel Architecture Code Analyzer (IACA)

  • Uncomment IACA_BEGIN and IACA_END in the kernel code
  • cmake --build build
  • iaca/iaca -arch HSW build/libkernels.a