Multichase - a pointer chaser benchmark
Multiload  - a superset of multichase which runs latency, memory bandwidth,
             and loaded-latency tests

1/ BUILD

 - Just type:
   $ make

2/ INSTALL

 - Just run from the current directory, or copy multichase wherever you need it.

3.1/ RUN Multichase

 - To get help:
   $ multichase -h

 - By default, multichase performs a pointer chase through a 256MB array
   with a stride of 256 bytes for 2.5 seconds on a single thread:
   $ multichase

 - Pointer chase through an array of 4MB with a stride of 64 bytes:
   $ multichase -m 4m -s 64

 - Pointer chase through an array of 1GB for 10 seconds
   (-n is the number of 0.5 second samples):
   $ multichase -m 1g -n 20

 - Pointer chase through an array of 256KB with a stride of 128 bytes on
   2 threads. Thread 0 accesses every 128th byte; thread 1 accesses every
   128th byte offset by sizeof(void*)=8 on 64-bit architectures:
   $ multichase -m 256k -s 128 -t 2

3.2/ RUN Multiload

 - Latency Only (simple pointer chase)
   In this mode, Multiload can run any of the multichase commands above.
   A "-c" chase arg (other than chaseload) can be used; otherwise it
   defaults to "simple". Using "-c chaseload" and/or the "-l" load
   argument selects a different test mode.
   $ multiload

 - Bandwidth Only
   Multiload can run a memory bandwidth test using the "-l" load argument.
   The "-c" chase argument MUST NOT be used.
   The command below runs 5 samples (~2.5 seconds each) on 16 threads,
   using the glibc memcpy() function and a 512M buffer per thread.
   $ multiload -n 5 -t 16 -m 512M -l memcpy-libc

 - Loaded Latency
   Multiload can run 1 pointer chaser thread on logical cpu0 together with
   multiple memory bandwidth load threads.
   The "-c chaseload" arg MUST be used.
   The "-l" arg MUST be used with one of the memory load arguments.
   The command below runs 5 samples (~2.5 seconds each) on 16 threads
   (1 chase, 15 stream-sum bandwidth loads), using a 512M buffer per thread.
   The chase thread uses a stride of 16.
   $ multiload -s 16 -n 5 -t 16 -m 512M -c chaseload -l stream-sum

3.3/ RUN Pingpong & fairness

 - Pingpong: measures the latency of exchanging a cache line between cores.
   To run, simply do:
   $ pingpong -u

 - Fairness: measures fairness with N threads competing to increment an
   atomic variable. To run, simply do:
   $ fairness
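
The multichase examples in 3.1 vary one array size at a time. A small wrapper
can sweep several sizes in one go, which may help show where latency changes
as the array outgrows each cache level. The sketch below is not part of the
package: the variable names (MULTICHASE, SIZES, STRIDE), the helper functions,
and the chosen size list are assumptions for illustration. It uses only the
-m and -s flags shown above, and it prints the command instead of running it
when the binary is not found.

```shell
#!/bin/sh
# Hypothetical wrapper: sweep the multichase array size at a fixed stride.
# MULTICHASE, SIZES and STRIDE are illustrative names, not multichase options.
MULTICHASE="${MULTICHASE:-./multichase}"
SIZES="${SIZES:-256k 4m 64m 1g}"
STRIDE="${STRIDE:-64}"

# Print the command line for one array size (uses only -m and -s).
chase_cmd() {
    echo "$MULTICHASE -m $1 -s $STRIDE"
}

# Run one multichase invocation per size. If the binary is missing or not
# executable, print what would have been run instead.
sweep() {
    for size in $SIZES; do
        if [ -x "$MULTICHASE" ]; then
            echo "== array size $size, stride $STRIDE =="
            "$MULTICHASE" -m "$size" -s "$STRIDE"
        else
            echo "would run: $(chase_cmd "$size")"
        fi
    done
}

sweep
```

To try other sizes or a different stride, override the variables on the
command line, for example: SIZES="4m 1g" STRIDE=128 sh sweep.sh (the script
name sweep.sh is likewise only an example).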