Like many people, I've had access to a bunch of machines over the years and I wanted to keep a record of their performance. Think of this as a personal diary of the performance of some machines I've played with.
Matrix-matrix multiplication is often used to benchmark machines because it is one of the few operations where one can obtain close to theoretical peak performance in practice.
The number of floating point operations (Flops) in a matrix-matrix multiplication of two $N \times N$ matrices is $2N^3 - N^2$: $N^3$ multiplications plus $N^3 - N^2$ additions.
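For this benchmark, we construct two random $N \times N$ matrices, time how long it takes to multiply them together, and divide the Flop count by that time to get a rate in Gflops.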
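As a rough illustration of this kind of measurement, here is a minimal sketch. It is not the code from the results notebooks: the function names `matmul_flops` and `bench_matmul`, the `repeats` and `seed` parameters, and the choice to report both an average and a peak are all assumptions made for the example. It uses the standard count of $2N^3 - N^2$ floating point operations for two $N \times N$ matrices.

```python
import time

import numpy as np


def matmul_flops(n):
    """Floating point operations in the product of two n x n matrices."""
    # n**3 multiplications plus n**3 - n**2 additions
    return 2 * n**3 - n**2


def bench_matmul(n, repeats=5, seed=0):
    """Return (average, peak) Gflops for multiplying two random n x n matrices.

    Hypothetical helper for illustration; `repeats` and `seed` are assumptions.
    """
    rng = np.random.default_rng(seed)
    a = rng.random((n, n))
    b = rng.random((n, n))
    rates = []
    for _ in range(repeats):
        start = time.perf_counter()
        a @ b  # the timed operation; the result is discarded
        elapsed = time.perf_counter() - start
        rates.append(matmul_flops(n) / elapsed / 1e9)
    return sum(rates) / len(rates), max(rates)


if __name__ == "__main__":
    for n in (500, 1000, 2000):
        avg, peak = bench_matmul(n)
        print(f"{n}x{n}: average {avg:.1f} Gflops, peak {peak:.1f} Gflops")
```

Small sizes tend to understate what a machine can do, so it is worth increasing `n` until the reported rate levels off.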
For the highest performance, use a version of numpy that has been linked against a high performance BLAS library such as OpenBLAS or the Intel MKL (https://software.intel.com/en-us/intel-mkl). The Anaconda Python distribution includes the Intel MKL by default on Windows and Linux (Mac includes its own high performance BLAS library).
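One way to see which BLAS library an installed numpy was built against is `numpy.show_config()`, which prints the build configuration. The sketch below captures that output and searches it for a few well-known library names. The helper `blas_hints` and its list of names are assumptions for illustration, and the string match is only a heuristic: an empty result does not prove that no optimised BLAS is in use.

```python
import contextlib
import io

import numpy as np


def blas_hints():
    """Return known BLAS library names that appear in numpy's build config.

    Hypothetical helper: a simple text search over np.show_config() output.
    """
    buf = io.StringIO()
    with contextlib.redirect_stdout(buf):
        np.show_config()  # normally prints to stdout; captured here
    text = buf.getvalue().lower()
    known = ("mkl", "openblas", "accelerate", "blis", "atlas")  # assumed list
    return [name for name in known if name in text]


if __name__ == "__main__":
    print("BLAS libraries mentioned in numpy's build config:", blas_hints())
```

If in doubt, calling `np.show_config()` directly and reading the full output is more reliable than the filtered list.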
If you look at the individual results notebooks, you'll notice that they have evolved a little over time. The core computation is always the same, though.
- Amazon c5x18xlarge, November 2017, Max size 10000x10000, 1366 Gflops
- Azure Notebook, May 2017, Max size 1000x1000, 263 Gflops - This was a free service offered by Microsoft. Discussed at https://walkingrandomly.com/?p=6351
- Amazon c4x4xlarge, September 2017, Max size 10000x10000, 333 Gflops
- 2018 MacBook Pro, December 2018, Max size 15000x15000, 291 Gflops
- Mid 2014 MacBook Pro, May 2017, Max size 10000x10000, 169 Gflops
- Dell XPS9560, Intel Kaby Lake, May 2017, Max size 10000x10000, 141 Gflops
- Microsoft Surface Book 2, August 2020, Max size 10000x10000, 120 Gflops
Results from various traditional HPC Clusters.
- Sharc-32 - 802 Gflops: 32 core Broadwell Nodes (2 sockets) that were added to ShARC thanks to a grant I won. CPUs were released in Q1 2016
- Sharc-16 - 458 Gflops: 16 core Haswell nodes (2 sockets) that formed the basis of ShARC, Sheffield's successor to Iceberg. CPUs were released in Q3 2014
- Iceberg-16 - 333 Gflops: 16 core Ivy Bridge nodes (2 sockets) that were added to Iceberg after a few years of operation. CPUs were released in Q3 2013
- Iceberg-12 - 120 Gflops: 12 core Intel Westmere nodes (2 sockets) from the University of Sheffield's 'Iceberg' cluster. They were old when I ran the benchmark: the CPUs were released in 2010