Analyzing SUMMA with TAU

A C++ OpenMPI implementation of the SUMMA parallel matrix multiplication algorithm, followed by a performance analysis of that implementation using the TAU parallel profiling system.
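
For readers unfamiliar with SUMMA, here is a minimal serial sketch of the algorithm's structure. It is not this project's implementation and makes no MPI calls: the process grid is simulated with loops, and the names `Matrix`, `multiply_naive`, and `multiply_summa_sketch` are hypothetical. It assumes square matrices whose size is evenly divisible by the grid dimension. In an MPI version, the two panel copies would typically be broadcasts (for example `MPI_Bcast`) over row and column communicators.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Row-major square matrix of doubles (hypothetical helper type).
using Matrix = std::vector<double>;

// Reference triple-loop product, used only to check the sketch.
Matrix multiply_naive(const Matrix& a, const Matrix& b, std::size_t n) {
    Matrix c(n * n, 0.0);
    for (std::size_t i = 0; i < n; ++i)
        for (std::size_t k = 0; k < n; ++k)
            for (std::size_t j = 0; j < n; ++j)
                c[i * n + j] += a[i * n + k] * b[k * n + j];
    return c;
}

// Serial simulation of SUMMA on a grid x grid "process" grid.
// Simulated process (pi, pj) owns the nb x nb block (pi, pj) of A, B, and C.
Matrix multiply_summa_sketch(const Matrix& a, const Matrix& b,
                             std::size_t n, std::size_t grid) {
    assert(grid > 0 && n % grid == 0);  // assumption: blocks divide evenly
    const std::size_t nb = n / grid;    // block edge length
    Matrix c(n * n, 0.0);
    std::vector<double> a_panel(nb * nb), b_panel(nb * nb);

    for (std::size_t k = 0; k < grid; ++k) {  // one step per block column of A
        for (std::size_t pi = 0; pi < grid; ++pi) {
            // Stands in for broadcasting A block (pi, k) along process row pi.
            for (std::size_t r = 0; r < nb; ++r)
                for (std::size_t q = 0; q < nb; ++q)
                    a_panel[r * nb + q] = a[(pi * nb + r) * n + (k * nb + q)];

            for (std::size_t pj = 0; pj < grid; ++pj) {
                // Stands in for broadcasting B block (k, pj) along process column pj.
                for (std::size_t r = 0; r < nb; ++r)
                    for (std::size_t q = 0; q < nb; ++q)
                        b_panel[r * nb + q] = b[(k * nb + r) * n + (pj * nb + q)];

                // Local update on process (pi, pj): C(pi,pj) += A(pi,k) * B(k,pj).
                for (std::size_t r = 0; r < nb; ++r)
                    for (std::size_t kk = 0; kk < nb; ++kk)
                        for (std::size_t q = 0; q < nb; ++q)
                            c[(pi * nb + r) * n + (pj * nb + q)] +=
                                a_panel[r * nb + kk] * b_panel[kk * nb + q];
            }
        }
    }
    return c;
}
```

Calling `multiply_summa_sketch(a, b, 4, 2)` simulates a 2 x 2 grid on 4 x 4 matrices. The local update inside the innermost loops is the compute phase, and the panel copies are the communication phase; these are the two kinds of work one would expect a profiler to distinguish in an MPI version.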