ROSS-org/ROSS

Interesting Run Time Results With Mattern's GVT

Closed this issue · 2 comments

Hello,

I implemented Mattern's GVT algorithm and am getting almost a 30% performance gain under many different configurations, with a few exceptions. For example, with start_events 1 and end_time 10,000, Mattern's outperforms MPI_Allreduce up to 18 threads, but from 18 threads onward it runs slower. With start_events 20 and end_time 10,000 the crossover is at 12 threads, and with start_events 1 and end_time 100,000 it is at 19 threads.

My question is: do you think this phenomenon is due to some feature of ROSS, or to the way Mattern's GVT algorithm works? I am trying to figure out the reason behind the sudden jump in run time at particular thread counts.

Sincerely,

Ali.
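
For readers unfamiliar with the algorithm under discussion, here is a minimal single-process sketch of the idea behind Mattern's GVT: a token circulates among the processes, colours them red, and keeps circulating until every white message has been received. It is not the implementation from this issue and not ROSS code. The names `Proc`, `send`, `deliver_all`, `token_round` and `mattern_gvt` are hypothetical, the "network" is a plain Python list, and event processing between token visits is left out.

```python
import math
from dataclasses import dataclass


@dataclass
class Proc:
    lvt: float                      # timestamp of the earliest unprocessed local event
    red: bool = False               # white before the first token visit, red afterwards
    white_balance: int = 0          # white messages sent minus white messages received
    min_red_sent: float = math.inf  # smallest timestamp among red messages sent


def send(procs, in_transit, src, dst, ts):
    """Queue a timestamped message from src to dst, tagged with the sender's colour."""
    p = procs[src]
    if p.red:
        p.min_red_sent = min(p.min_red_sent, ts)
    else:
        p.white_balance += 1
    in_transit.append((dst, ts, p.red))


def deliver_all(procs, in_transit):
    """Deliver every queued message; a message may lower the receiver's LVT."""
    for dst, ts, is_red in in_transit:
        p = procs[dst]
        if not is_red:
            p.white_balance -= 1
        p.lvt = min(p.lvt, ts)
    in_transit.clear()


def token_round(procs):
    """One token pass: colour each process red and accumulate the token fields."""
    min_lvt, min_red, white_in_flight = math.inf, math.inf, 0
    for p in procs:
        p.red = True
        min_lvt = min(min_lvt, p.lvt)
        min_red = min(min_red, p.min_red_sent)
        white_in_flight += p.white_balance
    return min_lvt, min_red, white_in_flight


def mattern_gvt(procs, in_transit, max_rounds=10):
    """Circulate the token until no white message is in flight, then estimate GVT."""
    for _ in range(max_rounds):
        min_lvt, min_red, white_in_flight = token_round(procs)
        if white_in_flight == 0:
            return min(min_lvt, min_red)
        deliver_all(procs, in_transit)  # stand-in for the network making progress
    raise RuntimeError("white messages never drained")


# Three processes, with one white message (timestamp 7.0) still in flight.
procs = [Proc(10.0), Proc(12.0), Proc(15.0)]
in_transit = []
send(procs, in_transit, src=2, dst=0, ts=7.0)
gvt = mattern_gvt(procs, in_transit)
print(gvt)  # 7.0
```

In this sketch the token needs a second pass because of the in-flight white message. In a real asynchronous run the number of passes, and so the cost of each GVT computation, depends on how quickly such messages drain.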

Hi @aliardaeker,

Thanks for the interesting work! Do you have any code that you can share? It would be interesting to see your implementation.

The performance differences may be due to experimental setup. Were you running on multiple nodes of an HPC cluster or just 18 threads on a single CPU? Threads can be much faster than MPI for single-node performance, but obviously can’t scale in the same way.

Ali has published his work scaling to 250 threads on a single KNL node without MPI here: https://ieeexplore.ieee.org/document/8600923