Warwick-Plasma/epoch

Parallel computing speed issues

pistachio-xin opened this issue · 1 comments

@Status-Mirror Hello, when I run EPOCH simulations, I find that beyond a certain point, using more parallel processes actually increases the calculation time. With 16 processes the run is fastest; above 16, the calculation time goes up. Of course, the value 16 may be specific to my own machine. What I would like to ask is: is it normal for there to be an optimal degree of parallelism like this, or is there something wrong with some of my settings?

What you're seeing is (probably) a result of your total problem size.

For a fixed problem size (total number of particles in EPOCH's case), as you add more processors you will (eventually) see a smaller speed-up from each one. Essentially this is Amdahl's law. Additionally, as you add more processors (and make each subdomain increasingly small), you require more communication between more processors, and typically this communication is done using an increasing number of increasingly small messages. Eventually you reach a point where the small gains in compute speed from adding more processors are smaller than the increased cost of communication, and your total run time will increase.
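To make the trade-off concrete, here is a minimal toy model in Python. It is only a sketch: the linear communication term and all of the numbers are assumptions chosen for illustration (they were picked so the minimum lands near 16), not measurements of EPOCH, and the function names are hypothetical.

```python
def amdahl_speedup(p, serial_fraction):
    """Ideal Amdahl's-law speed-up on p processors (no communication cost)."""
    return 1.0 / (serial_fraction + (1.0 - serial_fraction) / p)


def modelled_runtime(p, t1, serial_fraction, comm_cost_per_rank):
    """Toy run time: Amdahl compute time plus a communication penalty.

    t1                 -- run time on a single processor (seconds)
    serial_fraction    -- fraction of the work that cannot be parallelised
    comm_cost_per_rank -- ASSUMED extra seconds added per additional rank;
                          a crude stand-in for many small messages
    """
    compute = t1 * (serial_fraction + (1.0 - serial_fraction) / p)
    communication = comm_cost_per_rank * (p - 1)
    return compute + communication


def best_rank_count(candidates, t1, serial_fraction, comm_cost_per_rank):
    """Return the candidate processor count with the smallest modelled run time."""
    return min(
        candidates,
        key=lambda p: modelled_runtime(p, t1, serial_fraction, comm_cost_per_rank),
    )


if __name__ == "__main__":
    # Illustrative parameters only.
    params = dict(t1=1000.0, serial_fraction=0.01, comm_cost_per_rank=3.5)
    for p in (1, 2, 4, 8, 16, 32, 64):
        print(p, round(modelled_runtime(p, **params), 2))
    print("best:", best_rank_count((1, 2, 4, 8, 16, 32, 64), **params))
```

In this model the compute term keeps shrinking as `p` grows while the communication term keeps growing, so the total has a minimum. With the made-up parameters above that minimum falls at 16 processors; a real code will have a different cost curve, but the same qualitative shape is what you are observing.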

I'm skipping over further details (such as improved cache performance, and the performance hit when going from intra-node communication to inter-node communication, which in your particular case may be important), but the general description above is a good summary.

In conclusion: for any fixed problem size there will be an optimal parallelisation strategy. How you define optimal (wall time vs CPU hours, for example) and how your problem is defined will dictate which strategy you should use.
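As a rough illustration of how the two definitions of "optimal" can disagree, the Python sketch below takes a table of timings and picks the best processor count under each definition. The timings are hypothetical placeholders, not results from any real EPOCH run; you would substitute wall times measured for your own problem.

```python
def core_hours(ranks, wall_seconds):
    """CPU cost of a run: processors used multiplied by wall time, in hours."""
    return ranks * wall_seconds / 3600.0


def fastest(timings):
    """Processor count with the smallest wall time."""
    return min(timings, key=lambda ranks: timings[ranks])


def cheapest(timings):
    """Processor count with the smallest core-hour cost."""
    return min(timings, key=lambda ranks: core_hours(ranks, timings[ranks]))


if __name__ == "__main__":
    # HYPOTHETICAL wall times in seconds, keyed by processor count.
    timings = {4: 268.0, 8: 158.0, 16: 124.0, 32: 149.0}
    print("fastest wall time at", fastest(timings), "processors")
    print("fewest core-hours at", cheapest(timings), "processors")
```

With these placeholder numbers the shortest wall time is at 16 processors, while the fewest core-hours are used at 4, so the right choice depends on whether you are limited by time or by allocation.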