Low GPU utilization with default protocol.
kexul commented
Hi, I managed to run the simulation with the system prepared by protocaller. GPU utilization is only 3% to 5% on my Tesla P40 card. Is that normal? Here is the timing section of my log:
     R E A L   C Y C L E   A N D   T I M E   A C C O U N T I N G

On 1 MPI rank, each using 40 OpenMP threads

 Computing:              Num   Num       Call   Wall time         Giga-Cycles
                       Ranks Threads    Count      (s)         total sum    %
-----------------------------------------------------------------------------
 Neighbor search          1    40        501       6.972       613.563   1.0
 Launch GPU ops.          1    40      50001       4.305       378.852   0.6
 Force                    1    40      50001     216.936     19090.346  30.5
 PME mesh                 1    40      50001     364.126     32043.018  51.3
 Wait Bonded GPU          1    40       1001       0.007         0.574   0.0
 Wait GPU NB local        1    40      50001       0.760        66.924   0.1
 NB X/F buffer ops.       1    40      99501      20.326      1788.689   2.9
 Write traj.              1    40         11       0.262        23.098   0.0
 Update                   1    40     100002      46.367      4080.316   6.5
 Constraints              1    40     100004      44.149      3885.090   6.2
 Rest                                              6.218       547.217   0.9
-----------------------------------------------------------------------------
 Total                                           710.429     62517.687 100.0
-----------------------------------------------------------------------------
 Breakdown of PME mesh computation
-----------------------------------------------------------------------------
 PME spread               1    40     100002     118.548     10432.218  16.7
 PME gather               1    40     100002      94.478      8314.080  13.3
 PME 3D-FFT               1    40     200004     131.213     11546.690  18.5
 PME solve Elec           1    40     100002       1.376       121.094   0.2
-----------------------------------------------------------------------------
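To make the table above easier to read at a glance, here is a minimal Python sketch that ranks each row by its share of total wall time. The wall times are copied by hand from the log. The names `wall_time_s`, `TOTAL_S`, `share_percent`, and `ranked_shares` are made up for this illustration and are not part of any tool. It only recomputes the % column; it does not diagnose where each task ran.

```python
# Wall times in seconds, copied by hand from the cycle-accounting table above.
wall_time_s = {
    "Neighbor search": 6.972,
    "Launch GPU ops.": 4.305,
    "Force": 216.936,
    "PME mesh": 364.126,
    "Wait Bonded GPU": 0.007,
    "Wait GPU NB local": 0.760,
    "NB X/F buffer ops.": 20.326,
    "Write traj.": 0.262,
    "Update": 46.367,
    "Constraints": 44.149,
    "Rest": 6.218,
}

# "Total" row of the same table.
TOTAL_S = 710.429


def share_percent(seconds, total=TOTAL_S):
    """Return the given wall time as a percentage of the total wall time."""
    return 100.0 * seconds / total


def ranked_shares(times=wall_time_s):
    """Return (row name, percent of total) pairs, largest share first."""
    shares = [(name, share_percent(t)) for name, t in times.items()]
    return sorted(shares, key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    for name, pct in ranked_shares():
        print(f"{name:<20s} {pct:5.1f} %")
```

By this ranking, "PME mesh" and "Force" together take roughly 82% of the wall time, while "Wait GPU NB local" takes about 0.1%.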
kexul commented
With the new 2021.beta2 version, GPU utilization now reaches 20%.