Why is using the GPU much slower than the CPU in the diffmpm.py example?

Question

Opened this issue 2 years ago · 1 comment

On my Mac M1, diffmpm runs at 14 FPS on the CPU (arm), but on the GPU (Metal) it runs much slower, at less than 2 FPS. The GPU is also slower on a 3080 (CUDA). Is there a problem with how the compiler optimizes at the IR level?

Answer 1 · 2022-08-19T02:36:20.000Z

Also reproduced on my Intel + NVIDIA GPU workstation.

CPU: i9-11900K
GPU: RTX 3080

with ti.cpu: 13 FPS
with ti.cuda: 10 FPS

Script: examples/diffmpm.py
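
For anyone who wants to compare backends the same way, here is a minimal timing sketch. It is an assumption about how the FPS could be measured, not how diffmpm.py reports it: `measure_fps` and `dummy_step` are hypothetical names, and the helper is plain Python so it runs without Taichi. The two things it illustrates are skipping the first few frames (Taichi compiles kernels on first call) and synchronizing before reading the clock (GPU kernel launches can return before the work finishes).

```python
import time


def measure_fps(step, frames=100, warmup=5, sync=None):
    """Return the average frames per second of calling step().

    step   -- callable that advances the simulation by one frame
    warmup -- frames run before timing starts, so one-time costs
              such as JIT compilation are not counted
    sync   -- optional callable that blocks until queued device work
              is done; with Taichi this could be ti.sync
    """
    for _ in range(warmup):
        step()
    if sync is not None:
        sync()  # make sure warmup work is finished before timing

    start = time.perf_counter()
    for _ in range(frames):
        step()
    if sync is not None:
        sync()  # wait for all timed frames to actually complete
    elapsed = time.perf_counter() - start
    return frames / elapsed


if __name__ == "__main__":
    # Stand-in workload; replace with the real per-frame function.
    def dummy_step():
        sum(i * i for i in range(10_000))

    print(f"{measure_fps(dummy_step):.1f} FPS")
```

With Taichi, the idea would be to call `ti.init(arch=ti.cpu)` or `ti.init(arch=ti.cuda)` at the top of the script, pass the per-frame function as `step`, and pass `sync=ti.sync`, then compare the numbers from one run per backend.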