Custom CUDA operator only works correctly on cuda:0
YuxianMeng opened this issue · 3 comments
YuxianMeng commented
Hi! I'm learning how to write a custom CUDA operator by following your tutorial.
However, I found that the operator only outputs correct results on "cuda:0" and outputs an all-zeros tensor on other devices such as "cuda:1". All I did was change this line to set the device to "cuda:1".
iamgroot42 commented
I adapted the tutorial for my own code and ran into the same issue. Did you find a fix for it?
iamgroot42 commented
@YuxianMeng I found a solution that works (it was buried inside closed issues) https://discuss.pytorch.org/t/c-cuda-extension-with-multiple-gpus/91241/6
YuxianMeng commented
@iamgroot42 Sorry for the late reply. I solved this issue with this post: https://discuss.pytorch.org/t/custom-cuda-operator-only-work-well-on-cuda-0/150504/7
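For anyone who lands here and cannot open the links: they are not reproduced here, so what follows is only a sketch of one common cause, not necessarily what those posts describe. A custom kernel is usually launched on the current CUDA device (cuda:0 by default) rather than on the device that holds the input tensor, which can leave the output on another device untouched. Assuming that is what happens here, a Python-side workaround is to make the tensor's device current before calling the operator. `my_custom_op` and `run_on_tensor_device` are hypothetical names used for illustration, and the sketch falls back to CPU when no second GPU is present:

```python
import contextlib

import torch


def run_on_tensor_device(op, x):
    """Call op(x) with the current CUDA device set to x's device."""
    # torch.cuda.device only accepts CUDA devices, so CPU tensors get a no-op guard.
    guard = torch.cuda.device(x.device) if x.is_cuda else contextlib.nullcontext()
    with guard:
        return op(x)


def my_custom_op(x):
    # Hypothetical stand-in for the compiled extension's function.
    return x + 1


# Use cuda:1 when a second GPU exists; otherwise fall back to CPU so the sketch still runs.
device = "cuda:1" if torch.cuda.device_count() > 1 else "cpu"
x = torch.zeros(4, device=device)
y = run_on_tensor_device(my_custom_op, x)
```

On the C++ side, the same idea is usually expressed with a device guard such as `at::cuda::OptionalCUDAGuard` constructed from the input tensor's device before the kernel launch, so callers do not need a wrapper.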