pytorch/extension-cpp

Custom CUDA operator only works well on cuda:0

YuxianMeng opened this issue · 3 comments

Hi! I'm learning how to write a custom CUDA operator by following your tutorial.

However, I found that the operator only outputs correct results on "cuda:0" and outputs an all-zeros tensor on other devices such as "cuda:1". All I did was change this line to set the device to "cuda:1".

I adapted the tutorial for my own code and ran into the same issue. Did you find a fix for it?

@YuxianMeng I found a solution that works (it was buried inside closed issues) https://discuss.pytorch.org/t/c-cuda-extension-with-multiple-gpus/91241/6
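
For anyone landing here, a minimal sketch of one common workaround. It assumes the cause is that the custom kernel launches on the current CUDA device (cuda:0 unless changed) instead of the device holding the input tensor. This may not match the linked thread exactly. The helper name `call_on_input_device` is hypothetical, and `op` stands in for whatever function your extension exposes.

```python
import torch


def call_on_input_device(op, x, *args, **kwargs):
    """Run `op` with the current CUDA device set to `x.device`.

    `op` is a placeholder for a custom extension function (hypothetical).
    """
    if x.is_cuda:
        # A raw kernel launch targets the *current* device, so switch to
        # the tensor's device for the duration of this call only.
        with torch.cuda.device(x.device):
            return op(x, *args, **kwargs)
    # CPU tensors need no device switch.
    return op(x, *args, **kwargs)
```

With a tensor on "cuda:1", calling the extension function through this wrapper makes "cuda:1" the current device while the kernel runs. On the C++ side, the equivalent is usually a device guard at the top of the function that launches the kernel, such as `at::cuda::OptionalCUDAGuard` from `c10/cuda/CUDAGuard.h`. Treat that as a pointer to check against your PyTorch version rather than a confirmed fix for this issue.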