CUDA error: no kernel image is available for execution on the device
mateuszwosinski opened this issue · 1 comments
I get the following error while trying to train a model with inplace-abn:
~/.conda/envs/py37/lib/python3.7/site-packages/inplace_abn/functions.py in inplace_abn(x, weight, bias, running_mean, running_var, training, momentum, eps, activation, activation_param)
152 training=True, momentum=0.1, eps=1e-05, activation="leaky_relu", activation_param=0.01):
153 return InPlaceABN.apply(x, weight, bias, running_mean, running_var,
--> 154 training, momentum, eps, activation, activation_param, None)
155
156
~/.conda/envs/py37/lib/python3.7/site-packages/inplace_abn/functions.py in forward(ctx, x, weight, bias, running_mean, running_var, training, momentum, eps, activation, activation_param, group)
83
84 # Update running stats
---> 85 count_ = count.to(dtype=var.dtype)
86 running_mean.mul_((1 - ctx.momentum)).add_(ctx.momentum * mean)
87 running_var.mul_((1 - ctx.momentum)).add_(ctx.momentum * var * count_ / (count_ - 1))
RuntimeError: CUDA error: no kernel image is available for execution on the device
These are my settings:
GPU Device: Tesla V100-SXM2-16GB
GPU mounted at: cuda:0
PyTorch Version: 1.4.0
Torchvision Version: 0.5.0
CUDA version: 10.0
Any suggestions on how to solve it? I have no problems training models without inplace-abn.
@mateuszwosinski The error you are encountering means that InPlace ABN was compiled for a different GPU architecture than the one you are using. Did you, by any chance, install InPlace ABN on a different machine than the one you are using for training?
In any case, a possible solution for your issue is to re-install after setting the TORCH_CUDA_ARCH_LIST environment variable to match your GPU (compute capability 7.0 for the Tesla V100), e.g. by running: TORCH_CUDA_ARCH_LIST="7.0" pip install inplace-abn
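If you are unsure which value to use, a small script along these lines may help. It is only a sketch: it assumes PyTorch is importable on the training machine, and the helper names arch_list_from_capabilities and detect_capabilities are made up for illustration and are not part of inplace-abn or PyTorch. It queries the compute capability of each visible GPU with torch.cuda.get_device_capability and prints a matching install command.

```python
def arch_list_from_capabilities(capabilities):
    """Turn (major, minor) tuples into a TORCH_CUDA_ARCH_LIST value.

    Hypothetical helper: duplicates are dropped and entries are joined
    with ';', one of the separators the variable accepts.
    """
    unique = sorted(set(capabilities))
    return ";".join(f"{major}.{minor}" for major, minor in unique)


def detect_capabilities():
    """Return the compute capability of each GPU visible to PyTorch.

    Returns an empty list if PyTorch is missing or no CUDA device is visible.
    """
    try:
        import torch
    except ImportError:
        return []
    if not torch.cuda.is_available():
        return []
    return [
        torch.cuda.get_device_capability(index)
        for index in range(torch.cuda.device_count())
    ]


if __name__ == "__main__":
    caps = detect_capabilities()
    if caps:
        arch_list = arch_list_from_capabilities(caps)
        print(f'TORCH_CUDA_ARCH_LIST="{arch_list}" pip install inplace-abn')
    else:
        print("No CUDA device is visible to PyTorch on this machine.")
```

Run it on the machine you train on, not the one you build on, if the two differ. Since pip may reuse a previously built wheel, uninstalling the old copy first and adding --no-cache-dir to the install command is probably a good idea.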