bfshi/DGAM-Weakly-Supervised-Action-Localization

CUDA error: CUBLAS_STATUS_EXECUTION_FAILED when calling cublasSgemm

dongfengxijian opened this issue · 2 comments

When I run 'python train_all.py', I get the error "CUDA error: CUBLAS_STATUS_EXECUTION_FAILED when calling cublasSgemm( handle, opa, opb, m, n, k, &alpha, a, lda, b, ldb, &beta, c, ldc)". The full traceback is:
Traceback (most recent call last):
  File "train_all.py", line 221, in <module>
    main()
  File "train_all.py", line 164, in main
    config.TRAIN.TEST_EVERY_EPOCH, 'rgb')
  File "/data/phd/PycharmProjects/DGAM-Weakly-Supervised-Action-Localization-master/lib/core/function.py", line 65, in train
    attention = model(video_feature, 'att')  # [batch_size, seg_num, 1]
  File "/data/phd/Software/anaconda3/envs/DGAM/lib/python3.7/site-packages/torch/nn/modules/module.py", line 532, in __call__
    result = self.forward(*input, **kwargs)
  File "/data/phd/Software/anaconda3/envs/DGAM/lib/python3.7/site-packages/torch/nn/parallel/data_parallel.py", line 150, in forward
    return self.module(*inputs[0], **kwargs[0])
  File "/data/phd/Software/anaconda3/envs/DGAM/lib/python3.7/site-packages/torch/nn/modules/module.py", line 532, in __call__
    result = self.forward(*input, **kwargs)
  File "/data/phd/PycharmProjects/DGAM-Weakly-Supervised-Action-Localization-master/lib/models/model.py", line 68, in forward
    return self.att_head(x)
  File "/data/phd/Software/anaconda3/envs/DGAM/lib/python3.7/site-packages/torch/nn/modules/module.py", line 532, in __call__
    result = self.forward(*input, **kwargs)
  File "/data/phd/PycharmProjects/DGAM-Weakly-Supervised-Action-Localization-master/lib/models/model.py", line 39, in forward
    x = self.fc1(x)
  File "/data/phd/Software/anaconda3/envs/DGAM/lib/python3.7/site-packages/torch/nn/modules/module.py", line 532, in __call__
    result = self.forward(*input, **kwargs)
  File "/data/phd/Software/anaconda3/envs/DGAM/lib/python3.7/site-packages/torch/nn/modules/linear.py", line 87, in forward
    return F.linear(input, self.weight, self.bias)
  File "/data/phd/Software/anaconda3/envs/DGAM/lib/python3.7/site-packages/torch/nn/functional.py", line 1372, in linear
    output = input.matmul(weight.t())
RuntimeError: CUDA error: CUBLAS_STATUS_EXECUTION_FAILED when calling cublasSgemm( handle, opa, opb, m, n, k, &alpha, a, lda, b, ldb, &beta, c, ldc)
(DGAM) phd@amax:/data/phd/PycharmProjects/DGAM-Weakly-Supervised-Action-Localization-master$
How can I solve this error?

bfshi commented

Which version of PyTorch are you using? Maybe you can try upgrading it to the latest version.
Similar issue: allenai/allennlp#5064 (comment)

Thank you for your recommendation. The cause was that CUDA 10.1 does not support the RTX 3090. After I switched to CUDA 11.1, it works normally. Thank you again for your guidance!
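
For anyone who hits the same error, here is a minimal sketch of a pre-flight check that compares the CUDA version PyTorch was built with against the GPU's compute capability. The helper names (`MIN_CUDA_FOR_CAPABILITY`, `parse_version`, `cuda_supports_gpu`) are made up for illustration and are not part of this repo or of PyTorch. The table is deliberately partial and only lists a few architectures; the RTX 3090 entry (compute capability 8.6, CUDA 11.1 or newer) matches the fix described above.

```python
# Minimum CUDA toolkit release that can target a given compute capability.
# Partial, illustrative table (an assumption of this sketch); extend as needed.
MIN_CUDA_FOR_CAPABILITY = {
    (7, 0): (9, 0),   # Volta, e.g. V100
    (7, 5): (10, 0),  # Turing, e.g. RTX 2080
    (8, 0): (11, 0),  # Ampere, e.g. A100
    (8, 6): (11, 1),  # Ampere, e.g. RTX 3090
}


def parse_version(text):
    """Turn a version string such as '10.1' or '10.1.243' into (10, 1)."""
    parts = text.split(".")
    major = int(parts[0])
    minor = int(parts[1]) if len(parts) > 1 else 0
    return major, minor


def cuda_supports_gpu(cuda_version, capability):
    """Return True or False, or None if the capability is not in the table."""
    required = MIN_CUDA_FOR_CAPABILITY.get(tuple(capability))
    if required is None:
        return None
    return parse_version(cuda_version) >= required


if __name__ == "__main__":
    try:
        import torch
    except ImportError:
        torch = None

    if torch is not None and torch.cuda.is_available():
        built_with = torch.version.cuda                     # e.g. "10.1"
        capability = torch.cuda.get_device_capability(0)    # e.g. (8, 6)
        name = torch.cuda.get_device_name(0)
        verdict = cuda_supports_gpu(built_with, capability)
        print(f"PyTorch {torch.__version__}, CUDA {built_with}, "
              f"{name} (sm_{capability[0]}{capability[1]}): supported={verdict}")
    else:
        print("PyTorch with a visible CUDA device was not found.")
```

With the setup from this issue, `cuda_supports_gpu("10.1", (8, 6))` evaluates to `False`, and it becomes `True` once the build uses CUDA 11.1. A result of `None` only means the GPU is missing from the table, not that it is unsupported.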