Running AGGCN but came across an error (cuda runtime error (38))

Question

hogunpark opened this issue 5 years ago · 2 comments

Hello. Thank you for sharing your code.

My environment is Python 3.6.8, PyTorch 0.4.1, and CUDA 9.0.

I got the following error:

THCudaCheck FAIL file=/opt/conda/conda-bld/pytorch_1535491974311/work/aten/src/THC/THCGeneral.cpp line=74 error=38 : no CUDA-capable device is detected
Traceback (most recent call last):
  File "train.py", line 119, in <module>
    trainer = GCNTrainer(opt, emb_matrix=emb_matrix)
  File "/code/AGGCN/model/trainer.py", line 67, in __init__
    self.model = GCNClassifier(opt, emb_matrix=emb_matrix)
  File "/code/AGGCN/model/aggcn.py", line 22, in __init__
    self.gcn_model = GCNRelationModel(opt, emb_matrix=emb_matrix)
  File "/code/AGGCN/model/aggcn.py", line 47, in __init__
    self.gcn = AGGCN(opt, embeddings)
  File "/code/AGGCN/model/aggcn.py", line 129, in __init__
    self.layers.append(GraphConvLayer(opt, self.mem_dim, self.sublayer_first))
  File "/code/AGGCN/model/aggcn.py", line 205, in __init__
    self.weight_list = self.weight_list.cuda()
  File "/etc/anaconda3/envs/aggcn/lib/python3.6/site-packages/torch/nn/modules/module.py", line 258, in cuda
    return self._apply(lambda t: t.cuda(device))
  File "/etc/anaconda3/envs/aggcn/lib/python3.6/site-packages/torch/nn/modules/module.py", line 185, in _apply
    module._apply(fn)
  File "/etc/anaconda3/envs/aggcn/lib/python3.6/site-packages/torch/nn/modules/module.py", line 191, in _apply
    param.data = fn(param.data)
  File "/etc/anaconda3/envs/aggcn/lib/python3.6/site-packages/torch/nn/modules/module.py", line 258, in <lambda>
    return self._apply(lambda t: t.cuda(device))
RuntimeError: cuda runtime error (38) : no CUDA-capable device is detected at /opt/conda/conda-bld/pytorch_1535491974311/work/aten/src/THC/THCGeneral.cpp:74

My GPU is idle, and all other example code runs fine on the GPU.
Have you come across this error? Thank you.

Answer 1 · 2019-12-04T20:19:19.000Z

In the running script, the CUDA ID should be changed accordingly. Issue closed.
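
Error 38 generally means the process cannot see any GPU at all, which can happen when `CUDA_VISIBLE_DEVICES` names a device index that does not exist on the machine. Below is a minimal sketch of pinning the process to an existing GPU, assuming the run script selects the device through that environment variable (this thread does not confirm how AGGCN does it). `select_visible_gpu` is a hypothetical helper, not part of AGGCN, and GPU index 0 is only an example.

```python
import os


def select_visible_gpu(gpu_id, env=None):
    """Expose a single physical GPU to CUDA via CUDA_VISIBLE_DEVICES.

    Hypothetical helper, not part of AGGCN. It has to run before the
    first CUDA call in the process, because CUDA reads the variable
    only when it initializes.
    """
    if int(gpu_id) < 0:
        raise ValueError("gpu_id must be a non-negative device index")
    env = os.environ if env is None else env
    env["CUDA_VISIBLE_DEVICES"] = str(gpu_id)
    return env["CUDA_VISIBLE_DEVICES"]


if __name__ == "__main__":
    # Assumption: the machine has a GPU with index 0 (check with nvidia-smi).
    select_visible_gpu(0)
    try:
        import torch  # imported only after the variable is set

        print("CUDA available:", torch.cuda.is_available())
        print("Visible devices:", torch.cuda.device_count())
    except ImportError:
        print("PyTorch is not installed in this environment")
```

The shell equivalent is to prefix the command, for example `CUDA_VISIBLE_DEVICES=0 python train.py`. If the device count printed above is 0, the chosen index likely does not match any GPU listed by `nvidia-smi`.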

Answer 2 · 2021-03-07T12:46:06.000Z

"In the running script, the CUDA ID should be changed accordingly. Issue closed."

Where is the CUDA ID specified?
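
The thread does not say which file holds the setting. One way to look for it is to scan the checkout for common GPU-selection idioms. The sketch below assumes the ID appears in a `.sh` or `.py` file under one of the patterns listed; the patterns and the `find_gpu_settings` helper are illustrative guesses, not AGGCN specifics.

```python
import os
import re

# Idioms that commonly select a GPU. These are guesses, not AGGCN specifics.
GPU_PATTERNS = re.compile(r"CUDA_VISIBLE_DEVICES|set_device|--gpu|cuda:\d")


def find_gpu_settings(root, suffixes=(".sh", ".py")):
    """Return (path, line_number, line) for every line matching GPU_PATTERNS."""
    hits = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in sorted(filenames):
            if not name.endswith(suffixes):
                continue  # skip files that are neither scripts nor Python
            path = os.path.join(dirpath, name)
            with open(path, encoding="utf-8", errors="replace") as handle:
                for number, line in enumerate(handle, start=1):
                    if GPU_PATTERNS.search(line):
                        hits.append((path, number, line.strip()))
    return hits
```

Calling `find_gpu_settings(".")` from the repository root returns candidate lines to inspect. A line such as `CUDA_VISIBLE_DEVICES=<n> python train.py ...` in a shell script would be the place to change the ID.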