declare-lab/Multimodal-Infomax

Why does this issue occur when the GPU is available and the environment is configured properly?

PhilrainV opened this issue · 1 comment

```
torch==1.7.1
cuda==11.0
cuda.is_available==True
```

```
Start loading the data....
train
Training data loaded!
valid
Validation data loaded!
test
Test data loaded!
Finish loading the data....
```
```
[W ..\torch\csrc\autograd\python_anomaly_mode.cpp:104] Warning: Error detected in LogdetBackward. Traceback of forward call that caused the error:
  File "main.py", line 62, in <module>
    solver.train_and_eval()
  File "D:\Project\CSCL\paper_test\Main_structure\src\solver.py", line 282, in train_and_eval
    train_loss = train(model, optimizer_main, criterion, 1)
  File "D:\Project\CSCL\paper_test\Main_structure\src\solver.py", line 158, in train
    bert_sent, bert_sent_type, bert_sent_mask, y, mem)
  File "D:\CODE_env\Anaconda\anaconda3\envs\CSCL\lib\site-packages\torch\nn\modules\module.py", line 727, in _call_impl
    result = self.forward(*input, **kwargs)
  File "D:\Project\CSCL\paper_test\Main_structure\src\model.py", line 107, in forward
    lld_ta, ta_pn, H_ta = self.mi_ta(x=text, y=acoustic, labels=y, mem=mem['ta'])
  File "D:\CODE_env\Anaconda\anaconda3\envs\CSCL\lib\site-packages\torch\nn\modules\module.py", line 727, in _call_impl
    result = self.forward(*input, **kwargs)
  File "D:\Project\CSCL\paper_test\Main_structure\src\modules\encoders.py", line 193, in forward
    H = 0.25 * (torch.logdet(sigma_pos) + torch.logdet(sigma_neg))
 (function print_stack)
Traceback (most recent call last):
  File "main.py", line 62, in <module>
    solver.train_and_eval()
  File "D:\Project\CSCL\paper_test\Main_structure\src\solver.py", line 282, in train_and_eval
    train_loss = train(model, optimizer_main, criterion, 1)
  File "D:\Project\CSCL\paper_test\Main_structure\src\solver.py", line 189, in train
    loss.backward()
  File "D:\CODE_env\Anaconda\anaconda3\envs\CSCL\lib\site-packages\torch\tensor.py", line 221, in backward
    torch.autograd.backward(self, gradient, retain_graph, create_graph)
  File "D:\CODE_env\Anaconda\anaconda3\envs\CSCL\lib\site-packages\torch\autograd\__init__.py", line 132, in backward
    allow_unreachable=True)  # allow_unreachable flag
RuntimeError: cusolver error: 7, when calling cusolverDnCreate(handle)
```

I have never encountered this before; it looks like a PyTorch internal issue. You could ask on the PyTorch forums for a possible solution.
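Since the crash happens when `LogdetBackward` tries to create a cuSOLVER handle, one possible workaround (an assumption, not a fix confirmed in this thread) is to evaluate the `logdet` terms on CPU, which bypasses cuSOLVER entirely while keeping autograd intact. The helper name `logdet_cpu` and the toy matrices below are hypothetical stand-ins for the `sigma_pos` / `sigma_neg` tensors in `encoders.py`:

```python
import torch

def logdet_cpu(mat: torch.Tensor) -> torch.Tensor:
    # Hypothetical workaround: compute the log-determinant on CPU so the
    # cuSOLVER handle is never created, then move the scalar back to the
    # tensor's original device. The CPU round-trip is differentiable.
    return torch.logdet(mat.cpu()).to(mat.device)

# Toy SPD matrices standing in for sigma_pos / sigma_neg.
sigma_pos = (torch.eye(4) * 2.0).requires_grad_()
sigma_neg = torch.eye(4) * 3.0

H = 0.25 * (logdet_cpu(sigma_pos) + logdet_cpu(sigma_neg))
H.backward()  # gradients flow back through the CPU computation
```

This trades GPU speed for stability on the `logdet` call only; the rest of the model stays on the GPU. Note that `cusolver error: 7` can also indicate a driver/toolkit mismatch, so checking that the installed NVIDIA driver supports CUDA 11.0 is worth doing first.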