ICDAR2015 training RuntimeError
jwnsu opened this issue · 5 comments
jwnsu commented
I got the following error when training with 1 GPU (Ubuntu 16.04, PyTorch 1.1 / CUDA 10, 1080 Ti):
File "./train.py", line 93, in <module>
main(config, args.resume)
File "./train.py", line 60, in main
trainer.train()
File "/home/dsu/ai/fots/base/base_trainer.py", line 79, in train
result = self._train_epoch(epoch)
File "/home/dsu/ai/fots/trainer/trainer.py", line 90, in _train_epoch
training_mask)
File "/home/dsu/p36/lib/python3.6/site-packages/torch/nn/modules/module.py", line 493, in __call__
result = self.forward(*input, **kwargs)
File "/home/dsu/ai/fots/model/loss.py", line 90, in forward
recognition_loss = self.recognition_loss(y_true_recog, y_pred_recog)
File "/home/dsu/p36/lib/python3.6/site-packages/torch/nn/modules/module.py", line 493, in __call__
result = self.forward(*input, **kwargs)
File "/home/dsu/ai/fots/model/loss.py", line 61, in forward
loss = self.ctc_loss(pred[0], gt[0], pred[1], gt[1])
File "/home/dsu/p36/lib/python3.6/site-packages/torch/nn/modules/module.py", line 493, in __call__
result = self.forward(*input, **kwargs)
File "/home/dsu/p36/lib/python3.6/site-packages/torch/nn/modules/loss.py", line 1332, in forward
self.zero_infinity)
File "/home/dsu/p36/lib/python3.6/site-packages/torch/nn/functional.py", line 1813, in ctc_loss
zero_infinity)
RuntimeError: Tensor for argument #2 'targets' is on CPU, but expected it to be on GPU (while checking arguments for ctc_loss_gpu)
Has anyone encountered this error? Thanks.
CPU training seems to work fine (but is very slow).
P.S. Multi-GPU training hit a different error:
File "./train.py", line 93, in <module>
main(config, args.resume)
File "./train.py", line 60, in main
trainer.train()
File "/home/dsu/ai/fots/base/base_trainer.py", line 79, in train
result = self._train_epoch(epoch)
File "/home/dsu/ai/fots/trainer/trainer.py", line 74, in _train_epoch
mapping)
File "/home/dsu/p36/lib/python3.6/site-packages/torch/nn/parallel/data_parallel.py", line 142, in forward
for t in chain(self.module.parameters(), self.module.buffers()):
AttributeError: 'FOTSModel' object has no attribute 'buffers'
novioleo commented
@jwnsu For the first error, I think you should check your ground truth: has it been loaded onto the GPU? Have you used your_cuda_gt = your_gt.cuda()?
For the second one, I will reply later after confirming.
jwnsu commented
I tried moving gt to CUDA and got the following error:
File "/home/dsu/ai/fots/model/loss.py", line 61, in forward
gt = gt.cuda()
AttributeError: 'tuple' object has no attribute 'cuda'
novioleo commented
@jwnsu I think you need to read the error message: gt is a tuple that contains several fields, so you cannot call .cuda() on it directly. You only need to move the specific field values to CUDA.
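A minimal sketch of that suggestion, assuming gt is a plain tuple (or list) whose tensor fields need to live on the same device as the predictions. The helper name move_to_device is hypothetical and is not part of the FOTS code; it relies only on the tensor method .to(device), and leaves anything without that method untouched:

```python
def move_to_device(obj, device):
    """Recursively move tensor-like fields of a plain tuple/list to `device`.

    Hypothetical helper for illustration. Anything exposing a `.to(device)`
    method (e.g. a torch.Tensor) is moved; other values are returned as-is.
    """
    if isinstance(obj, (tuple, list)):
        # Rebuild the same container type with each field moved individually.
        return type(obj)(move_to_device(item, device) for item in obj)
    if hasattr(obj, "to"):
        return obj.to(device)
    return obj


# Assumed usage inside the recognition loss forward(), before the CTC call:
#   gt = move_to_device(gt, pred[0].device)
#   loss = self.ctc_loss(pred[0], gt[0], pred[1], gt[1])
```

Using pred[0].device rather than a hard-coded .cuda() call should keep CPU training working as well, though whether every field of gt needs to be moved depends on how the dataset builds that tuple.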
jwnsu commented
Thanks for the response, it now works fine.
feitiandemiaomi commented
@jwnsu How did you solve the second error?