Training problems
Closed this issue · 1 comments
Amazingren commented
Hey @wyhsirius,
I was training the model on 4 GPUs. Have you encountered the following problem?
- When I train directly from scratch, I can use batch_size=32 without any problem.
- However, when I resume training with --resume_ckpt, it fails as shown below, and I can only use a very small batch size to avoid the out-of-memory problem:
I would appreciate it if you could share some suggestions for solving this problem~
Best,
Amazingren commented
Hey guys, if you have the same problem, just change
ckpt = torch.load(resume_ckpt)
to
ckpt = torch.load(resume_ckpt, map_location='cpu')
in the trainer.py file. Without map_location, torch.load restores each tensor to the device it was saved from, so every process can end up materializing the checkpoint on the same GPU and running out of memory; loading onto the CPU first avoids that spike.
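A minimal sketch of the pattern (the helper name and state-dict keys below are illustrative, not from this repo; an in-memory buffer stands in for the checkpoint file):

```python
import io
import torch

def load_checkpoint_cpu(path_or_buffer):
    # Load every tensor onto the CPU first. Without map_location, torch.load
    # restores each tensor to the device it was saved from, which on a
    # multi-GPU resume can pile the whole checkpoint onto one GPU before
    # training even starts. State dicts loaded on CPU can still be passed
    # to model.load_state_dict() and moved to the right device afterwards.
    return torch.load(path_or_buffer, map_location='cpu')

# Usage sketch: save a tiny checkpoint to a buffer, then reload it on CPU.
buf = io.BytesIO()
torch.save({'step': 100, 'weight': torch.randn(4, 4)}, buf)
buf.seek(0)

ckpt = load_checkpoint_cpu(buf)
print(ckpt['weight'].device)  # all tensors land on the CPU
```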