SivilTaram/Persona-Dialogue-Generation

OOM error during training

Lireanstar opened this issue · 5 comments


When running python train_transmitter.py, the following warning appears. My local setup is 2 × 2080 Ti (12 GB VRAM):

WARNING: ran out of memory, skipping batch. if this happens frequently, decrease batchsize or truncate the inputs to the model.

Can I specify which GPU the program trains on? I only see the gpu = 0, 1 setting, but training always runs on the first card. What other parameters can I change to get training to run?

@Libincn-HNU Please read this line:

return 10, 1e-4, 'gpt_custom', 1.0

You could change the batch size from 10 to 5; that should work for your setup.
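For reference, a minimal sketch of that change, assuming the tuple above is returned by a hyper-parameter helper in train_transmitter.py (the function and the name of the last value below are hypothetical; only the first value, the batch size, changes):

def get_train_settings():
    # Halve the batch size from 10 to 5 so the model fits in the 2080 Ti's
    # memory; the remaining values are left exactly as in the original line.
    batch_size = 5
    learning_rate = 1e-4
    model_name = 'gpt_custom'
    last_value = 1.0  # hypothetical name for the final item in the tuple
    return batch_size, learning_rate, model_name, last_value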

Now it works, thanks!
[ time:17643.0s total_exs:649110 epochs:4.94 ] {'exs': 550, 'token_acc': 27.41, 'loss': 2.626, 'pred': 0.8491, 'ppl': 13.82}
By the way, can I use 2 cards to run this .py file?

@Libincn-HNU Yes, you could (though I have not tried it myself). You might check the ParlAI documentation to see how to train an agent on two cards.
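Not specific to this repo, but if the goal is simply to pin the process to a particular card (or expose both), one common approach for any PyTorch script is to set CUDA_VISIBLE_DEVICES before CUDA is initialized, for example:

import os

# Expose only the second physical GPU to this process; PyTorch will then see
# it as cuda:0. Use "0,1" to expose both cards. This must be set before any
# CUDA initialization, i.e. before torch is imported or used.
os.environ["CUDA_VISIBLE_DEVICES"] = "1"

import torch
print(torch.cuda.device_count())  # prints 1 when only one card is exposed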

OK, thanks!