OOM error during training
Lireanstar opened this issue · 5 comments
When I run `python train_transmitter.py`, I get:

WARNING: ran out of memory, skipping batch. if this happens frequently, decrease batchsize or truncate the inputs to the model

My local setup is 2 × 2080Ti with 12 GB of VRAM.
Can the program be told to train on a specific GPU? I only see the gpu = 0, 1 setting, but training always runs on the first GPU. What else can I change to get training to run?
@Libincn-HNU Please read this line: you could change batch_size from 10 to 5. It will work for your case.
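For context, the warning in the original report comes from a common PyTorch pattern: catch the out-of-memory `RuntimeError` around a training step and skip that batch instead of crashing. A minimal sketch of that pattern (hypothetical, not this repo's actual code; `fake_step` is a stand-in for a real forward/backward pass):

```python
# Hypothetical sketch of the "skip batch on OOM" pattern behind the warning.
# PyTorch raises RuntimeError("CUDA out of memory ...") when allocation fails;
# the training loop catches it, logs the warning, and moves on.

def run_batch_safely(step_fn, batch):
    """Run one training step; skip the batch if it triggers an OOM error."""
    try:
        return step_fn(batch)
    except RuntimeError as e:
        if "out of memory" in str(e):
            print("WARNING: ran out of memory, skipping batch. "
                  "if this happens frequently, decrease batchsize "
                  "or truncate the inputs to the model")
            return None  # batch skipped, training continues
        raise  # any other RuntimeError is a real bug

# Demo with a stub step function that "OOMs" on large batches.
def fake_step(batch):
    if len(batch) > 5:
        raise RuntimeError("CUDA out of memory. Tried to allocate ...")
    return sum(batch)

print(run_batch_safely(fake_step, [1, 2, 3]))        # small batch: returns 6
print(run_batch_safely(fake_step, list(range(10))))  # big batch: skipped, None
```

Lowering batch_size (as suggested above) simply makes the OOM branch fire less often; truncating long inputs has the same effect.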
Now it works, thanks!
[ time:17643.0s total_exs:649110 epochs:4.94 ] {'exs': 550, 'token_acc': 27.41, 'loss': 2.626, 'pred': 0.8491, 'ppl': 13.82}
By the way, can I use 2 cards to run this .py file?
@Libincn-HNU Yes, you could (but I have not tried it lol). Maybe you can check the ParlAI documentation to see how to train an agent on two cards.
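On the earlier question about picking a GPU: a common, repo-independent way to pin a process to one card is the `CUDA_VISIBLE_DEVICES` environment variable (the invocations below are hypothetical; they assume the script just uses whatever CUDA exposes as device 0):

```shell
# CUDA_VISIBLE_DEVICES controls which physical GPUs the process can see.
# With CUDA_VISIBLE_DEVICES=1, physical GPU 1 appears as cuda:0 in-process,
# so a script hard-coded to gpu 0 will actually run on the second card.
CUDA_VISIBLE_DEVICES=1 python train_transmitter.py

# Expose both cards (e.g. for a data-parallel run):
CUDA_VISIBLE_DEVICES=0,1 python train_transmitter.py
```

This only restricts visibility; whether training actually uses both cards still depends on the script (e.g. ParlAI's multi-GPU options).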
OK, thanks!