The length of sequence is too short
Hi, I am getting the warning in the title during training, but it doesn't seem to affect the training process. I am training on four 2080 Ti GPUs with batch_size set to 2, and the loss does not converge. I am not sure whether this is caused by the warning, so I would like to ask what causes it.
Hi Yilin,
Thanks for reaching out.
The warning message is just for debugging purposes.
Unfortunately, I believe the model will not converge due to the small batch size. We originally used 6*8 = 48 as the batch size. The model might converge with a smaller batch size, but 2 is just too small.
Best,
Tutian
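For readers hitting the same memory ceiling: gradient accumulation is a standard way to emulate a larger effective batch without more GPU memory. A minimal sketch, assuming a plain PyTorch training loop; the model, optimizer, and numbers below are illustrative stand-ins, not this repo's code:

```python
import torch
from torch import nn

# Gradient-accumulation sketch: a per-GPU batch of 2 on 4 GPUs, accumulated
# over 6 micro-batches, gives an effective batch of 2 * 4 * 6 = 48 per
# update, matching the 6*8 = 48 setting mentioned above.
model = nn.Linear(16, 1)                          # placeholder model
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)
accum_steps = 6

optimizer.zero_grad()
for step in range(24):                            # placeholder data loader
    x, y = torch.randn(2, 16), torch.randn(2, 1)  # micro-batch of 2
    # Divide by accum_steps so the accumulated gradient is an average,
    # not a sum, over the micro-batches.
    loss = nn.functional.mse_loss(model(x), y) / accum_steps
    loss.backward()                               # grads accumulate in .grad
    if (step + 1) % accum_steps == 0:
        optimizer.step()                          # one update per 6 micro-batches
        optimizer.zero_grad()
```

Note that accumulation averages gradients over micro-batches but does not reproduce large-batch BatchNorm statistics, so it may not fully substitute for a genuinely larger batch.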
Thanks so much for your reply. I wonder if you have gathered any statistics on the minimum batch_size this model requires. Due to hardware limitations, I can only get eight 3090s (8 × 24 GB of memory) here, and I'm not sure whether I can reproduce this work.
I think that's enough. You can try starting with a batch size of 8*4.
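A minimal sketch of what 8*4 means in data-loading terms, assuming a standard PyTorch DDP setup (the dataset and shapes below are stand-ins, not this repo's code): the DataLoader's batch_size is the per-GPU batch, so 8 processes with batch_size=4 give an effective batch of 32 per optimizer step.

```python
import torch
from torch.utils.data import DataLoader, DistributedSampler, TensorDataset

# Illustrative only: '8*4' reads as 8 GPUs x per-GPU batch 4 = effective 32.
dataset = TensorDataset(torch.randn(320, 16))     # stand-in dataset
# In a real run each of the 8 ranks builds its own sampler; rank=0 shown here.
sampler = DistributedSampler(dataset, num_replicas=8, rank=0)
loader = DataLoader(dataset, batch_size=4, sampler=sampler)  # per-GPU batch

for (x,) in loader:
    assert x.shape[0] == 4  # each rank sees 4 samples; 8 ranks -> 32 per step
    break
```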
Thanks so much~ I will close this issue : )