ndrplz/ConvLSTM_pytorch

How to add dropout into the ConvLSTM

Opened this issue · 5 comments

Sorry, but I added a dropout layer after the convolution, and the GPU memory exploded. How can I solve this?

Hi @JokerCM,

Your description is a bit vague; further details would be helpful. A short working code snippet reproducing the problem would be best.

Also, as a sanity check: make sure you didn't inadvertently change the batch size, image size, the GPU you're training on, etc.

Hi @ndrplz,
Thanks for replying. Actually, I was doing regression with your code on one-dimensional biomedical data, not images, but I added one dimension to the input so I could keep the network code unchanged. The network worked, but there was some overfitting, so I started thinking about adding a dropout layer. I found that once I added an nn.Dropout layer after the self.conv layer in the ConvLSTMCell, the GPU memory would explode; once I deleted the nn.Dropout layer, the memory usage was only about 1 GB. So I want to ask where I should put the dropout. The change I made is roughly the sketch below.
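The modification looked roughly like this (a sketch based on your ConvLSTMCell; the dropout lines are the only change I made):

```python
import torch
import torch.nn as nn

class ConvLSTMCell(nn.Module):
    """Sketch based on the repo's ConvLSTMCell; only the dropout lines are new."""

    def __init__(self, input_dim, hidden_dim, kernel_size, bias=True):
        super().__init__()
        self.hidden_dim = hidden_dim
        padding = kernel_size[0] // 2, kernel_size[1] // 2
        # Single convolution computing all four gate pre-activations at once.
        self.conv = nn.Conv2d(input_dim + hidden_dim, 4 * hidden_dim,
                              kernel_size, padding=padding, bias=bias)
        self.dropout = nn.Dropout(p=0.5)  # the added layer that blew up memory

    def forward(self, input_tensor, cur_state):
        h_cur, c_cur = cur_state
        combined = torch.cat([input_tensor, h_cur], dim=1)
        # Dropout applied right after self.conv, as described above.
        combined_conv = self.dropout(self.conv(combined))
        cc_i, cc_f, cc_o, cc_g = torch.split(combined_conv, self.hidden_dim, dim=1)
        i, f, o = torch.sigmoid(cc_i), torch.sigmoid(cc_f), torch.sigmoid(cc_o)
        g = torch.tanh(cc_g)
        c_next = f * c_cur + i * g
        h_next = o * torch.tanh(c_next)
        return h_next, c_next
```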
Thanks again.

In the original paper, dropout was proposed as a regularization technique for the fully connected layers. So if your task is classification, I'd suggest adding dropout to the last fully connected layers, e.g. as in the sketch below.
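Something along these lines (a minimal sketch; the layer sizes are placeholders, not from your model):

```python
import torch.nn as nn

# Minimal sketch: dropout only between the fully connected layers of a
# classification head. The sizes below are placeholders.
classifier = nn.Sequential(
    nn.Flatten(),                  # flatten the last ConvLSTM hidden state
    nn.Linear(64 * 8 * 8, 256),
    nn.ReLU(),
    nn.Dropout(p=0.5),             # regularizes only the fully connected part
    nn.Linear(256, 10),            # e.g. 10 classes
)
```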

Alternatively, you should also be able to apply 2D dropout to drop random channels in the activations coming out of convolutional layers.
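For example (a quick sketch, not tied to the repo):

```python
import torch
import torch.nn as nn

drop2d = nn.Dropout2d(p=0.2)        # zeroes entire channels at random
feats = torch.randn(4, 64, 16, 16)  # (batch, channels, H, W) conv activations
feats = drop2d(feats)               # whole 16x16 feature maps are dropped together
```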

Also, did you already take a look at related threads on the web (e.g. this, this, etc.)?

@ndrplz Thank you so much. I've now added the dropout before the fully connected layer and the problem is solved; the program runs normally. But I used nn.Dropout rather than 2D dropout, because after adding one dimension to my biomedical data the matrix shape is [1, 200], and I was afraid that with 2D dropout the first dimension would be dropped. Thanks again for the last two links about dropout eating memory; I'm reading them now.
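For reference, what I ended up with is roughly this (the Linear head is just illustrative, not my actual model):

```python
import torch
import torch.nn as nn

drop = nn.Dropout(p=0.5)     # element-wise dropout, shape-preserving
x = torch.randn(1, 200)      # my data after adding the extra dimension
x = drop(x)                  # still [1, 200]: no dimension is dropped
fc = nn.Linear(200, 1)       # illustrative regression head
y = fc(x)                    # -> [1, 1]
```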

Glad it helped! Sure thing: if your activation is a 1D vector, plain element-wise dropout (nn.Dropout) is certainly fine.