lcswillems/rl-starter-files

MaxPool2d preventing learning on BabyAI/MiniGrid

maximecb opened this issue · 6 comments

Dima just found an issue in the BabyAI code where using max pooling was preventing the model from learning some things. It's probably not a good idea to use max pooling to downscale twice since the input is only 7x7. I think the code might perform better (at least on BabyAI/MiniGrid), if the max pooling layers were removed:

https://github.com/lcswillems/rl-starter-files/blob/master/model.py#L30

Thank you, Maxime, for taking the time to file this issue.

Are you sure the increase in performance is not just due to the fact that removing the max pooling increases the number of parameters?

Anyway, I will remove it.

Yes, it seems we applied too much max pooling. The input is tiny (7x7), so if we max-pool twice, we're left with almost no signal.
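The shape arithmetic behind this can be sketched quickly. The layer stack below is a guess loosely modeled on the repo's `model.py` (exact kernel sizes and layer order are assumptions): a 2x2 convolution with stride 1 shrinks each side by 1, and a 2x2 max pool with stride 2 halves it.

```python
# Spatial-size arithmetic for a 7x7 MiniGrid observation.
# Assumed stack (kernel sizes are a guess, not the repo's exact model):
# Conv2d(k=2, s=1) -> MaxPool2d(k=2, s=2) -> Conv2d(k=2, s=1) -> MaxPool2d(k=2, s=2)

def conv(size, kernel=2, stride=1):
    """Output side length of a valid (no-padding) convolution."""
    return (size - kernel) // stride + 1

def pool(size, kernel=2, stride=2):
    """Output side length of a max-pooling layer."""
    return (size - kernel) // stride + 1

size = 7
size = conv(size)   # 7 -> 6
size = pool(size)   # 6 -> 3
size = conv(size)   # 3 -> 2
size = pool(size)   # 2 -> 1
print(size)         # a single 1x1 cell per channel remains
```

After the second pool each channel is reduced to a single value, which is what "almost no signal" means in practice: nearly all spatial structure of the 7x7 grid has been discarded before the features reach the rest of the network.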

I have just removed the MaxPool2D!

Finally, I have just reverted the modification. Removing the MaxPool2D increases the number of parameters, which can improve learning, but it also makes training much slower (for some networks, it multiplies the number of parameters by 100).
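The parameter blow-up comes from the fully-connected layer that consumes the flattened conv features: its weight count scales with the final spatial area. A rough illustration, assuming a hypothetical stack of 2x2 convs, 64 output channels, and a 64-unit hidden layer (none of these are the repo's exact numbers):

```python
# Why removing the pools inflates parameter counts: the flattened conv
# output feeds a fully-connected layer, so FC weights scale with the
# final spatial area. Layer stack and sizes are illustrative only.

CHANNELS = 64   # assumed output channels of the last conv
HIDDEN = 64     # assumed width of the following fully-connected layer

def side_after(layers, size=7):
    """Apply 'c' (2x2 conv, stride 1) and 'p' (2x2 pool, stride 2) in order."""
    for layer in layers:
        size = size - 1 if layer == "c" else size // 2
    return size

with_pools = side_after("cpcp")   # 7 -> 6 -> 3 -> 2 -> 1
no_pools   = side_after("cc")     # 7 -> 6 -> 5

fc_with    = CHANNELS * with_pools ** 2 * HIDDEN   # 64 * 1  * 64 =   4,096
fc_without = CHANNELS * no_pools ** 2 * HIDDEN     # 64 * 25 * 64 = 102,400
print(fc_without // fc_with)   # 25x more FC weights in this toy setup
```

Even this toy configuration gives a 25x larger fully-connected layer; deeper stacks or wider heads can easily reach the ~100x figure mentioned above.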

Could you clarify on which environments it is better to remove MaxPool2d and on which we should keep it?

I think removing it is generally better; it just takes much more time. But you are free to change the model as you want!