fxia22/stn.pytorch

NaN gradients with BCHW layout

jj0mst opened this issue

I recently tried to use the new BCHW functions with my network, since I always use that layout and it simplifies my code.

I noticed that all the gradients of my convolutional layers are now NaN, which also fills the weights with NaN after the parameter update.
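
This is roughly how I'm checking for the NaNs (a minimal sketch in plain PyTorch, nothing specific to this repo; `model` and `loss` are placeholders for my own network and loss):

```python
import torch

def report_nan_grads(model):
    """Print every parameter whose gradient contains NaN."""
    for name, param in model.named_parameters():
        if param.grad is not None and torch.isnan(param.grad).any():
            print("NaN gradient in:", name)

# usage, assuming `model` and `loss` already exist:
# loss.backward()
# report_nan_grads(model)
```

Every convolutional layer's weight and bias shows up in that report.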

I'm sure I made all the necessary conversions, and I get no errors such as tensor size inconsistencies or anything else.
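
For reference, these are the kinds of conversions I mean (a sketch with made-up shapes, not my exact code). One thing I'm wondering about is contiguity: `permute()` only returns a view with swapped strides, and a C/CUDA kernel that assumes contiguous memory could silently read garbage that later surfaces as NaN gradients.

```python
import torch

x_bhwd = torch.randn(8, 32, 32, 3)  # B, H, W, D (channels last)

# BHWD -> BCHW; .contiguous() actually rearranges the memory,
# since permute() by itself only changes the strides.
x_bchw = x_bhwd.permute(0, 3, 1, 2).contiguous()

# BCHW -> BHWD, for anything that still expects the old layout.
x_back = x_bchw.permute(0, 2, 3, 1).contiguous()
```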