Convergence
tolusophy opened this issue · 0 comments
This issue concerns how the architecture should be arranged.
In deeper networks, for example ResNet18, changing all convolutional layers to kervolutional layers (with degree 3) makes the network produce a NaN loss, which suggests that the activations or gradients are exploding. How should the architecture be arranged to give good results?
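
To illustrate what I suspect is happening, here is a minimal pure-Python sketch. It is not the repository's code: it assumes a polynomial kernel of the form (x·w + c)^degree with c = 1, and the function names, input signal, and weights are made up for illustration. It stacks a few such layers without any normalisation and records the largest activation magnitude after each one.

```python
# Illustrative sketch only: a 1-D polynomial "kervolution" under assumed settings,
# not the implementation used in this repository.

def poly_kervolution_1d(x, w, degree=3, c=1.0):
    """Slide w over x and return (patch . w + c) ** degree at each position."""
    k = len(w)
    out = []
    for i in range(len(x) - k + 1):
        dot = sum(x[i + j] * w[j] for j in range(k))
        out.append((dot + c) ** degree)
    return out


def max_abs_per_layer(x, w, depth, degree=3, c=1.0):
    """Apply the layer `depth` times with no normalisation; record max |activation|."""
    peaks = []
    for _ in range(depth):
        x = poly_kervolution_1d(x, w, degree, c)
        peaks.append(max(abs(v) for v in x))
    return peaks


signal = [1.0] * 16          # hypothetical input
weights = [0.5, 0.5, 0.5]    # hypothetical kernel weights
peaks = max_abs_per_layer(signal, weights, depth=4)
print(peaks)
```

With these made-up values the peak magnitude is roughly cubed at every layer, so after only four stacked layers it already exceeds the largest finite float32 value. If the same thing happens inside ResNet18, that would be consistent with the NaN loss I am seeing.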