sair-lab/kervolution

Convergence

tolusophy opened this issue · 0 comments

This issue concerns how kervolutional layers should be arranged within an architecture.

In deeper networks, for example ResNet18, changing all convolutional layers to kervolutional layers (with degree 3) makes the network produce a NaN loss, which suggests exploding activations or gradients. How do you arrange the architecture to get good results?
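
To make the suspected explosion concrete, here is a toy sketch in plain Python. It is not the repository's implementation. It assumes the degree-3 layer behaves like a polynomial response of the form `(w*x + c)**degree` and collapses each layer to a single scalar with no normalization, so the function names, the weight `w`, the offset `c`, and the float32-style `limit` are all illustrative assumptions.

```python
import math


def poly_response(x, w, c=1.0, degree=3):
    """Scalar stand-in (assumed form) for a polynomial kernel response."""
    return (w * x + c) ** degree


def first_overflow_depth(x0, w, depth, c=1.0, degree=3, limit=3.4e38):
    """Return the 1-based layer at which |activation| first exceeds `limit`.

    `limit` is roughly the largest finite float32 value. Returns None if the
    activation stays within the limit for all `depth` stacked layers.
    """
    x = x0
    for layer in range(1, depth + 1):
        try:
            x = poly_response(x, w, c, degree)
        except OverflowError:
            # Python floats raise instead of silently becoming inf here.
            return layer
        if not math.isfinite(x) or abs(x) > limit:
            return layer
    return None


if __name__ == "__main__":
    # Hypothetical depth of 17 stacked layers, chosen only for illustration.
    print("degree 3:", first_overflow_depth(1.0, 1.0, depth=17, degree=3))
    print("degree 1:", first_overflow_depth(1.0, 1.0, depth=17, degree=1))
```

In this toy setting the cubic stack leaves the float32 range after only a handful of layers, while the linear (degree 1) stack stays finite. A real network has small initial weights and normalization layers, so the sketch only illustrates why stacking many degree-3 layers is numerically fragile; it does not reproduce the exact failure reported here.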