Details of hyperparameters for training on smallNORB

Question

lithostark opened this issue 4 years ago · 1 comment

Hi, very interesting idea and solid work. Also thank you for sharing the code.

I'm trying to train your capsule network on smallNORB. However, the accuracy drops after epoch 70 and the loss becomes NaN after epoch 200, so the highest test accuracy I got is 95.7%.

I'm wondering whether this is caused by the lack of an lr_scheduler in the implementation.

In the paper, you mentioned that exponential decay was used, but I could not find the value of gamma. Could you provide the details needed to train your model on smallNORB? I'd appreciate it a lot!

Answer 1 · 2021-07-15T14:41:40.000Z

Hi, apologies for the delayed response!

Simply training the 1-layer tinycapsnet model with all defaults should already get you to around 97% test accuracy on smallNORB, as mentioned in the README. Note that this is achieved by training on the complete smallNORB training set.

As for the learning-rate schedule, I followed the EM routing paper's schedule; the official implementation can be found here:

https://github.com/google-research/google-research/tree/master/capsule_em

Hope that helps!
Fabio
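
For readers looking for a starting point, here is a minimal sketch of a generic exponential learning-rate decay, assuming the usual form lr = base_lr * decay_rate ** (step / decay_steps). The function name `exponential_decay` and the values of `BASE_LR`, `DECAY_STEPS`, and `DECAY_RATE` are placeholders for illustration; they are not taken from the paper, this repository, or the linked implementation, so check that code for the actual settings.

```python
def exponential_decay(base_lr, step, decay_steps, decay_rate, staircase=False):
    """Return base_lr * decay_rate ** (step / decay_steps).

    With staircase=True the exponent is floored, so the learning rate
    drops in discrete jumps every `decay_steps` steps.
    """
    exponent = step // decay_steps if staircase else step / decay_steps
    return base_lr * decay_rate ** exponent


# Placeholder values for illustration only (assumptions, not the authors' settings).
BASE_LR = 3e-3
DECAY_STEPS = 1000
DECAY_RATE = 0.96

for step in (0, 1000, 5000):
    print(step, exponential_decay(BASE_LR, step, DECAY_STEPS, DECAY_RATE))
```

If you use PyTorch, `torch.optim.lr_scheduler.ExponentialLR(optimizer, gamma=...)` multiplies the learning rate by `gamma` on every `scheduler.step()` call. When stepping once per epoch, a step-based schedule like the one above corresponds to `gamma = decay_rate ** (steps_per_epoch / decay_steps)`.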