about the learning rate
In this code, the parameters fusion_weights and fusion_bias are not included in the optimizer. Don't they need to be learned? Thanks!
Good point... It's been quite a while since I worked on this project. As I recall, there should be no reason to exclude those two parameters from the optimizer. This is in fact a PyTorch reimplementation of our model (which was originally written in Keras), so I might have made a mistake during the reimplementation.
Feel free to try including those two parameters in the training. If you could also share how performance compares with and without those two parameters in the optimizer, I'd very much appreciate it. Meanwhile, I'll try to dig into our original code and see whether we have this somewhat odd setting there as well.
@Justin1904
I just want to confirm whether these two parameters need to be learned. I have not run this on the data used in the original paper. Thanks.
Yes. Setting them as learnable should, in the worst case, not hurt performance, so I recommend that.
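For anyone landing here later, here is a minimal sketch of how one might check which trainable parameters an optimizer is skipping, and then include them. It is not the code from this repository: `TinyFusion` and `missing_from_optimizer` are hypothetical names made up for illustration, and the shapes and the choice of Adam are assumptions. Only the parameter names `fusion_weights` and `fusion_bias` come from this thread.

```python
import torch
import torch.nn as nn


class TinyFusion(nn.Module):
    """Hypothetical stand-in model; only the two parameter names come from this thread."""

    def __init__(self, in_dim=4, out_dim=2):
        super().__init__()
        self.proj = nn.Linear(in_dim, in_dim)
        # Wrapping tensors in nn.Parameter registers them with the module,
        # so they appear in model.parameters() and model.named_parameters().
        self.fusion_weights = nn.Parameter(torch.randn(in_dim, out_dim) * 0.1)
        self.fusion_bias = nn.Parameter(torch.zeros(1, out_dim))

    def forward(self, x):
        return self.proj(x) @ self.fusion_weights + self.fusion_bias


def missing_from_optimizer(model, optimizer):
    """Return names of trainable parameters that the optimizer does not update."""
    in_optim = {id(p) for group in optimizer.param_groups for p in group["params"]}
    return [
        name
        for name, p in model.named_parameters()
        if p.requires_grad and id(p) not in in_optim
    ]


model = TinyFusion()

# An optimizer built from a submodule only: the fusion parameters are left out,
# which is the situation described in this issue.
partial = torch.optim.Adam(model.proj.parameters(), lr=1e-3)
print(missing_from_optimizer(model, partial))

# Passing model.parameters() covers every registered parameter.
full = torch.optim.Adam(model.parameters(), lr=1e-3)
print(missing_from_optimizer(model, full))
```

If the optimizer in the actual code is built from an explicit list of parameter groups, the equivalent fix would presumably be to add the two fusion parameters to one of those groups rather than switching to `model.parameters()`.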
@Justin1904 Thanks!