Some questions about RL
wangqiim commented
Thanks for your work. I ran into a problem while reproducing SkipNet and would appreciate some advice. After supervised training of the ResNet with gates, the accuracy drops sharply as soon as I switch to reinforcement learning. While debugging, I found that sampling the gates (which the policy-gradient RL step requires) has a bad effect on all the BN layers and on the parameter updates in the backward pass. How do you solve or avoid these problems?
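To make the question concrete, here is a minimal sketch of the setup I am describing, together with one thing I am considering trying: keeping the BN layers in eval mode during the RL phase so that sampled gates cannot change their running statistics. This is only an assumption on my side, not the SkipNet implementation. `GatedBlock`, `freeze_bn`, and the random reward are hypothetical names and placeholders made up for illustration.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)


class GatedBlock(nn.Module):
    """Hypothetical residual block with a sampled skip gate (not SkipNet's code)."""

    def __init__(self, channels):
        super().__init__()
        self.conv = nn.Conv2d(channels, channels, 3, padding=1, bias=False)
        self.bn = nn.BatchNorm2d(channels)
        self.gate_fc = nn.Linear(channels, 1)

    def forward(self, x):
        # Gate logits from globally average-pooled features, shape (N, 1).
        logits = self.gate_fc(x.mean(dim=(2, 3)))
        dist = torch.distributions.Bernoulli(logits=logits)
        action = dist.sample()            # hard 0/1 decision, carries no gradient
        log_prob = dist.log_prob(action)  # differentiable w.r.t. the gate parameters
        out = torch.relu(self.bn(self.conv(x)))
        # Skipped samples (action == 0) pass the input through unchanged.
        y = x + action.view(-1, 1, 1, 1) * out
        return y, log_prob.squeeze(1)


def freeze_bn(model):
    """Put every BN layer in eval mode so its running statistics stay fixed."""
    for m in model.modules():
        if isinstance(m, (nn.BatchNorm1d, nn.BatchNorm2d, nn.BatchNorm3d)):
            m.eval()


block = GatedBlock(8)
block.train()
freeze_bn(block)  # must be re-applied after every later call to .train()

before = block.bn.running_mean.clone()
x = torch.randn(4, 8, 16, 16)
y, log_prob = block(x)

# Placeholder reward; a real setup would derive it from accuracy and compute cost.
reward = torch.randn(4)
loss = -(log_prob * reward).mean()  # REINFORCE-style policy-gradient loss
loss.backward()
after = block.bn.running_mean.clone()

print(torch.equal(before, after))
```

With the BN layers frozen like this, the forward pass with sampled gates leaves the running mean untouched, and the policy-gradient loss still produces gradients for the gate parameters. I do not know whether this is enough to stop the accuracy drop, or whether it matches what was done in the paper.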