TobiasLee/Chinese-Hip-pop-Generation

Model does not converge after 30 epochs of training

fuweifu-vtoo opened this issue · 1 comments

Hello, I trained for 30 epochs using your configuration. The training log is as follows:
pre-training generator...
pre-training discriminator...
Start adversarial training...
epoch: 0 reward: 0.10155454874038697
epoch: 5 reward: 0.978083610534668
epoch: 10 reward: 0.9903022766113281
epoch: 15 reward: 0.993144416809082
epoch: 20 reward: 0.9884807586669921
epoch: 25 reward: 0.9824935913085937
epoch: 29 reward: 0.9786094665527344
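Even at the final epoch the reward is only about 0.97, essentially no different from early in training?
The avg printed during training also barely changes. Does avg represent the loss? How does this avg differ from the reward and penalty mentioned in the blog post?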

In our experience, the reward indeed does not change much, and the best results appear within the first 5 epochs.
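To make the plateau concrete, here is a minimal sketch that parses the log posted in this issue and measures how little the reward moves after the initial jump. It is not part of this repository: the names `LOG`, `parse_rewards`, and `plateau` are hypothetical, and the log only records every fifth epoch. Note that the highest reward is not necessarily the best checkpoint. The reply above refers to the quality of the results, which this number alone does not capture, so saving a checkpoint for each early epoch and comparing samples by hand is probably the safer approach.

```python
import re

# Reward log copied from the issue above (logged every 5 epochs).
LOG = (
    "epoch: 0 reward: 0.10155454874038697 "
    "epoch: 5 reward: 0.978083610534668 "
    "epoch: 10 reward: 0.9903022766113281 "
    "epoch: 15 reward: 0.993144416809082 "
    "epoch: 20 reward: 0.9884807586669921 "
    "epoch: 25 reward: 0.9824935913085937 "
    "epoch: 29 reward: 0.9786094665527344"
)


def parse_rewards(log):
    """Return a list of (epoch, reward) pairs found in the log text."""
    pattern = re.compile(r"epoch:\s*(\d+)\s+reward:\s*([\d.]+)")
    return [(int(epoch), float(reward)) for epoch, reward in pattern.findall(log)]


rewards = parse_rewards(LOG)

# Epoch with the highest logged reward (not necessarily the best samples).
best_epoch, best_reward = max(rewards, key=lambda pair: pair[1])

# Rewards from epoch 5 onward, i.e. after the initial jump from about 0.10.
plateau = [reward for epoch, reward in rewards if epoch >= 5]
spread = max(plateau) - min(plateau)

print(f"highest logged reward: epoch {best_epoch}, reward {best_reward:.4f}")
print(f"reward spread from epoch 5 onward: {spread:.4f}")
```

On this log the spread from epoch 5 onward is under 0.02, which matches the observation that the reward is nearly flat once adversarial training is underway.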