seungeunrho/minimalRL

CartPole PPO training: reward drops

SeungyounShin opened this issue · 1 comments

If you train PPO long enough (around 3000 episodes or more), the reward drops sharply (for example, from 500 to 30).

@SeungyounShin
PPO is an on-policy algorithm. Updating the agent on highly correlated trajectories can make it worse. I fixed that in #45 for you.
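
For anyone reading later, one common way to reduce correlation between the samples in a PPO update is to pool transitions from several rollouts and shuffle them into minibatches. The sketch below only illustrates that general idea and is not necessarily what #45 does; `make_minibatches` and the toy `(rollout_id, step)` transitions are hypothetical names made up for this example.

```python
import random


def make_minibatches(rollouts, batch_size, seed=None):
    """Flatten several rollouts and split them into shuffled minibatches.

    Hypothetical helper for illustration; not taken from the repository.
    """
    # Pool transitions from all rollouts so no single trajectory
    # dominates a gradient step.
    transitions = [t for rollout in rollouts for t in rollout]

    # Shuffle to break the temporal ordering inside each rollout.
    rng = random.Random(seed)
    rng.shuffle(transitions)

    # Slice the shuffled pool into fixed-size chunks (the last may be smaller).
    return [transitions[i:i + batch_size]
            for i in range(0, len(transitions), batch_size)]


# Toy data: 3 rollouts of 4 transitions each, tagged (rollout_id, step).
rollouts = [[(r, s) for s in range(4)] for r in range(3)]
batches = make_minibatches(rollouts, batch_size=4, seed=0)
```

Each minibatch then mixes steps from different rollouts, so consecutive, nearly identical states are less likely to end up in the same update. The data must still come from the current policy, since PPO is on-policy.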