Sonic A2C not working for Pong

Question

slerman12 opened this issue 5 years ago · 5 comments

I'm trying to test whether the A2C code for Sonic can be used to train an agent on another environment. I replaced the Sonic environments with 8 copies of Pong and varied the number of epochs, mini-batches, and nsteps, but no matter what I tried, I could not get it to learn Pong. Is there a reason this implementation won't train on Pong? Am I missing some important parameter? Could you test it yourself and let me know? The only change I made was to replace the environments in agent.py with a Pong make_env() that uses frame stacking and preprocessing.
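For readers unfamiliar with the frame stacking and preprocessing mentioned above, here is a minimal pure-Python sketch of the idea. It is not the code from agent.py or the actual make_env(); the names `to_grayscale`, `downsample`, and `FrameStack` are hypothetical, and frames are assumed to be nested lists of RGB tuples rather than real emulator output.

```python
from collections import deque


def to_grayscale(frame):
    """Average the RGB channels of an H x W frame of (r, g, b) tuples."""
    return [[sum(pixel) / 3.0 for pixel in row] for row in frame]


def downsample(frame, factor=2):
    """Keep every `factor`-th row and column of a 2D frame."""
    return [row[::factor] for row in frame[::factor]]


class FrameStack:
    """Hypothetical helper: keeps the k most recent preprocessed frames."""

    def __init__(self, k=4):
        self.k = k
        self.frames = deque(maxlen=k)

    def reset(self, first_frame):
        # At episode start, fill the stack with copies of the first frame.
        processed = downsample(to_grayscale(first_frame))
        self.frames.clear()
        for _ in range(self.k):
            self.frames.append(processed)
        return list(self.frames)

    def step(self, frame):
        # Appending to a full deque silently drops the oldest frame.
        self.frames.append(downsample(to_grayscale(frame)))
        return list(self.frames)
```

When swapping environments, it may be worth checking that the stacked observation has the same shape and value range that the policy network was built for.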

Answer 1 · 2019-03-24T10:15:30.000Z

Hi, how many episodes did you run? And may I know your total reward for each episode?

Answer 2 · 2019-03-25T12:49:15.000Z

If I recall correctly, 100 updates on the default settings was not enough to make any progress. The reward did not go up from -20 per episode.

Answer 3 · 2019-03-26T02:35:33.000Z

Yes, my situation is very similar. The rewards are around -20 for each episode. I think it is because 100 updates are far from enough. We need to train for at least 1000 episodes. Training on a GPU will be better. Good luck!

Answer 4 · 2019-03-26T13:32:12.000Z

That surprises me, since the trained Sonic model required only 270 updates. That’s already processing millions of states, which should be enough for Pong, shouldn’t it?
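The "millions of states" estimate can be checked with a quick calculation. This is a rough sketch: the 270 updates and 8 parallel environments come from this thread, while `n_steps=2048` is an assumed rollout length per environment per update and should be replaced with whatever value the training script actually uses.

```python
def transitions_processed(updates, n_envs, n_steps):
    """Total environment transitions seen after a given number of updates."""
    return updates * n_envs * n_steps


# 270 updates and 8 environments are from the thread; n_steps=2048 is an assumption.
total = transitions_processed(updates=270, n_envs=8, n_steps=2048)
print(total)  # 4423680
```

Under the same assumption, 100 updates would correspond to about 1.6 million transitions.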

Answer 5 · 2019-03-26T19:11:13.000Z

I'll try to run 1000 updates and get back to you. What if it still doesn't play Pong then? I'm hoping to use this as a baseline for my research with transfer learning. Would you not recommend that?