About the ALE settings
ppwwyyxx opened this issue · 5 comments
I have some questions about the specific setup of the environment. I'm not sure whether you checked with the authors on these choices.
- repeat_action_probability: The ALE Manual strongly suggests using the default 0.25. Is 0.0 a reasonable choice? Will 0.0 make it easier to learn?
- treat_life_lost_as_terminal: This option would definitely make things much easier. Did the original paper use a similar setup?
Btw, you're not using the frame_skip parameter anywhere; there's just a magic number 4. You might want to fix that (rough sketch below).
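For reference, here's a hypothetical sketch (the `AtariPlayer` name, constructor arguments, and defaults are my own illustration, not code from this repo) of how these three settings are usually wired through the plain ALE Python bindings, with frame_skip as a real parameter instead of a hard-coded 4:

```python
# Hypothetical wrapper sketch -- class name, arguments, and defaults are
# illustrative only, not the actual code in this repo.
from ale_python_interface import ALEInterface


class AtariPlayer:
    def __init__(self, rom_path, frame_skip=4,
                 repeat_action_probability=0.0,
                 treat_life_lost_as_terminal=True):
        self.ale = ALEInterface()
        # Sticky actions: the ALE default is 0.25; the DQN-era setup assumes 0.0.
        # Must be set before loading the ROM.
        self.ale.setFloat(b'repeat_action_probability',
                          repeat_action_probability)
        self.ale.loadROM(rom_path)  # may need bytes, depending on bindings version
        self.frame_skip = frame_skip
        self.treat_life_lost_as_terminal = treat_life_lost_as_terminal

    def step(self, action):
        lives_before = self.ale.lives()
        reward = 0
        # Repeat the chosen action frame_skip times instead of hard-coding 4.
        for _ in range(self.frame_skip):
            reward += self.ale.act(action)
            if self.ale.game_over():
                break
        terminal = self.ale.game_over()
        if self.treat_life_lost_as_terminal:
            # Treat losing a life as the end of an episode for training,
            # without resetting the emulator until the real game-over.
            terminal = terminal or (self.ale.lives() < lives_before)
        return self.ale.getScreenGrayscale(), reward, terminal
```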
Great work!
I didn't check with the authors on my ALE settings. I guess they used the same settings as in their Nature DQN paper, so I'm mimicking those. I agree that these settings make learning easier.
Thanks. I just checked their alewrap, and treat_life_lost_as_terminal seems to be what they've always been using. I didn't find anything about repeat_action_probability, though.
repeat_action_probability was introduced recently (in ALE 0.5.0), after their DQN paper, so it should be turned off to reproduce their results. See the discussion below:
https://groups.google.com/forum/#!topic/deep-q-learning/p4FAIaabwlo
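In other words, to match the pre-0.5.0 behaviour their results were obtained with, sticky actions have to be disabled explicitly. A minimal sketch with the plain ALE Python bindings (the option name is the real ALE setting; the ROM path is a placeholder):

```python
# Illustrative only -- the option name is the real ALE setting;
# the ROM filename is a placeholder.
from ale_python_interface import ALEInterface

ale = ALEInterface()
# ALE 0.5.0+ defaults this to 0.25 (sticky actions). Set it to 0.0,
# before loading the ROM, to match the deterministic setup the original
# DQN results were reported on.
ale.setFloat(b'repeat_action_probability', 0.0)
ale.loadROM(b'breakout.bin')  # placeholder ROM path
```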
FYI, two recent papers 1, 2 by @mgbellemare (the author of ALE) confirm that these two options have always been in easy mode (as you did). He started using hard mode in those two papers.
Jumping in: in our latest paper [2 above] we found the life-loss signal to be detrimental. The repeat action probability affects the original DQN's performance significantly, but more recent algorithms (such as Double DQN or our own) don't suffer as much from it.