How to play all stages in `SuperMarioBros-v0` but using models trained in separate stages?
terryzhao127 opened this issue · 1 comments
Is your feature request related to a problem? Please describe.
I've trained models to play through the game. However, there are 32 models, one for each of the 32 separate stages, so I cannot let the trained agent play in `SuperMarioBros-v0` from the first stage to the last. In other words, is there any way to make the trained agent play in an environment with no skipped frames, where the frames that would otherwise be skipped are indicated by `info`? Then a complete game could be rendered from the start screen to the last frame, in which Mario saves the princess.
Describe the solution you'd like
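The modified environment should be run like this:

    state = env.reset()
    info = {}  # nothing to inspect before the first step
    while True:
        if not should_skip(info):
            action = model(state)  # normal frame: let the model act
        else:
            action = 0  # NOOP on a frame that would otherwise be skipped
        state, reward, done, info = env.step(action)
        env.render()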
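To make the loop above concrete, here is a minimal sketch that can be run without an emulator. It rests on several assumptions: the `skip` key in `info` is hypothetical (the stock environment does not provide it, and the modified environment would have to add it), action `0` is taken to be the no-op as in the loop above, and `FakeEnv` is only a stand-in that imitates the classic Gym 4-tuple `step` return.

```python
from typing import Any, Callable, Dict, List, Tuple

NOOP = 0  # assumption: action index 0 is the no-op, as in the loop above


def should_skip(info: Dict[str, Any]) -> bool:
    """Hypothetical predicate: reads a 'skip' flag that the modified
    environment would have to add to info (not part of the stock env)."""
    return bool(info.get("skip", False))


def play(env, model: Callable[[Any], int], max_steps: int = 10_000) -> List[int]:
    """Drive env until done, sending NOOP on frames flagged as skippable.

    Uses the classic Gym 4-tuple step API, matching the loop above.
    Returns the list of actions that were sent.
    """
    state = env.reset()
    info: Dict[str, Any] = {}  # nothing to inspect before the first step
    actions: List[int] = []
    for _ in range(max_steps):
        action = NOOP if should_skip(info) else model(state)
        state, reward, done, info = env.step(action)
        actions.append(action)
        if done:
            break
    return actions


class FakeEnv:
    """Stand-in environment for trying the loop without an emulator.
    After its 2nd and 3rd steps it marks the next frame as skippable,
    and it ends the episode after 6 steps."""

    def __init__(self) -> None:
        self.t = 0

    def reset(self) -> int:
        self.t = 0
        return self.t

    def step(self, action: int) -> Tuple[int, float, bool, Dict[str, Any]]:
        self.t += 1
        info = {"skip": self.t in (2, 3)}
        return self.t, 0.0, self.t >= 6, info
```

With a real environment, `env.render()` would go right after `env.step(action)` inside `play`. The sketch leaves it out so that it runs headless.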
Describe alternatives you've considered
Actually I've tried two ways to solve the problem:
- Add a `_should_skip_during_steps` function to `smb_env.py`, which is called by `_get_info`.

  In this method, the agent does `NOOP` every time a `should-be-skipped` step is found. `_should_skip_during_steps` is implemented using the skip conditions located in the `self._skip_xxx()` functions. However, the `NOOP` skip does not behave the same way as the `self._skip_xxx()` functions do. More frames are skipped, so the first frame after the skipped frames is not exactly the same as the one returned by a plain `SuperMarioBros-v0` environment. This seemingly trivial difference is non-trivial to the reinforcement learning model, which leads to failure in the gameplay afterwards.
- Ignore the skipped frames. Just play in `SuperMarioBros-v0`, but change the loaded model every time a new stage is detected in `info`.

  In this method, the agent starts to fail after entering `1-2` from `1-1`. Apparently the beginning state in `SuperMarioBros-1-2-v0` is not the same as the one returned by `SuperMarioBros-v0` after the skipped frames.
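For reference, the model switching in the second approach could be organised roughly as below. This is only a sketch under assumptions: it presumes `info` carries integer `world` and `stage` entries, and `load_policy` is a hypothetical user-supplied function that returns the trained model for a given stage. It only handles model selection and does not address the mismatch in the starting state described above.

```python
from typing import Any, Callable, Dict, Tuple

StageKey = Tuple[int, int]  # (world, stage), e.g. (1, 2) for stage 1-2
Policy = Callable[[Any], int]


class StageRouter:
    """Selects the per-stage policy that matches the current info dict.

    `load_policy` is a hypothetical user-supplied function that returns
    the trained model for a given (world, stage); loaded models are cached.
    """

    def __init__(self, load_policy: Callable[[StageKey], Policy],
                 start: StageKey = (1, 1)) -> None:
        self._load_policy = load_policy
        self._cache: Dict[StageKey, Policy] = {}
        self.current: StageKey = start

    def update(self, info: Dict[str, Any]) -> bool:
        """Record the stage reported by info; return True if it changed.
        Missing keys fall back to the stage that is already current."""
        key = (int(info.get("world", self.current[0])),
               int(info.get("stage", self.current[1])))
        changed = key != self.current
        self.current = key
        return changed

    def act(self, state: Any) -> int:
        """Return the action chosen by the current stage's policy,
        loading that policy on first use."""
        if self.current not in self._cache:
            self._cache[self.current] = self._load_policy(self.current)
        return self._cache[self.current](state)
```

In the play loop, `router.update(info)` would be called after every `env.step(...)` and `router.act(state)` would replace the direct `model(state)` call.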
Can anyone help me run the code on Windows 10? I am unable to run it. Which file is the main one? I have satisfied all the requirements!