pranz24/pytorch-soft-actor-critic

Normalized Actions has bugs

Closed this issue · 3 comments

One should be careful when uncommenting the `NormalizedActions` wrapper: you have to make sure `_reverse_action()` is actually called, and `_max_episode_steps` has a typo (it is defined as a method, but it should be an attribute). Otherwise the following line in main.py does not work: `mask = 1 if episode_steps == env._max_episode_steps else float(not done)`
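To illustrate the fix, here is a minimal sketch of an action-normalizing wrapper with `_max_episode_steps` as a plain attribute. The `Box` and `DummyEnv` classes below are illustrative stand-ins for `gym.spaces.Box` and a real gym environment, so the example is self-contained:

```python
class Box:
    # Stand-in for gym.spaces.Box (illustrative only)
    def __init__(self, low, high):
        self.low, self.high = low, high

class DummyEnv:
    # Minimal stand-in for a gym env with a continuous action space
    def __init__(self):
        self.action_space = Box(low=-2.0, high=2.0)
        self._max_episode_steps = 1000  # attribute, NOT a method

class NormalizedActions:
    # Sketch: the agent acts in [-1, 1]; the wrapper rescales to
    # [low, high] before stepping the env, and back for replayed actions.
    def __init__(self, env):
        self.env = env
        self.action_space = env.action_space
        # Forward as an attribute so `env._max_episode_steps` (no call) works
        self._max_episode_steps = env._max_episode_steps

    def action(self, action):
        # Map [-1, 1] -> [low, high]
        low, high = self.action_space.low, self.action_space.high
        return low + (action + 1.0) * 0.5 * (high - low)

    def reverse_action(self, action):
        # Inverse map: [low, high] -> [-1, 1]
        low, high = self.action_space.low, self.action_space.high
        return 2.0 * (action - low) / (high - low) - 1.0

env = NormalizedActions(DummyEnv())
print(env.action(1.0))          # -> 2.0 (upper bound of the env's space)
print(env.reverse_action(2.0))  # -> 1.0
print(env._max_episode_steps)   # attribute access, as main.py expects
```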

This small bug caused a lot of headaches but the repo is super nice otherwise!

True

The easiest way to use normalized actions would be to scale the actions directly by a factor of `env.action_space.high[0]`,
as is done in these two repos:
https://github.com/sfujim/TD3
https://github.com/openai/spinningup/tree/master/spinup/algos/sac
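A hedged sketch of that direct-scaling approach: the policy emits a tanh-squashed value in [-1, 1], and the bound is multiplied in at action-selection time instead of wrapping the env. The function name `select_action` and the hard-coded bound are illustrative (in a real env the bound would be `env.action_space.high[0]`):

```python
import math

def select_action(policy_output, max_action):
    # TD3/Spinning-Up style: squash the raw policy output into [-1, 1]
    # with tanh, then scale by the action bound.
    return max_action * math.tanh(policy_output)

max_action = 2.0  # stands in for env.action_space.high[0]
print(select_action(0.0, max_action))   # -> 0.0 (tanh(0) = 0)
print(select_action(100.0, max_action)) # saturates near +2.0
```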

And yes, `_max_episode_steps` is not part of `gym.ActionWrapper` (I don't understand why I used it there).
You can check how `_max_episode_steps` works here:
https://github.com/openai/gym/blob/85a5372a19c0f35db2410e586cc9a32c4d94bf1a/gym/wrappers/time_limit.py
https://github.com/openai/gym/blob/239aaf14ce804c9ce5068bfb69590110ea8ef1be/gym/envs/registration.py
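For reference, gym's `TimeLimit` wrapper stores the step budget (taken from the registration spec) as the attribute `_max_episode_steps` and truncates episodes that reach it. A minimal, self-contained sketch of that mechanism (the `DummyEnv` is a stand-in for a real env):

```python
class DummyEnv:
    # Trivial stand-in env that never terminates on its own
    def reset(self):
        return 0
    def step(self, action):
        return 0, 0.0, False, {}

class TimeLimit:
    # Sketch of gym's TimeLimit: counts steps and forces done=True
    # once _max_episode_steps is reached.
    def __init__(self, env, max_episode_steps):
        self.env = env
        self._max_episode_steps = max_episode_steps  # attribute, not a method
        self._elapsed_steps = 0

    def reset(self):
        self._elapsed_steps = 0
        return self.env.reset()

    def step(self, action):
        obs, reward, done, info = self.env.step(action)
        self._elapsed_steps += 1
        if self._elapsed_steps >= self._max_episode_steps:
            info["TimeLimit.truncated"] = not done
            done = True
        return obs, reward, done, info

env = TimeLimit(DummyEnv(), max_episode_steps=3)
env.reset()
for episode_steps in range(1, 4):
    _, _, done, _ = env.step(0)
# The mask line from main.py then distinguishes timeout from true termination:
mask = 1 if episode_steps == env._max_episode_steps else float(not done)
print(done, mask)  # -> True 1
```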

Thanks a lot! :-)