Can-no opened this issue 2 years ago · 0 comments
I don't understand why episode_rewards is negative when running ant_v3 with PPO.
Besides, there are some parameter definitions that I don't quite understand, such as --gail.
Can anyone help explain this? Thank you very much.