Question and suggestion on the reward function for the simulator
Closed this issue · 2 comments
tikisi commented
The simulator's reward function at the end of an episode differs from JetRacer's: the sign of the speed term is reversed. I think it would be more natural to make them consistent, as follows.
# JetRacer
return config.reward_reward_crash() - (config.reward_crash_reward_weight() * norm_throttle), done
# Simulator (current)
return config.reward_reward_crash() + config.reward_crash_reward_weight() * (self.speed / 18.0)
# Simulator (propose)
return config.reward_reward_crash() - config.reward_crash_reward_weight() * (self.speed / 18.0)
If there is a reason for this difference, I would like to know.
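To make the effect of the sign concrete, here is a minimal sketch comparing the two simulator variants. The constants `CRASH_REWARD` and `CRASH_REWARD_WEIGHT` are hypothetical stand-ins for `config.reward_reward_crash()` and `config.reward_crash_reward_weight()`; their values (a negative crash reward and a positive weight) are assumptions for illustration only. Only the `18.0` divisor comes from the snippets above.

```python
MAX_SPEED = 18.0  # normalization divisor taken from the simulator snippet

# Hypothetical values for illustration; the real ones are read from config.
CRASH_REWARD = -5.0
CRASH_REWARD_WEIGHT = 5.0


def crash_reward_current(speed):
    """Current simulator form: the speed term is added."""
    return CRASH_REWARD + CRASH_REWARD_WEIGHT * (speed / MAX_SPEED)


def crash_reward_proposed(speed):
    """Proposed form: the speed term is subtracted, as in the JetRacer snippet."""
    return CRASH_REWARD - CRASH_REWARD_WEIGHT * (speed / MAX_SPEED)


if __name__ == "__main__":
    for speed in (0.0, 9.0, 18.0):
        print(speed, crash_reward_current(speed), crash_reward_proposed(speed))
```

Under these assumed values, the current form makes a crash at higher speed less costly, while the proposed form makes it more costly, which is what the JetRacer expression does with `norm_throttle`.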
masato-ka commented
Thank you for your suggestion. Yes, your proposal looks correct. I will fix this bug in the next release.
masato-ka commented
Fixed in v1.7.1 (1.7.0).