devendrachaplot/Object-Goal-Navigation

Potential inconsistency in metric and reward computation

srama2512 opened this issue · 0 comments

Why does starting_distance in ObjectGoal_Env include the object_boundary distance?

self.starting_distance = self.gt_planner.fmm_dist[self.starting_loc] \
    / 20.0 + self.object_boundary
self.prev_distance = self.starting_distance

The shortest path should only need to reach the object boundary, not the object itself, right? This also propagates into the reward for action 0, since prev_distance includes the object_boundary term but curr_distance does not:

self.curr_distance = self.gt_planner.fmm_dist[curr_loc[0],
                                              curr_loc[1]] / 20.0
reward = (self.prev_distance - self.curr_distance) * \
    self.args.reward_coeff
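
To illustrate, here is a minimal sketch with hypothetical values (fmm_start, object_boundary, and reward_coeff are made up for the example): on the first reward computation after reset, the reward picks up an extra object_boundary * reward_coeff regardless of whether the agent made any progress, because the boundary term is only present in prev_distance.

# Minimal sketch with hypothetical values; assume the agent has not moved,
# so the fmm_dist term is the same at the start and current locations.
fmm_start = 100.0        # hypothetical fmm_dist value at the starting cell
object_boundary = 1.0    # hypothetical success boundary
reward_coeff = 0.1       # hypothetical args.reward_coeff

prev_distance = fmm_start / 20.0 + object_boundary   # as in the first snippet above
curr_distance = fmm_start / 20.0                      # boundary not added here
reward = (prev_distance - curr_distance) * reward_coeff
print(reward)  # 0.1 == object_boundary * reward_coeff, despite no progress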

The object_boundary should not be added to starting_distance (a sketch of the change is below). Happy to send a PR if this makes sense.
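
If that reading of the code is correct, the fix would roughly amount to dropping the boundary term (sketch only, not a tested patch):

self.starting_distance = self.gt_planner.fmm_dist[self.starting_loc] / 20.0
self.prev_distance = self.starting_distance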