rail-berkeley/rlkit

Final Distance in Push

MianchuWang opened this issue · 4 comments

Hi Vitchyr,

When I use env.sample_goal() in the Push environment, it returns a dict that includes desired_goal. desired_goal is a 4-D array, where the first 2 numbers are the position of the hand and the last 2 numbers are the position of the puck.

When I use env.step(any_action), the returned state is a dict that includes achieved_goal, which has the same structure as the above-mentioned desired_goal.

My question is:
Final_distance = hand_distance + puck_distance = Euclidean(achieved_goal[0:2], desired_goal[0:2]) + Euclidean(achieved_goal[2:4], desired_goal[2:4])

Is the equation correct?
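
In code, the quantity I have in mind is the following (a minimal sketch assuming NumPy arrays; final_distance is my own name, not something from rlkit):

```python
import numpy as np

def final_distance(achieved_goal, desired_goal):
    # Goal layout: [hand_x, hand_y, puck_x, puck_y], as described above.
    hand_distance = np.linalg.norm(achieved_goal[0:2] - desired_goal[0:2])
    puck_distance = np.linalg.norm(achieved_goal[2:4] - desired_goal[2:4])
    return hand_distance + puck_distance
```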

I'm sorry, but I couldn't find a similar snippet in your implementation, so I'm asking here.

Thank you

Hi Mianchu,

Which "final_distance" are you referring to? The env returns both the hand_distance and the puck_distance. If you're referring to the metric used in e.g. the Skew-Fit paper, we report just the puck distance, since the hand distance is pretty easy to optimize.

Vitchyr

Hi Vitchyr,

Thanks for your reply. I'm referring to the "Final Distance to Goal" in RIG, i.e., the y-axis in Figure 3.

I'm sorry that I closed the issue by accident.

Mianchu

For RIG I believe we actually just reported Euclidean(achieved_goal, desired_goal) and didn't sum the hand and puck distances separately. Qualitatively, the results are the same.
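
In code, the difference is just where the Euclidean norm is taken (a sketch assuming NumPy arrays in the [hand, puck] layout discussed above; the function names are illustrative, not rlkit API):

```python
import numpy as np

def rig_reported_distance(achieved_goal, desired_goal):
    # Single Euclidean norm over the full 4-D goal vector.
    return np.linalg.norm(achieved_goal - desired_goal)

def decomposed_distance(achieved_goal, desired_goal):
    # Sum of the separate hand and puck Euclidean distances (your equation).
    return (np.linalg.norm(achieved_goal[:2] - desired_goal[:2])
            + np.linalg.norm(achieved_goal[2:] - desired_goal[2:]))
```

The two are not equal in general: by the triangle inequality, the full-vector norm is never larger than the sum of the two sub-norms, which is why the results only match qualitatively.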

Thank you, that solves my problem!