Success registration for cube transfer is questionable.
alexander-soare opened this issue · 2 comments
alexander-soare commented
This registers as a success, but it shouldn't, given that the task is to transfer the cube from one arm to another.
eval_episode_1.mp4
Here is the script for reproduction:
import gym_aloha
import gymnasium as gym
import numpy as np
import imageio
from lerobot.common.utils.io_utils import write_video
kwargs = {
"obs_type": "pixels_agent_pos",
"render_mode": "rgb_array",
"max_episode_steps": 400,
"visualization_width": 384,
"visualization_height": 384,
}
env = gym.make("gym_aloha/AlohaTransferCube-v0", disable_env_checker=True, **kwargs)
with open("actions.txt", "r") as f:
actions = [[float(s) for s in l.strip().split(" ")] for l in f.readlines()]
env.reset(seed=1001)
frames = [env.render()]
for action in actions:
env.step(np.array(action))
frames.append(env.render())
write_video("output.mp4", np.stack(frames), fps=50)
See the actions.txt file attached.
actions.txt
aliberts commented
As per the original code, this is expected behavior from the reward:
if touch_left_gripper and not touch_table: # successful transfer
reward = 4
Should we change the reward to something like this:
self.success = np.full(total_steps, False)
...
if touch_both_grippers and not touch_table:
self.success[current_step] = True
if self.success[-10:].all():
reward = 4
alexander-soare commented
@aliberts thanks for picking this up.
I was thinking more like:
# using a list fixes a bug, that you probs would have caught anyway
# calling it something other than "success" makes it clear that it is not the true success conditioh
self._transient_success = []
...
# most important change is the condition
if touch_left_gripper and not touch_right_gripper and not touch_table:
self._transient_success.append(True)
else:
self._transient_success.append(False)
# Use simulation time (here 0.5 seconds) rather than timesteps.
if all(self._transient_success[-int(0.5/dt):]):
reward = 4
success = True
The condition is all I really meant to change, but then I got carried away :P
Regarding the higher level question of backwards compatibility vs correctness, maybe we can have both versions somehow available. But in any case, in the interest of making robotics really work IRL, IMO we should aim for correctness always.