Extremely poor validation results on self-trained checkpoints
Jessy-Huang opened this issue · 9 comments
Dear Tony,
Thank you for your excellent work on ALOHA. I have tried to reproduce your results in the MuJoCo simulation environment. Based on your open-source data, the success rate should be around 90% for transfer cube and around 50% for insertion.
I trained and evaluated on your open-source dataset and got around 54% for transfer cube and around 14% for insertion. These results are far below the expected numbers, even though I did not change any of the parameter settings. What could cause this gap, and is there anything I need to tune? The exact numbers can be viewed in the table linked below.
Here are my training parameter settings:
python3 imitate_episodes.py \
--task_name sim_transfer_cube_scripted \
--ckpt_dir ~/data/aloha/act/sim_transfer_cube_scripted/ckpt/ \
--policy_class ACT --kl_weight 10 --chunk_size 100 --hidden_dim 512 --batch_size 8 --dim_feedforward 3200 \
--num_epochs 2000 --lr 1e-5 \
--seed 0
Here are my eval parameter settings:
python3 imitate_episodes.py \
--task_name sim_transfer_cube_scripted \
--ckpt_dir ~/data/aloha/act/sim_transfer_cube_scripted/ckpt/ \
--policy_class ACT --kl_weight 10 --chunk_size 100 --hidden_dim 512 --batch_size 8 --dim_feedforward 3200 \
--num_epochs 2000 --lr 1e-5 \
--seed 0 \
--eval
The data obtained from training is available at the following link (a Lark document; you need to register to view it, and please leave a note when you request access):
https://iklxo6z9yv.feishu.cn/sheets/LRwosV4jnh7xokt8F8UcMT7snzh?from=from_copylink
I suspect the mujoco version is messing things up. I updated the requirements yesterday: 742c753
Could you try reinstalling these packages and evaluate the same checkpoints again? You would not need to retrain the policy.
I have changed the versions of mujoco and dm-control according to your instructions. The installed versions are as follows:
(aloha) ➜ act-main conda list | grep dm-control
dm-control 1.0.14 pypi_0 pypi
(aloha) ➜ act-main conda list | grep mujoco
mujoco 2.3.7 pypi_0 pypi
(aloha) ➜ act-main
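The same check can be scripted from inside the environment. This is a minimal sketch, not part of the repository: the helper names `installed_version` and `check_versions` are hypothetical, and the expected versions are simply the ones reported in this comment, assumed (not verified here) to match the updated requirements.

```python
from importlib import metadata


def installed_version(package):
    """Return the installed version string of `package`, or None if absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


# Versions reported in this thread after reinstalling (assumed to be the
# ones pinned by the updated requirements).
EXPECTED = {"mujoco": "2.3.7", "dm-control": "1.0.14"}


def check_versions(expected=EXPECTED):
    """Map each package name to (installed, expected, matches)."""
    report = {}
    for name, want in expected.items():
        have = installed_version(name)
        report[name] = (have, want, have == want)
    return report


if __name__ == "__main__":
    for name, (have, want, ok) in check_versions().items():
        status = "OK" if ok else "MISMATCH"
        print(f"{name}: installed={have} expected={want} [{status}]")
```

Running it inside the same conda environment used for evaluation rules out the case where the packages were reinstalled into a different environment.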
The parameters used for model inference are as follows:
python3 imitate_episodes.py \
--task_name sim_transfer_cube_scripted \
--ckpt_dir ~/data/aloha/act/sim_transfer_cube_scripted/ckpt/ \
--policy_class ACT --kl_weight 10 --chunk_size 100 --hidden_dim 512 --batch_size 8 --dim_feedforward 3200 \
--num_epochs 2000 --lr 1e-5 \
--seed 0 \
--eval
The inference result is as follows:
Success rate: 0.54
Average return: 318.9
Reward >= 0: 50/50 = 100.0%
Reward >= 1: 40/50 = 80.0%
Reward >= 2: 35/50 = 70.0%
Reward >= 3: 27/50 = 54.0%
Reward >= 4: 27/50 = 54.0%
policy_best.ckpt: success_rate=0.54 avg_return=318.9
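To make the log easier to read, here is a small sketch of how such a summary can be derived from per-episode rewards. It assumes, based on the numbers above, that an episode counts as a success only when it reaches the maximum reward of 4. The `highest_rewards` list is hypothetical: it is constructed only so that the cumulative counts match the log, and the function names are illustrative.

```python
from math import sqrt

# Hypothetical per-episode maximum rewards, chosen only so that the
# cumulative counts match the log above (50 episodes, max reward 4).
highest_rewards = [0] * 10 + [1] * 5 + [2] * 8 + [4] * 27


def threshold_counts(rewards, max_reward):
    """Count episodes whose best reward is at least r, for each r."""
    return {r: sum(x >= r for x in rewards) for r in range(max_reward + 1)}


def success_rate(rewards, max_reward):
    """Fraction of episodes that reached the maximum reward."""
    return sum(x == max_reward for x in rewards) / len(rewards)


def standard_error(p, n):
    """Binomial standard error of a proportion p estimated from n trials."""
    return sqrt(p * (1 - p) / n)


n = len(highest_rewards)
counts = threshold_counts(highest_rewards, 4)
rate = success_rate(highest_rewards, 4)
se = standard_error(rate, n)

for r, c in counts.items():
    print(f"Reward >= {r}: {c}/{n} = {100 * c / n:.1f}%")
print(f"Success rate: {rate:.2f} +/- {se:.2f} (one standard error)")
```

With 50 rollouts the standard error is roughly 7 percentage points, so the gap between 54% and the expected 90% is unlikely to be evaluation noise alone.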
Does the policy need to be retrained, or could something else be causing the poor result?
In addition, after watching all the failure videos, I found that every failure was caused by the right robot arm failing to grasp the cube, which then made the final handover of the cube fail. Is it necessary to add an object detection model during grasping so that the arm can grasp the object accurately? I have attached some of the failed grasp videos below.
video5.mp4
video7.mp4
I have the same problem. Have you solved it? Thanks.
@Jessy-Huang @z-yf17 Have you solved the problem?
I am experiencing a similar problem where the success rates are too low. How did you solve the problem? Any advice would be appreciated. Thank you!
You can try the following parameters; in my case the success rate is 100%.
python imitate_episodes.py --task_name sim_transfer_cube_scripted --ckpt_dir ckpt_dir_batchsize_16_epoch_4000_TG --policy_class ACT --kl_weight 10 --chunk_size 100 --hidden_dim 512 --batch_size 16 --dim_feedforward 3200 --num_epochs 4000 --lr 2e-5 --seed 0 --eval
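To see exactly what this suggestion changes relative to the original poster's eval command, the two flag sets can be compared. This is only an illustrative sketch: `parse_flags` and `diff_flags` are hypothetical helpers, not part of the repository, and both command strings are copied from this thread.

```python
import shlex


def parse_flags(command):
    """Parse '--name value' pairs; flags without a value map to True."""
    tokens = shlex.split(command)
    flags = {}
    i = 0
    while i < len(tokens):
        tok = tokens[i]
        if tok.startswith("--"):
            has_value = i + 1 < len(tokens) and not tokens[i + 1].startswith("--")
            flags[tok[2:]] = tokens[i + 1] if has_value else True
            i += 2 if has_value else 1
        else:
            i += 1
    return flags


def diff_flags(a, b):
    """Return {flag: (value_in_a, value_in_b)} for flags that differ."""
    return {k: (a.get(k), b.get(k))
            for k in sorted(set(a) | set(b)) if a.get(k) != b.get(k)}


original = (
    "python3 imitate_episodes.py --task_name sim_transfer_cube_scripted "
    "--ckpt_dir ~/data/aloha/act/sim_transfer_cube_scripted/ckpt/ "
    "--policy_class ACT --kl_weight 10 --chunk_size 100 --hidden_dim 512 "
    "--batch_size 8 --dim_feedforward 3200 --num_epochs 2000 --lr 1e-5 "
    "--seed 0 --eval"
)
suggested = (
    "python imitate_episodes.py --task_name sim_transfer_cube_scripted "
    "--ckpt_dir ckpt_dir_batchsize_16_epoch_4000_TG "
    "--policy_class ACT --kl_weight 10 --chunk_size 100 --hidden_dim 512 "
    "--batch_size 16 --dim_feedforward 3200 --num_epochs 4000 --lr 2e-5 "
    "--seed 0 --eval"
)

changes = diff_flags(parse_flags(original), parse_flags(suggested))
for name, (old, new) in changes.items():
    print(f"--{name}: {old} -> {new}")
```

Apart from the checkpoint directory, only the batch size, number of epochs, and learning rate differ, each doubled. Note that the suggested command includes `--eval`; as in the original poster's commands earlier in the thread, training would presumably use the same flags without it.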