real-stanford/universal_manipulation_interface

Difference of training configs

Dingry opened this issue · 0 comments

Hi, thank you for sharing your inspiring work. I am wondering what the primary policy difference is between train_diffusion_unet_timm_umi_workspace and train_diffusion_unet_image_workspace. In my understanding, both condition on visual and proprioceptive observations to predict robot actions. Aside from variations in training hyperparameters, are there any specific design features intended for the UMI task?
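
To make the comparison concrete, here is a minimal sketch for diffing the two workspace configs with OmegaConf. The file paths are an assumption based on the usual diffusion_policy config layout and may need adjusting for this repo:

```python
from omegaconf import OmegaConf

# Assumed paths -- adjust to wherever the workspace YAMLs
# actually live in your checkout of the repo.
cfg_umi = OmegaConf.load(
    "diffusion_policy/config/train_diffusion_unet_timm_umi_workspace.yaml")
cfg_img = OmegaConf.load(
    "diffusion_policy/config/train_diffusion_unet_image_workspace.yaml")

def diff(a, b, prefix=""):
    """Recursively print keys whose values differ between two dict-like configs."""
    for k in sorted(set(a.keys()) | set(b.keys())):
        path = f"{prefix}.{k}" if prefix else k
        va, vb = a.get(k), b.get(k)
        if isinstance(va, dict) and isinstance(vb, dict):
            diff(va, vb, path)
        elif va != vb:
            print(f"{path}: {va!r} != {vb!r}")

# resolve=False keeps ${...} Hydra interpolations as literal strings,
# so the diff works without a full Hydra context.
diff(OmegaConf.to_container(cfg_umi, resolve=False),
     OmegaConf.to_container(cfg_img, resolve=False))
```

This would surface differences beyond hyperparameters, e.g. in the policy, obs encoder, or task sections, which is what I am trying to understand.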