NVlabs/ProtoMotions

Training full body tracker with h1

VineetTambe opened this issue · 3 comments

Hey Authors,

Thanks for making the code open source!
I am trying to train a full-body tracker policy for the Unitree H1, using the data provided in the repo.

This is the command I used to train the policy:

PYTHON_PATH phys_anim/train_agent.py +exp=h1_full_body_tracker +robot=h1_extended_hands +backbone=isaacgym motion_file=phys_anim/data/motions/h1_extended_hands_punch.npy

I want to evaluate how good this policy is. Is it possible to run eval on this policy, or do I have to train MaskedMimic to run eval?

I am trying to run the eval using the following command:

PYTHON_PATH phys_anim/eval_agent.py +robot=h1_extended_hands +backbone=isaacgym +motion_file=phys_anim/data/motions/h1_extended_hands_punch.npy +checkpoint=results/h1_full_body_tracker/last.ckpt +headless=False

The following is the mean_episode_length chart of the training:
[Image: Screenshot from 2024-10-29 10-48-42]

The eval command for visually evaluating the policy should be as you wrote it.
Did you encounter any issues running it?

If so, please add HYDRA_FULL_ERROR=1 and share the error report. I'll take a look and try to solve the issue.

When I run the eval command, I get a segmentation fault. I copy-pasted the exact command.

miniconda3/envs/MaskedMimic/lib/python3.8/site-packages/torch/functional.py:507: UserWarning: torch.meshgrid: in an upcoming release, it will be required to pass the indexing argument. (Triggered internally at ../aten/src/ATen/native/TensorShape.cpp:3549.)
  return _VF.meshgrid(tensors, **kwargs)  # type: ignore[attr-defined]
Not connected to PVD
+++ Using GPU PhysX
Physics Engine: PhysX
Physics Device: cuda:0
GPU Pipeline: enabled
Creating ground plane
Ground plane created
Creating 1 environments... ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 100% 0:00:00
/home/vineet/1x/ProtoMotions/phys_anim/utils/motion_lib.py:154: UserWarning: To copy construct from a tensor, it is recommended to use sourceTensor.clone().detach() or sourceTensor.clone().detach().requires_grad_(True), rather than torch.tensor(sourceTensor).
  torch.tensor(key_body_ids, dtype=torch.long, device=device),
Loading motions from yaml/npy file
Loading 1/1 motion files: phys_anim/data/motions/h1_extended_hands_punch.npy
Loaded 1 motions with a total length of 5.233s.
Loaded 1 sub motions with a total trainable length of 5.233s.
/home/vineet/miniconda3/envs/MaskedMimic/lib/python3.8/site-packages/torch/nn/init.py:452: UserWarning: Initializing zero-element tensors is a no-op
  warnings.warn("Initializing zero-element tensors is a no-op")
/home/vineet/miniconda3/envs/MaskedMimic/lib/python3.8/site-packages/torch/nn/modules/transformer.py:286: UserWarning: enable_nested_tensor is True, but self.use_nested_tensor is False because encoder_layer.self_attn.batch_first was not True(use batch_first for better inference performance)
  warnings.warn(f"enable_nested_tensor is True, but self.use_nested_tensor is False because {why_not_sparsity_fast_path}")
Loading model from checkpoint: /home/vineet/1x/ProtoMotions/results/h1_full_body_tracker/lightning_logs/version_1/last.ckpt
Segmentation fault (core dumped)

When I run with HYDRA_FULL_ERROR=1, it gets stuck on the line just before the seg fault.

It's stuck on loading the model? Are you able to debug step-by-step and see where exactly it fails?
I haven't seen this before, so would need some more info to help solve it.
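
One possible way to see where it fails, as a rough sketch rather than a confirmed fix: Python's standard-library `faulthandler` module can dump the Python traceback of every thread when the process receives a fatal signal such as SIGSEGV. The helper name `enable_crash_traceback` and the log file name `segfault_trace.log` below are hypothetical, and calling the helper near the top of the eval script is an assumption about where it would be most useful.

```python
import faulthandler

# Keep a module-level reference so the file stays open for the whole run;
# faulthandler writes to the file descriptor when the crash happens.
_trace_file = None


def enable_crash_traceback(log_path="segfault_trace.log"):
    """Dump Python tracebacks of all threads to log_path on a fatal signal.

    log_path is a hypothetical file name chosen for this sketch.
    Returns True if the fault handler is active afterwards.
    """
    global _trace_file
    _trace_file = open(log_path, "w")
    faulthandler.enable(file=_trace_file, all_threads=True)
    return faulthandler.is_enabled()


if __name__ == "__main__":
    # Assumed usage: call this once, before the checkpoint is loaded.
    print("fault handler enabled:", enable_crash_traceback())
```

If the segfault reproduces, the log file should show the last Python frames that were executing, which would indicate whether the crash happens inside the checkpoint loading call or somewhere after it. The same handler can also be switched on without code changes by running the interpreter with `-X faulthandler`. It only reports Python-level frames, so a crash deep inside native code may still need a native debugger to pin down.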