An issue with the path-following task
In the path-following task, the terrain has no collision attributes. Is this an asset-loading problem?
This shouldn't happen. Terrain is used the same way in all tasks. If it works with one, it should work with the others.
Could you share the exact command line you are running?
The command line is:
PYTHON_PATH phys_anim/train_agent.py +exp=path_follower motion_file=phys_anim/data/motions/smpl_humanoid_walk.npy +robot=smpl +backbone=isaacsim
I think there's a limitation in IsaacSim on triangle mesh sizes.
First, try running with the flag force_flat_terrain=True. This will create a flat terrain with the minimal number of triangles.
If this works, you can try editing the number of terrains (terrain.config.num_terrains=X). I believe 6 or 7 should work well.
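For example, roughly like this, appending either override to the same training command you ran above:

```bash
# First attempt: force a flat terrain with the minimal number of triangles
PYTHON_PATH phys_anim/train_agent.py +exp=path_follower \
    motion_file=phys_anim/data/motions/smpl_humanoid_walk.npy \
    +robot=smpl +backbone=isaacsim \
    force_flat_terrain=True

# If that works, go back to the generated terrain but with fewer sub-terrains
PYTHON_PATH phys_anim/train_agent.py +exp=path_follower \
    motion_file=phys_anim/data/motions/smpl_humanoid_walk.npy \
    +robot=smpl +backbone=isaacsim \
    terrain.config.num_terrains=6
```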
Let me know if this helps and if there are any additional issues.
Editing num_terrains really works, thank you very much! Besides, if I want to train a policy like PACER, can I still train using the path-following task and just change my data yaml?
Yes. But there are some differences to keep in mind and edit.
- Our path-following task extends PACER to support not just XY but also Z (height) conditioning. You can disable this by setting height_conditioned=False (see the example command after this list). Since AMP requires the data distribution to match the distribution of behaviors expected in the solution, if you want height conditioning to work nicely you should mix in not just walking motions but also crouch-walking and potentially crawling motions.
- PACER uses a "naive" path follower. You can enable this with use_naive_path_generator=True. By naive we mean that it ignores the correlation between height and speed; our non-naive generator limits the maximal speed based on the requested height.
- PACER has some additional regularization terms which we do not have, for example a symmetry loss for a more symmetric walking gait.
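As a rough sketch, both switches could be passed as command-line overrides together with the training command. I'm assuming here that they are exposed under the names written later in this thread; the exact config path may differ (for example, they may live under env.config), so verify it in the task config before relying on this:

```bash
# Assumed override names, taken from the flags mentioned later in this thread;
# check the task/env config for the exact path.
PYTHON_PATH phys_anim/train_agent.py +exp=path_follower \
    motion_file=phys_anim/data/motions/smpl_humanoid_walk.npy \
    +robot=smpl +backbone=isaacsim \
    height_conditioned=False \
    use_naive_path_generator=True
```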
Thanks for the reply! Now I'm using this command for training:
PYTHON_PATH phys_anim/train_agent.py +exp=path_follower motion_file=phys_anim/data/motions/smpl_humanoid_walk.npy +robot=smpl +backbone=isaacsim
I only changed batch_size to 1024; the other configs are the same as yours. However, after 5000 epochs the policy doesn't work. Is there a problem with my configs, or is the training simply not long enough?
Could you share the episode-length, task-reward, and jd-reward graphs from Weights & Biases?
Here's an example of what you would typically see for episode length with the height-conditioned PACER. It's not perfect after 5000 epochs, but it should successfully follow most of the paths.
The episode length starts near 0, which means the policy is mostly failing to track the path. Over time it becomes more successful and the average episode length (before termination) increases.
I suggest using Weights & Biases, with the flag +opt=wdb.
That will give you a way to live-track your training online.
You can then also directly share the run metrics.
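For example:

```bash
# Same training command, with Weights & Biases logging enabled
PYTHON_PATH phys_anim/train_agent.py +exp=path_follower \
    motion_file=phys_anim/data/motions/smpl_humanoid_walk.npy \
    +robot=smpl +backbone=isaacsim \
    +opt=wdb
```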
2024-10-31.19.17.10.mp4
This is the policy I trained on the flat plane with the path-follower task provided by this repository. The policy hops on one leg. How can I fix this?
I see the generated paths have changing heights. This setting typically requires more diverse data.
Can you try disabling it, as per this comment: #15 (comment)?
Set height_conditioned to false and the path generator to "naive" mode.
OK, I'll try it again according to #15 (comment). Can motion_file be the smpl_humanoid_walk.npy you provided?
2024-11-01.09.52.30.mp4
After setting "height_conditioned=False" and "use_naive_path_generator=True", the policy is like the one shown in the video. With no height reward, the robot crawls on the ground to track the path.
Here is what I think the issue is:
- AMP is a discriminative method. You can think of it as "solve the task I gave you in a style similar to the data distribution".
- If the data is very concentrated, for example a single forward-walk cycle, it's hard to generalize to new motions.
- Path following generates random speeds and direction changes.
What I'd suggest, if you want visually pleasing solutions, is to broaden the data distribution. The AMASS dataset contains a huge amount of walking/running/turning data. The policy can then learn to generalize much better. This is also what was done in the Trace & Pace paper.
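As a hypothetical illustration (the yaml path below is a placeholder, not a file shipped with the repo; you would build such a file yourself from the AMASS clips you prepare, following the repository's motion-yaml convention):

```bash
# Hypothetical: point motion_file at a yaml listing many AMASS
# walking/running/turning clips instead of the single walk motion
PYTHON_PATH phys_anim/train_agent.py +exp=path_follower \
    motion_file=phys_anim/data/motions/my_amass_locomotion.yaml \
    +robot=smpl +backbone=isaacsim
```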
I agree with you! Besides, I found an issue where the env doesn't reset when the robot falls down with height_conditioned=False, which leads to this poor policy.
During training, do you see the reported episode length at 300 from the start? That would indicate that's indeed the case.
From what I see, the policy should terminate if the character deviates from the path by more than 4 meters.
You can also turn on env.config.enable_height_termination=True. This should terminate the episode if the hands touch the floor.
It's disabled in the height-conditioned setting, as that requires crawling motions.
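For example:

```bash
# Same training command, with height-based termination enabled
PYTHON_PATH phys_anim/train_agent.py +exp=path_follower \
    motion_file=phys_anim/data/motions/smpl_humanoid_walk.npy \
    +robot=smpl +backbone=isaacsim \
    env.config.enable_height_termination=True
```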
As this seems to be a recurring question -- let me know what worked well and I will upload an additional experiment file for path_following without height conditioning.