rohanpsingh/LearningHumanoidWalking

jvrc_step warning

gaiyi7788 opened this issue · 4 comments

Thank you for sharing the code. However, I ran into a problem when trying to train the jvrc_step task.

I ran the following command and got this warning:

python run_experiment.py train --logdir logs --num_procs 1 --env jvrc_step

(_run_random_actions pid=50638) WARNING:absl:Too many contacts. Either the arena memory is full, or nconmax is specified and is exceeded. Increase arena memory allocation, or increase/remove nconmax. (ncon = 100) Time = 0.0000.

Could you help me understand why this happens? Thank you!

Hi @gaiyi7788

That's a bit strange. The warning indicates that there are some spurious contacts during training.
If it happens early on in training, one possible reason is that the computed normalization parameters are bad.
Could you try a higher value for the --num_procs argument?

Dear @rohanpsingh,

Thank you for your timely reply! I tried setting --num_procs to 8 and found that the training does work, but the WARNING was still there.

********** Iteration 0 ************
(sample pid=6690) WARNING:absl:Too many contacts. Either the arena memory is full, or nconmax is specified and is exceeded. Increase arena memory allocation, or increase/remove nconmax. (ncon = 100) Time = 0.0675.
(sample pid=6689) WARNING:absl:Too many contacts. Either the arena memory is full, or nconmax is specified and is exceeded. Increase arena memory allocation, or increase/remove nconmax. (ncon = 100) Time = 0.0775.
(sample pid=6691) WARNING:absl:Too many contacts. Either the arena memory is full, or nconmax is specified and is exceeded. Increase arena memory allocation, or increase/remove nconmax. (ncon = 100) Time = 0.2175.
(sample pid=6685) WARNING:absl:Too many contacts. Either the arena memory is full, or nconmax is specified and is exceeded. Increase arena memory allocation, or increase/remove nconmax. (ncon = 100) Time = 0.0475.
(sample pid=6686) WARNING:absl:Too many contacts. Either the arena memory is full, or nconmax is specified and is exceeded. Increase arena memory allocation, or increase/remove nconmax. (ncon = 100) Time = 0.1575.
(sample pid=6687) WARNING:absl:Too many contacts. Either the arena memory is full, or nconmax is specified and is exceeded. Increase arena memory allocation, or increase/remove nconmax. (ncon = 100) Time = 0.0900.
(sample pid=6686) WARNING:absl:Too many contacts. Either the arena memory is full, or nconmax is specified and is exceeded. Increase arena memory allocation, or increase/remove nconmax. (ncon = 100) Time = 0.3550.
(sample pid=6692) WARNING:absl:Too many contacts. Either the arena memory is full, or nconmax is specified and is exceeded. Increase arena memory allocation, or increase/remove nconmax. (ncon = 100) Time = 0.1325.
(sample pid=6690) WARNING:absl:Too many contacts. Either the arena memory is full, or nconmax is specified and is exceeded. Increase arena memory allocation, or increase/remove nconmax. (ncon = 100) Time = 0.1200.
(sample pid=6689) WARNING:absl:Too many contacts. Either the arena memory is full, or nconmax is specified and is exceeded. Increase arena memory allocation, or increase/remove nconmax. (ncon = 100) Time = 0.0200.
(sample pid=6685) WARNING:absl:Too many contacts. Either the arena memory is full, or nconmax is specified and is exceeded. Increase arena memory allocation, or increase/remove nconmax. (ncon = 100) Time = 0.1575.
(sample pid=6686) WARNING:absl:Too many contacts. Either the arena memory is full, or nconmax is specified and is exceeded. Increase arena memory allocation, or increase/remove nconmax. (ncon = 100) Time = 0.1775.
(sample pid=6687) WARNING:absl:Too many contacts. Either the arena memory is full, or nconmax is specified and is exceeded. Increase arena memory allocation, or increase/remove nconmax. (ncon = 100) Time = 0.0925.
(sample pid=6692) WARNING:absl:Too many contacts. Either the arena memory is full, or nconmax is specified and is exceeded. Increase arena memory allocation, or increase/remove nconmax. (ncon = 100) Time = 0.3725.
(sample pid=6690) WARNING:absl:Too many contacts. Either the arena memory is full, or nconmax is specified and is exceeded. Increase arena memory allocation, or increase/remove nconmax. (ncon = 100) Time = 0.2375.
(sample pid=6685) WARNING:absl:Too many contacts. Either the arena memory is full, or nconmax is specified and is exceeded. Increase arena memory allocation, or increase/remove nconmax. (ncon = 100) Time = 0.3475.
(sample pid=6689) WARNING:absl:Too many contacts. Either the arena memory is full, or nconmax is specified and is exceeded. Increase arena memory allocation, or increase/remove nconmax. (ncon = 100) Time = 0.1200.
(sample pid=6686) WARNING:absl:Too many contacts. Either the arena memory is full, or nconmax is specified and is exceeded. Increase arena memory allocation, or increase/remove nconmax. (ncon = 100) Time = 0.2650.
(sample pid=6690) WARNING:absl:Too many contacts. Either the arena memory is full, or nconmax is specified and is exceeded. Increase arena memory allocation, or increase/remove nconmax. (ncon = 100) Time = 0.1625.
(sample pid=6687) WARNING:absl:Too many contacts. Either the arena memory is full, or nconmax is specified and is exceeded. Increase arena memory allocation, or increase/remove nconmax. (ncon = 100) Time = 0.1050.
(sample pid=6689) WARNING:absl:Too many contacts. Either the arena memory is full, or nconmax is specified and is exceeded. Increase arena memory allocation, or increase/remove nconmax. (ncon = 100) Time = 0.3325.
(sample pid=6689) WARNING:absl:Too many contacts. Either the arena memory is full, or nconmax is specified and is exceeded. Increase arena memory allocation, or increase/remove nconmax. (ncon = 100) Time = 0.1350.
(sample pid=6689) WARNING:absl:Too many contacts. Either the arena memory is full, or nconmax is specified and is exceeded. Increase arena memory allocation, or increase/remove nconmax. (ncon = 100) Time = 0.0925.
/home/cpy/projects/LearningHumanoidWalking/rl/algos/ppo.py:332: UserWarning: Creating a tensor from a list of numpy.ndarrays is extremely slow. Please consider converting the list to a single numpy.ndarray with numpy.array() before converting to a tensor. (Triggered internally at ../torch/csrc/utils/tensor_new.cpp:230.)
observations, actions, returns, values = map(torch.Tensor, batch.get())
Sampling took 7.16s for 3200 steps.
Optimizer took: 1.10s
| Return (batch) | 15.234 |
| Mean Eplen | 40 |
| Actor loss | -0.0249 |
| Critic loss | 12.5 |
| Mirror loss | 0.000201 |
| Mean KL Div | 0.0116 |
| Mean Entropy | 0.0811 |
| Clip Fraction | 0.161 |
Total time elapsed: 8.43s. Total steps: 3200 (fps=379.40)

Some additional details:
(1) I can successfully train the model in the jvrc_walk env.
(2) I use the jvrc1.xml & scene.xml in https://github.com/rohanpsingh/jvrc_mj_description/tree/bfb2bf25966ddfe0d0248385ef7e2586963c86a0/xml for jvrc_step.
(3) I use the jvrc1_terrain.xml in https://github.com/rohanpsingh/jvrc_mj_description/blob/topic/rl_dev/xml/jvrc1_terrain.xml for jvrc_walk.
(4) My CPU is Intel(R) Core(TM) i7-8750H CPU @ 2.20GHz.

Thank you for your help!

Hi @gaiyi7788

Thanks. I was able to reproduce the issue. The warning appeared simply because, in recent changes, I had forgotten to override the nconmax attribute with a higher value (as the warning message suggests). I pushed commit 4302b38, which should fix this.
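
For anyone who wants to apply the same kind of change by hand, here is a minimal sketch of raising nconmax on a MuJoCo model XML before loading it. It is not necessarily what commit 4302b38 does. It assumes the limit lives on the `<size>` element of the MJCF file; the helper name `set_nconmax`, the toy model string, and the value 1000 are hypothetical and chosen only for illustration.

```python
import xml.etree.ElementTree as ET


def set_nconmax(mjcf_xml: str, nconmax: int) -> str:
    """Return a copy of the MJCF string with <size nconmax="..."> set.

    Hypothetical helper: creates the <size> element if the model lacks one.
    """
    root = ET.fromstring(mjcf_xml)
    size = root.find("size")
    if size is None:
        # <size> is a direct child of the top-level <mujoco> element.
        size = ET.SubElement(root, "size")
    size.set("nconmax", str(nconmax))
    return ET.tostring(root, encoding="unicode")


# Toy model standing in for the real robot XML (assumption, not the repo's file).
model_xml = '<mujoco model="demo"><size nconmax="100"/><worldbody/></mujoco>'
patched = set_nconmax(model_xml, 1000)  # 1000 is an arbitrary example value
print(patched)
```

The patched string can then be written back to disk or passed to whatever loader the environment uses. The right value depends on the scene; the warning should stop once the limit is above the number of contacts the step terrain actually generates.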

Let me know if it still doesn't work :)

This is useful for me! Thank you very much.