paulchhuang/bstro

DEMO Error: can't run ref_vertices.expand(batch_size, -1, -1)

Closed this issue · 10 comments

Hi, I really liked your paper and wanted to try out the demo. I ran the exact command from DEMO.md and got the following error:

Traceback (most recent call last):
  File "/home/oscar/Workspace/bstro/bstro/./metro/tools/demo_bstro.py", line 302, in <module>
    main(args)
  File "/home/oscar/Workspace/bstro/bstro/./metro/tools/demo_bstro.py", line 296, in main
    run_inference(args, _bstro_network, smpl, mesh_sampler)
  File "/home/oscar/Workspace/bstro/bstro/./metro/tools/demo_bstro.py", line 88, in run_inference
    _, _, pred_contact = BSTRO_model(images, smpl, mesh_sampler)
  File "/home/oscar/anaconda3/envs/bstro2/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1110, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/oscar/Workspace/bstro/bstro/metro/modeling/bert/modeling_bstro.py", line 203, in forward
    ref_vertices = ref_vertices.expand(batch_size, -1, -1)
RuntimeError: The expanded size of the tensor (1) must match the existing size (30) at non-singleton dimension 0.  Target sizes: [1, -1, -1].  Tensor sizes: [30, 431, 3]

My setup:
Python 3.10.5
PyTorch 1.11.0
torchvision 0.12.0
CUDA 11.3.1

Hi,

Unfortunately I didn't encounter the errors you describe. Before line 203, the tensor ref_vertices should have shape [1, 431, 3], which is then expanded to [batch_size, 431, 3] by ref_vertices = ref_vertices.expand(batch_size, -1, -1).

Tensor sizes: [30, 431, 3] suggests batch_size=30, but the current demo code supports only batch_size=1, which is puzzling.
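For illustration only (this is not the repo's code): torch.Tensor.expand can only broadcast size-1 dimensions, which is exactly what the error message is complaining about:

  import torch

  batch_size = 1

  # What the demo expects: a [1, 431, 3] template that broadcasts cleanly.
  ref_vertices = torch.zeros(1, 431, 3)
  print(ref_vertices.expand(batch_size, -1, -1).shape)  # torch.Size([1, 431, 3])

  # What the traceback reports: a [30, 431, 3] tensor. expand() cannot change
  # a non-singleton dimension, so this raises the same RuntimeError.
  broken = torch.zeros(30, 431, 3)
  try:
      broken.expand(batch_size, -1, -1)
  except RuntimeError as err:
      print(err)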

Aligning your dependency versions with the ones in the installation instructions may help.

Thanks!

Hi,
thanks for answering. I managed to make it run by forcing some dimensions to 1: there is a pseudo batch_size = 30 all over the repository, and when I set those to 1 everything worked well.

I have another question: for the demo we run BSTRO with these arguments:

  • --num_hidden_layers 4
  • --num_attention_heads 4

It seems that num_hidden_layers and num_attention_heads can go up to 12. Is it possible to change those arguments and still run the demo? Which architecture gave the best results in your benchmarks?
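For context, a generic BERT-style sanity check on those two values (purely illustrative; the hidden size below is an assumption, not necessarily the one BSTRO uses):

  # Each attention head gets hidden_size // num_attention_heads channels,
  # so the hidden size must be divisible by the head count; 4 and 12 both
  # satisfy this for a 768-dim encoder.
  hidden_size = 768  # assumed value for illustration
  for num_attention_heads in (4, 12):
      assert hidden_size % num_attention_heads == 0
      print(num_attention_heads, hidden_size // num_attention_heads)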

Hi,
Having the same issue here. I tried changing the batch_size parameters as you said, but I'm still getting the same error. Can you share your working fork?
Thanks

Hi,
For the "debatchification" look at the last two commits on my forked repo. Code is ugly but working for me, also I did not manage to display the outputs aside so I directly render them with trimesh.

https://github.com/oscarfossey/bstro
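A minimal sketch of that kind of trimesh rendering (the file name, score array, and 0.5 threshold are placeholders, not the exact code in the fork):

  import numpy as np
  import trimesh

  # Load the mesh written by the demo and fake a per-vertex contact score
  # (in the real pipeline this would come from the model's pred_contact).
  mesh = trimesh.load("contact_vis.obj", process=False)
  contact_scores = np.random.rand(len(mesh.vertices))  # placeholder values

  # Paint contact vertices red, everything else light grey, then open a viewer.
  colors = np.tile([200, 200, 200, 255], (len(mesh.vertices), 1)).astype(np.uint8)
  colors[contact_scores > 0.5] = [255, 0, 0, 255]
  mesh.visual.vertex_colors = colors
  mesh.show()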

Hi,
a few quick replies below:

  1. I got a chance to test the whole installation from scratch on another clean machine, following docs/INSTALL.md. The code still runs without hitting this pseudo batch_size = 30 error, so my best guess now is that it is an environment-dependent issue.
    I will try @oscarfossey's setup next time. Can @hosseinfeiz also share their setup?
    Until we have a generic solution, I'll add a FAQ entry pointing to @oscarfossey's reply above. Is this OK?
  2. I didn't experiment with different --num_hidden_layers and --num_attention_heads. These parameters are the same as the setup in METRO.

Hi,

OK for me, thanks for the precise answers.

And one point I forgot:
3. contact_vis.obj is the final visualization generated by the code. The image was made by loading contact_vis.obj in MeshLab, taking screenshots, and placing them side by side. I should make this clearer in the instructions.

qinb commented

I also met this problem. In my case, the reason is that the SMPL shapedirs dimension is 300 instead of 10.
So the easy fix is to change self.shapedirs.view(-1,10) to self.shapedirs[:, :, :10].view(-1,10) in https://github.com/paulchhuang/bstro/blob/main/metro/modeling/_smpl.py#L74
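A self-contained illustration of why 300 shape components turn into a phantom batch of 30 (vertex and coordinate counts taken from the standard SMPL template; this is not the repo's code):

  import torch

  shapedirs_300 = torch.zeros(6890, 3, 300)  # SMPL model with 300 shape components

  # Original line: view(-1, 10) silently folds the extra 290 components into
  # the leading dimension; 300 / 10 = 30 -> the "virtual" batch_size of 30.
  flat = shapedirs_300.view(-1, 10)
  print(flat.shape[0] // (6890 * 3))  # 30

  # @qinb's fix: keep only the first 10 shape components before flattening,
  # matching the 10-dim betas the demo code expects.
  fixed = shapedirs_300[:, :, :10].view(-1, 10)
  print(fixed.shape)  # torch.Size([20670, 10])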

Hi, I confirm I can reproduce the reported error with a 300-component shapedirs SMPL model, and @qinb's workaround solves the issue. In a nutshell, view(-1, 10) on a 300-component shapedirs yields a virtual batch_size of 30 = 300/10.

I followed the instructions in METRO and prepared the instructions in BSTRO the same way; I didn't expect users to re-use their existing SMPL model files. If @hosseinfeiz and @oscarfossey can confirm this addresses their issue, I can quickly push a hotfix. Big thanks to @qinb for the pointer!

qinb commented

@paulchhuang Hi, could you give some advice on your other repo? muelea/selfcontact#9
1. run_selfcontact_optimization.py runs very slowly. Besides decreasing the maxiter parameter, is there another way to speed it up?
2. ProHMR serves as the pose estimator; can the selfcontact repo be combined with ProHMR for joint training, for example with selfcontact acting as a loss supervisor?
Thanks again