Jeff-sjtu/HybrIK

Problem occurs when running the evaluation process on 3DPW

Closed this issue · 10 comments

I ran the evaluation as instructed: ./scripts/validate_smpl_cam.sh ./configs/256x192_adam_lr1e-3-hrw48_cam_2x_w_pw3d_3dhp.yaml ./pretrained_hrnet.pth. I used the pretrained model "hybrik_hrnet48_w3dpw.pth" as "pretrained_hrnet.pth", the same way as when running the demo (which worked). The evaluation script works well on Human3.6M but fails on 3DPW. Here is the log:

Namespace(cfg='./configs/256x192_adam_lr1e-3-hrw48_cam_2x_w_pw3d_3dhp.yaml', checkpoint='./pretrained_models/hybrik_hrnet.pth', gpus='0', batch=32, flip_test=True, flip_shift=False, rank=0, dist_url='tcp://127.0.1.1:23457', dist_backend='nccl', launcher='pytorch', world_size=1)
tcp://127.0.1.1:23457, ws:1, rank:0
Loading model from ./pretrained_models/hybrik_hrnet.pth...
##### Testing on 3DPW #####
  0%|          | 0/1110 [00:46<?, ?it/s]
Traceback (most recent call last):
  File "/home/xuyiwen/HybrIK/./scripts/validate_smpl_cam.py", line 229, in <module>
    main()
  File "/home/xuyiwen/HybrIK/./scripts/validate_smpl_cam.py", line 168, in main
    mp.spawn(main_worker, nprocs=ngpus_per_node, args=(opt, cfg))
  File "/home/xuyiwen/miniconda3/envs/dnd/lib/python3.9/site-packages/torch/multiprocessing/spawn.py", line 230, in spawn
    return start_processes(fn, args, nprocs, join, daemon, start_method='spawn')
  File "/home/xuyiwen/miniconda3/envs/dnd/lib/python3.9/site-packages/torch/multiprocessing/spawn.py", line 188, in start_processes
    while not context.join():
  File "/home/xuyiwen/miniconda3/envs/dnd/lib/python3.9/site-packages/torch/multiprocessing/spawn.py", line 150, in join
    raise ProcessRaisedException(msg, error_index, failed_process.pid)
torch.multiprocessing.spawn.ProcessRaisedException: 

-- Process 0 terminated with the following error:
Traceback (most recent call last):
  File "/home/xuyiwen/miniconda3/envs/dnd/lib/python3.9/site-packages/torch/multiprocessing/spawn.py", line 59, in _wrap
    fn(i, *args)
  File "/home/xuyiwen/HybrIK/scripts/validate_smpl_cam.py", line 216, in main_worker
    gt_tot_err = validate_gt(m, opt, cfg, gt_val_dataset_3dpw, heatmap_to_coord, opt.batch, test_vertice=True)
  File "/home/xuyiwen/HybrIK/scripts/validate_smpl_cam.py", line 94, in validate_gt
    gt_output = m.module.forward_gt_theta(gt_thetas, gt_betas)
  File "/home/xuyiwen/HybrIK/hybrik/models/HRNetWithCam.py", line 475, in forward_gt_theta
    output = self.smpl(
  File "/home/xuyiwen/miniconda3/envs/dnd/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1102, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/xuyiwen/HybrIK/hybrik/models/layers/smpl/SMPL.py", line 202, in forward
    vertices, joints, rot_mats, joints_from_verts_h36m = lbs(betas, full_pose, self.v_template,
  File "/home/xuyiwen/HybrIK/hybrik/models/layers/smpl/lbs.py", line 258, in lbs
    pose_offsets = torch.matmul(pose_feature, posedirs) \
RuntimeError: mat1 and mat2 shapes cannot be multiplied (32x639 and 207x20670)

@Jeff-sjtu @biansy000 Could you please help me out?

It is quite strange; I have never encountered this problem. I think it is caused by an incorrect dimension of gt_thetas in

gt_output = m.module.forward_gt_theta(gt_thetas, gt_betas)

Can you print the shape of gt_thetas when validating on Human3.6M and on 3DPW? As an easy alternative, you can set test_vertice=False to disable the PVE calculation.

Hey, thanks for your reply. It's (32, 216). And it works well when disabling the calculation of PVE.
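
The reported shape accounts for the 639 in the error message. Below is a minimal sketch of the arithmetic only. It assumes that lbs reads its input as axis-angle triples, and that the pose feature holds 9 entries per non-root joint (rotation matrix minus identity, as in the standard SMPL formulation). The latter is an assumption about lbs.py, not something shown in the traceback.

```python
# Shape arithmetic only; no tensors are needed.
NUM_JOINTS = 24          # SMPL joints, including the root
values_per_sample = 216  # reported shape of gt_thetas is (32, 216)

# 216 = 24 * 9, which is consistent with flattened 3x3 rotation matrices.
assert values_per_sample == NUM_JOINTS * 9

# If the 216 values are instead read as axis-angle triples,
# each sample appears to contain 72 "joints".
apparent_joints = values_per_sample // 3

# Assumed pose feature size: 9 entries per non-root joint.
wrong_feature_dim = (apparent_joints - 1) * 9
expected_feature_dim = (NUM_JOINTS - 1) * 9

print(wrong_feature_dim, expected_feature_dim)  # 639 207
```

Under these assumptions, the two numbers match the 32x639 and 207x20670 operands in the RuntimeError.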

I think you may try reshaping gt_thetas to be (32, 24, 9).

It makes no difference. In "lbs.py", the pose is eventually reshaped by rot_mats = batch_rodrigues(pose.view(-1, 3), dtype=dtype).view([batch_size, -1, 3, 3])

Inspired by your advice, I assumed that gt_thetas is already in rotation-matrix form. I commented out the conversion code and simply viewed gt_thetas as rotation matrices with rot_mats = pose.view(batch_size, -1, 3, 3). It works well and gives plausible evaluation results on 3DPW, close to the numbers in the paper.
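
A shape-only sketch of that change, using NumPy as a stand-in for the torch tensors so it runs without the model. The helper name pose_feature_from_rot_mats is hypothetical, the zero-filled posedirs only mirrors the (207, 20670) shape from the error message, and the pose feature definition (non-root rotations minus identity) is an assumption about lbs.py.

```python
import numpy as np

NUM_JOINTS = 24  # SMPL joints, including the root


def pose_feature_from_rot_mats(pose):
    """Hypothetical helper: pose holds flattened 3x3 rotation matrices."""
    batch_size = pose.shape[0]
    # Counterpart of rot_mats = pose.view(batch_size, -1, 3, 3) in the thread.
    rot_mats = pose.reshape(batch_size, -1, 3, 3)
    # Assumed pose feature: non-root rotations minus identity, flattened.
    ident = np.eye(3, dtype=pose.dtype)
    return (rot_mats[:, 1:] - ident).reshape(batch_size, -1)


# Identity rotations for every joint, flattened to the reported (32, 216) shape.
gt_thetas = np.tile(np.eye(3, dtype=np.float32).reshape(-1), (32, NUM_JOINTS))
# Zero-filled stand-in for posedirs with the shape from the error message.
posedirs = np.zeros((207, 20670), dtype=np.float32)

pose_feature = pose_feature_from_rot_mats(gt_thetas)
pose_offsets = pose_feature @ posedirs
print(pose_feature.shape, pose_offsets.shape)
```

With the input viewed as rotation matrices, the pose feature has 207 columns and the product with posedirs is well defined.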

Yes, that is the correct solution.

BTW, do you have any plans to share the pre-processed datasets or the pre-processing code for the D&D project?

Or could you please tell me the meaning of the dict keys in the pre-processed data, such as the difference between pose and thetas, or shape and betas? I am very interested in "D&D".

@avegetablechicken Hi, sorry for the late reply. pose and thetas have exactly the same meaning, as do shape and betas.

@avegetablechicken
Hello,
Are you using a single GPU for evaluation?
If so, is there anything in the script that needs to be modified?
Thanks for your reply.