facebookresearch/frankmocap

Question about reimplementation

lvZic opened this issue · 3 comments

lvZic commented

I tried to implement hand mesh reconstruction with the MANO model. The process is the same as in your work and in others, e.g. <3D Hand Shape and Pose from Images in the Wild>. However, I ran into two problems and need your help.

  1. The preprocessing for the 3D loss is not documented anywhere online, at least I couldn't find it. I tried the following: multiply gt_3d by 1e-3 to convert from mm to m, reorder the joints to match between MANO and the dataset, subtract the wrist position, and divide all the 3D positions by the middle finger MCP length. However, the training never converges, and the predicted joints are completely wrong, as shown below:
    image
    image
    I also tried mean/std normalization for both the ground-truth and predicted 3D joints, which seems to converge better. Can you tell me what I missed in the 3D loss calculation?

  2. I replaced the HMR network with MobileNet V3 and trained starting from the ImageNet-pretrained model. However, the loss becomes larger and the training converges poorly, while all the hyperparameters remain unchanged. I found that a MobileNet backbone can work for hand reconstruction <MobileHand: Real-time 3D Hand Shape and Pose Estimation from Color Image>. Can you tell me what I missed?
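
For reference, the preprocessing steps listed in item 1 can be sketched as below. This is only an illustration under stated assumptions, not the FrankMocap pipeline: the function name `normalize_joints`, the `order` permutation, the joint indices (wrist at 0, middle finger MCP at 9 after reordering), and the reading of "middle finger MCP length" as the wrist-to-MCP distance are all assumptions. Whatever convention is chosen, the same transform would need to be applied to both the ground truth and the predicted joints before computing the loss.

```python
import math

WRIST = 0        # assumed wrist index after reordering
MIDDLE_MCP = 9   # assumed middle-finger MCP index after reordering


def normalize_joints(joints_mm, order, eps=1e-8):
    """Return root-relative, scale-normalized 3D joints.

    joints_mm: list of (x, y, z) tuples in millimetres, in dataset order.
    order:     order[i] is the dataset index of the joint that should
               end up at target index i (hypothetical permutation).
    """
    # Reorder into the target joint order and convert mm -> m.
    joints = [[c * 1e-3 for c in joints_mm[j]] for j in order]

    # Make the joints root-relative by subtracting the wrist position.
    root = joints[WRIST]
    centered = [[c - r for c, r in zip(joint, root)] for joint in joints]

    # Scale by the wrist-to-middle-MCP distance (assumed reference length).
    ref_len = math.sqrt(sum(c * c for c in centered[MIDDLE_MCP]))
    if ref_len < eps:
        raise ValueError("degenerate reference bone length")
    return [[c / ref_len for c in joint] for joint in centered]


# Toy sample: 21 joints spaced 10 mm apart along x, identity reordering.
sample = [(100.0 + 10.0 * i, 50.0, 20.0) for i in range(21)]
normalized = normalize_joints(sample, list(range(21)))
```

After this transform the wrist sits at the origin and the reference bone has unit length, so the result no longer depends on the original unit or on hand size.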

@lvZic Thanks for your interest in our work.

  1. I think the bad results you showed here are not related to the backbone you use. There are probably some hidden bugs in your code.
  2. The data preprocessing pipeline you described looks OK to me. I would suggest that you carefully check each step and visualize the intermediate results to see whether there are any bugs. For training, you can refer to this repo. Although the algorithm differs, I believe the basic ideas of data preprocessing and model training are similar. A working codebase would be a good start for you.
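
One way to check each step is to assert simple invariants on a preprocessed sample before it reaches the loss. The sketch below is a hypothetical helper, not code from this repo: the name `check_joints`, the joint indices, and the tolerance are assumptions, and it presumes root-relative joints scaled so that the wrist-to-middle-MCP distance is 1.

```python
import math


def check_joints(joints, wrist=0, ref=9, tol=1e-4):
    """Return a list of problems found in one preprocessed joint set.

    joints: list of [x, y, z] lists, assumed root-relative and scaled
            so that the wrist-to-reference-joint distance is 1.
    """
    problems = []
    # NaN or inf usually points at a division by zero or a bad sample.
    if any(not math.isfinite(c) for joint in joints for c in joint):
        problems.append("non-finite coordinate")
        return problems
    # The wrist should be at the origin after root subtraction.
    if max(abs(c) for c in joints[wrist]) > tol:
        problems.append("wrist is not at the origin")
    # The reference bone should have unit length after scaling.
    ref_len = math.dist(joints[wrist], joints[ref])
    if abs(ref_len - 1.0) > tol:
        problems.append(f"reference bone length is {ref_len:.4f}, expected 1")
    return problems
```

Running the same check on both the ground truth and the network output makes it easier to see whether the two are in the same coordinate convention.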
lvZic commented

I fixed some bugs and found that the results look good when I use ResNet as the backbone. In contrast, training with MobileNet V3 never seems to converge, with all the hyperparameters unchanged. Are there particular strategies for training a small network such as MobileNet V3?

@lvZic
Thanks for sharing the updates. If ResNet converges while MobileNet does not, then it may be a hyperparameter tuning issue.