TerenceCYJ/S2HAND

Question about GT and training parameters

Closed this issue · 4 comments

Hello Yujin,

Thank you for sharing the great work!

I'm confused about the generation of pseudo masks in ground truth and training.

  1. From this issue, some of the GT masks are not very accurate, and some are even completely black. How do you filter those out, and why are some of the segmented GT masks unacceptable?
  2. I noticed that in your code, you use textures whose shape is (faces.shape[0], faces.shape[1], texture_size, texture_size, texture_size, 3). What is the meaning of the three 'texture_size' dimensions? And would a larger texture_size produce better rendered RGB images?

Thanks!

Hi.

  1. The mask here is the rendered silhouette of the estimated mesh, so if the estimated mesh is not accurate, the mask will not be either.
  2. We use texture_size=1 for each mesh face. A more detailed texture per face (texture_size > 1) might give better results, but I am not sure whether it helps in this case.
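For context, in neural_renderer-style pipelines the texture tensor stores a small color cube per face, and the three texture_size axes are indexed by the triangle's three barycentric coordinates, so texture_size=1 means one flat color per face. A minimal sketch of that layout (the face count and color values here are illustrative, not taken from the S2HAND code):

```python
import numpy as np

batch_size = 1
num_faces = 1538      # e.g. the MANO hand mesh has 1538 faces (illustrative)
texture_size = 1      # one sample along each barycentric axis

# Texture layout used by neural_renderer-style renderers:
# (batch, faces, ts, ts, ts, RGB) -- the three ts axes are sampled
# by the triangle's three barycentric coordinates.
textures = np.full(
    (batch_size, num_faces, texture_size, texture_size, texture_size, 3),
    0.5,  # uniform gray for every face
    dtype=np.float32,
)

# With texture_size=1 every face is a single flat color; a larger
# texture_size would allow color variation within each face at the
# cost of a cubically growing tensor.
print(textures.shape)  # (1, 1538, 1, 1, 1, 3)
```

This also shows why increasing texture_size is not free: memory grows with texture_size**3 per face.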

@TerenceCYJ
Thanks for your quick reply!

For Q1, why not use an off-the-shelf silhouette separator to preprocess the GT? Some samples in this issue look weird, and using a correct foreground image would provide good supervision for the hand shape and reduce the impact of the background.

For Q2, do you think a 1×1×1 texture_size is enough for generating the rendered RGB image?

Hi.

  1. Yes, if an off-the-shelf "silhouette separator" is accurate, it would be helpful, but I find it hard to find one. Also, the rendered silhouette is not necessarily inaccurate: if the predicted 3D mesh is good (given the 2D keypoints as supervision), the silhouette will be accurate too.
  2. I think the 1×1×1 texture size for each mesh face is enough in our case.

Thanks! That clears things up for me.