facebookresearch/eft

A Potential "Bug" for weakProjection

Closed this issue · 2 comments

sicxu commented

Thanks for releasing the fitting code! I found that the implementation of weakProjection_gpu does not appear to be correct.
In simple terms, your implementation computes the projection as s*pts + t, whereas the original HMR/SPIN implementation implies that it should be s*(pts + t). With the latter form, the projection can be converted to a perspective projection by setting the camera translation to [tx, ty, 2*f/(s*img_res)].

https://github.com/MandyMo/pytorch_HMR/blob/7bf18d619aeafd97e9df7364e354cd7e9480966f/src/util.py#L117

Anyway, since this is a fitting procedure, the network's parameters will be updated to output the correct t under this projection.
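To make the difference between the two conventions concrete, here is a minimal NumPy sketch. It is not code from the eft, HMR, or SPIN repositories: the function names are hypothetical, the focal length and image resolution are illustrative values, and the perspective model assumes a principal point at the image centre with outputs normalized to [-1, 1]. The points are placed at zero depth so the match with perspective projection is exact; for points with non-zero depth it is only approximate.

```python
import numpy as np


def project_scale_then_shift(pts, s, t):
    """s * pts + t: the form the issue attributes to weakProjection_gpu."""
    return s * pts[..., :2] + t


def project_shift_then_scale(pts, s, t):
    """s * (pts + t): the HMR/SPIN-style form."""
    return s * (pts[..., :2] + t)


def perspective_normalized(pts, cam_t, focal_length, img_res):
    """Pinhole projection (assumed model), scaled so the image spans [-1, 1]."""
    p = pts + cam_t
    return 2.0 * focal_length * p[..., :2] / (img_res * p[..., 2:3])


def weak_to_perspective_translation(s, t, focal_length, img_res):
    """Camera translation [tx, ty, tz] with tz = 2*f / (s * img_res)."""
    tz = 2.0 * focal_length / (img_res * s + 1e-9)
    return np.array([t[0], t[1], tz])


rng = np.random.default_rng(0)
pts = rng.uniform(-1.0, 1.0, size=(24, 3))
pts[:, 2] = 0.0  # zero depth, so weak perspective and perspective agree exactly

s, t = 0.9, np.array([0.1, -0.2])
focal_length, img_res = 5000.0, 224  # illustrative values only

cam_t = weak_to_perspective_translation(s, t, focal_length, img_res)
persp = perspective_normalized(pts, cam_t, focal_length, img_res)
hmr = project_shift_then_scale(pts, s, t)
eft = project_scale_then_shift(pts, s, t)

# The two weak-perspective forms describe the same family of projections:
# s * pts + t equals s * (pts + t / s), so only the meaning of t changes.
eft_as_hmr = project_shift_then_scale(pts, s, t / s)
```

Under these assumptions, `persp` matches `hmr` but not `eft`, while `eft_as_hmr` reproduces `eft`. This is consistent with the point above that a fitted network can simply learn the t that suits whichever convention is used.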

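Hello, I am using the dataset labels produced by the authors, and I would like to render the mesh using their weak-perspective labels, but I have run into some problems. Could I ask you for advice? If so, please add me on WeChat. My WeChat ID is zzydddd. Thank you!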

The easiest way to convert it to a weak perspective of the form s*(pts + t) would be to compute the 2D image-space coordinates as target_2d = (scale * joint3d)[..., :2] + translation, take the x/y components of the original 3D joints as origin_2d = joint3d[..., :2], and then estimate the weak-perspective scale and translation by least squares:

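import numpy as np

# Center both point sets so that only the scale remains to be solved for.
tmp_o = origin_2d - origin_2d.mean(axis=0)
tmp_t = target_2d - target_2d.mean(axis=0)
# Least-squares scale for target_2d ≈ scale * (origin_2d + trans).
scale = (tmp_t * tmp_o).sum() / (tmp_o * tmp_o).sum()
trans = target_2d.mean(axis=0) / scale - origin_2d.mean(axis=0)
cam = np.zeros(3)  # [scale, tx, ty]
cam[0] = scale
cam[1:] = trans
# Optional: replace the scale with the perspective depth tz.
# cam[0] = 2 * focal_length / (img_res * cam[0] + 1e-9)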
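As a quick sanity check, the same fit can be wrapped in a function and run on synthetic data. This is only a sketch: the function name `fit_weak_perspective`, the random joints, and the chosen scale and translation are all made up for illustration. The targets are generated with the s*pts + t convention, so the recovered translation should come out as t / s.

```python
import numpy as np


def fit_weak_perspective(origin_2d, target_2d):
    """Least-squares fit of target_2d ≈ scale * (origin_2d + trans).

    Returns cam = [scale, tx, ty]. The name is hypothetical, not a library API.
    """
    o_mean = origin_2d.mean(axis=0)
    t_mean = target_2d.mean(axis=0)
    tmp_o = origin_2d - o_mean
    tmp_t = target_2d - t_mean
    scale = (tmp_t * tmp_o).sum() / (tmp_o * tmp_o).sum()
    trans = t_mean / scale - o_mean
    return np.concatenate([[scale], trans])


# Synthetic joints and a known s * pts + t projection (illustrative values).
rng = np.random.default_rng(0)
joint3d = rng.normal(size=(24, 3))
true_scale, true_trans = 0.8, np.array([0.05, -0.1])
target_2d = true_scale * joint3d[:, :2] + true_trans

cam = fit_weak_perspective(joint3d[:, :2], target_2d)

# Re-project with the s * (pts + t) convention using the fitted parameters.
reprojected = cam[0] * (joint3d[:, :2] + cam[1:])
```

With noise-free targets the fit recovers the scale exactly and `reprojected` coincides with `target_2d`. With noisy 2D targets it would give the least-squares estimate instead.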