karfly/learnable-triangulation-pytorch

About features unprojection in vol method

agenthong opened this issue · 5 comments

It's a pretty nice work, and I have a question: why don't you directly use the heatmaps from the 2D backbone to unproject and build the volume?

@karfly Thanks for replying.

Hi, @agenthong.
We use a pretrained backbone, so the final layer of this backbone is specifically trained to predict heatmaps of 2D keypoint locations. V2V might not need such "specialized" heatmaps for unprojection, so to keep "richer" information about the input image we unproject features from the penultimate layer.
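For anyone curious what "unprojecting" a 2D map into a volume looks like in practice, here is a minimal PyTorch sketch of the general technique (not the repo's actual code; the function name, shapes, and the single-camera simplification are illustrative assumptions). Each voxel center is projected into the image with the camera matrix, and the feature map is sampled bilinearly at that location:

```python
import torch
import torch.nn.functional as F

def unproject_features(features, proj_matrix, grid_coords):
    """Unproject a 2D map into a 3D voxel volume for one camera (sketch).

    features:    (C, H, W) map -- penultimate-layer features or heatmaps
    proj_matrix: (3, 4) camera projection matrix
    grid_coords: (N, 3) world coordinates of the voxel centers
    Returns:     (N, C) features sampled at each voxel's 2D projection.
    """
    n = grid_coords.shape[0]
    # homogeneous voxel coordinates: (N, 4)
    pts = torch.cat([grid_coords, torch.ones(n, 1)], dim=1)
    # project to the image plane and dehomogenize: (N, 2)
    proj = pts @ proj_matrix.t()
    uv = proj[:, :2] / proj[:, 2:3].clamp(min=1e-6)
    # normalize pixel coordinates to [-1, 1] for grid_sample
    h, w = features.shape[1:]
    uv_norm = torch.empty_like(uv)
    uv_norm[:, 0] = 2.0 * uv[:, 0] / (w - 1) - 1.0
    uv_norm[:, 1] = 2.0 * uv[:, 1] / (h - 1) - 1.0
    # bilinear sampling: (1, C, 1, N) -> (N, C)
    sampled = F.grid_sample(
        features.unsqueeze(0),
        uv_norm.view(1, 1, n, 2),
        align_corners=True,
    )
    return sampled.view(features.shape[0], n).t()
```

With multiple cameras, the per-camera volumes are then aggregated (e.g. averaged or softmax-weighted) before being fed to the V2V network. The point of the thread is only what `features` holds: heatmaps from the final layer versus richer features from the penultimate one.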

Thank you for the great work!
Does the performance of the network get worse if the final layer is used instead?

To be honest, we didn't try. But it's common practice to extract features not from the last layer, but from an intermediate one.

If you try, please, let us know the results.
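A common way to try this yourself is a PyTorch forward hook, which captures an intermediate layer's output without modifying the backbone. A small sketch with a toy stand-in network (the layer sizes and 17-joint head are illustrative assumptions, not the repo's architecture):

```python
import torch
import torch.nn as nn

# Toy backbone standing in for a pose network: the last conv predicts
# heatmaps, while the layer before it carries richer features.
backbone = nn.Sequential(
    nn.Conv2d(3, 8, 3, padding=1),   # intermediate features
    nn.ReLU(),
    nn.Conv2d(8, 17, 1),             # heatmap head (17 joints, assumed)
)

captured = {}

def save_features(module, inputs, output):
    # stash the penultimate activation instead of the final heatmaps
    captured["features"] = output

# hook the layer *before* the heatmap head
backbone[1].register_forward_hook(save_features)

image = torch.randn(1, 3, 32, 32)
heatmaps = backbone(image)          # shape (1, 17, 32, 32)
features = captured["features"]     # shape (1, 8, 32, 32)
```

Swapping `features` for `heatmaps` in the unprojection step is then a one-line change, which makes the comparison below easy to reproduce.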


I've tried using heatmaps for unprojection. The result is 24.82305278541953, versus 24.695624140302613 when using features. Despite the richer information in the features, it seems there is no real impact from using heatmaps.