SPengLiang/DID-M3D

vis_depth = (-vis_depth).exp() ?

Closed this issue · 2 comments

Thanks for your nice work!!
But I'm still confused about why the visual depth is smoothed here, since this is not mentioned in the paper. May I ask why you handle it this way?

Thanks for your interest! You can refer to "Depth map prediction from a single image using a multi-scale deep network". (By the way, you can remove the minus sign in our code; it should be OK.) This smoothed depth comes from an early implementation of our work, and we did not pay much attention to this transformation. Actually, there are many other ways to transform the absolute depth, e.g., normalized depth with a scale factor (BTS: https://github.com/cleinc/bts), a pre-defined scale and shift (SMOKE: https://github.com/lzccccc/SMOKE), or disparity (see stereo-based works). Such methods should give similar final performance for monocular 3D detection.
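For readers comparing these options, here is a minimal sketch of the four decodings in plain Python (scalars rather than PyTorch tensors, so it runs without dependencies). All function names, parameter names, and numeric values are hypothetical and chosen for illustration only; they are not taken from DID-M3D, BTS, or SMOKE, and the sigmoid form is just one assumed way to realize a "normalized depth with scale factor".

```python
import math


def decode_exp(raw, negate=True):
    """Exponential decoding: the output is always positive.

    With negate=True this mirrors `(-vis_depth).exp()`; with negate=False
    the minus is dropped. The sign only changes whether the network
    learns -log(depth) or log(depth).
    """
    return math.exp(-raw if negate else raw)


def decode_scale_shift(raw, scale, shift):
    """Pre-defined scale and shift: depth = raw * scale + shift.

    `scale` and `shift` are fixed constants chosen ahead of training
    (hypothetical values in the usage below).
    """
    return raw * scale + shift


def decode_scaled_sigmoid(raw, max_depth):
    """Normalized depth with a scale factor (assumed sigmoid form).

    sigmoid(raw) lies in (0, 1) and is stretched to (0, max_depth).
    """
    return max_depth / (1.0 + math.exp(-raw))


def decode_disparity(disparity, focal_px, baseline_m):
    """Stereo-style decoding: depth = focal length * baseline / disparity."""
    return focal_px * baseline_m / disparity


if __name__ == "__main__":
    # Both exponential variants can represent the same depth; only the
    # sign of the raw value the network must produce differs.
    print(decode_exp(-math.log(10.0)))
    print(decode_exp(math.log(10.0), negate=False))
    # Illustrative constants only (assumptions, not from any repository).
    print(decode_scale_shift(0.0, scale=16.0, shift=28.0))
    print(decode_scaled_sigmoid(0.0, max_depth=80.0))
    print(decode_disparity(50.0, focal_px=700.0, baseline_m=0.5))
```

Each function is just a different way to map an unconstrained network output to a depth value, which is consistent with the point above that the choice should not matter much for the final result.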

Got it, thanks!!