Wrong depth scale when using ground-truth camera poses.
ootts opened this issue · 3 comments
Hi, I have a question about using ground-truth camera poses instead of predicted camera poses. I tried using camera poses with the correct (metric) scale on the KITTI dataset, but the resulting depth scale is still incorrect. Is there anything I missed? I only changed the code as follows:
```python
output, lowest_cost, costvol = encoder(input_color, lookup_frames,
                                       relative_poses,  # changed to relative_poses_gt
                                       K,
                                       invK,
                                       min_depth_bin, max_depth_bin)
```
Thanks a lot!
Hi - thanks for your interest in the project!
Right, yes. The problem with this is that the depth network is trained to be in the same scale as the pose network, which is some unknown, arbitrary scale.
I'm trying to think of a way to use the ground-truth poses to scale the depth estimates, but it isn't immediately obvious.
One way you could do it would be to have the depth and pose networks make predictions as normal, and afterwards scale your depths by the ratio of the ground-truth translation to the predicted translation. I can't guarantee this will give a good result, but I'd be interested to hear how you get on.
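The suggestion above could be sketched roughly as follows. This is a hypothetical helper, not part of the ManyDepth codebase; the function name `rescale_depth` and the toy inputs are assumptions for illustration:

```python
import numpy as np

def rescale_depth(depth_pred, t_pred, t_gt):
    """Scale a predicted depth map toward metric units using the ratio of
    ground-truth to predicted translation magnitudes (a heuristic only)."""
    # small epsilon guards against a near-zero predicted translation
    scale = np.linalg.norm(t_gt) / (np.linalg.norm(t_pred) + 1e-8)
    return depth_pred * scale

# toy example: the predicted baseline is half the true one,
# so the depths should roughly double
depth = np.full((2, 2), 5.0)
t_pred = np.array([0.0, 0.0, 0.4])
t_gt = np.array([0.0, 0.0, 0.8])
print(rescale_depth(depth, t_pred, t_gt))
```

Since monocular self-supervised depth and pose share one arbitrary scale, a single per-frame ratio like this is the simplest correction, though it inherits any error in the predicted translation.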
@mdfirman any thoughts?
I tried using the ground-truth poses and dropping the pose network in monodepth, and the output scale is almost correct (about 0.9 × gt_depth), so I assume this would work for ManyDepth too?
What I'm wondering is how to fine-tune a pretrained model, whose scale is arbitrary, to get real-world-scale results. In monodepth I scale the ground truth to the pretrained model's scale during training, and scale back at prediction time. I wonder if there's a better way to do this.
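The scale-and-scale-back workaround described above could look like the sketch below. The helper `median_scale` is a hypothetical name; the per-image median-scaling idea follows the common Eigen-split evaluation convention, and the calibration values are made up:

```python
import numpy as np

def median_scale(pred_depths, gt_depths):
    """Estimate one global ratio between the pretrained model's arbitrary
    depth scale and metric ground truth, via per-image median scaling."""
    ratios = [np.median(g) / np.median(p) for p, g in zip(pred_depths, gt_depths)]
    return float(np.median(ratios))

# toy calibration set: predictions are consistently 10x too small
preds = [np.full((2, 2), 0.5), np.full((2, 2), 0.25)]
gts = [np.full((2, 2), 5.0), np.full((2, 2), 2.5)]
s = median_scale(preds, gts)
print(s)  # 10.0

# at prediction time, multiply network output by s to recover metric depth;
# equivalently, divide ground truth by s when fine-tuning at the pretrained scale
metric_pred = preds[0] * s
```

Estimating the ratio once on a held-out calibration set, rather than per test image, keeps predictions consistent across frames.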
@biggiantpigeon Hi, have you tried using ground-truth poses in ManyDepth? I wonder whether it is feasible. Thank you!