DepthAnything/Depth-Anything-V2

Why use max_depth in metric depth prediction?

1punch3coins opened this issue · 3 comments

depth = self.depth_head(features, patch_h, patch_w) * self.max_depth

Would this line turn the model into a relative depth predictor, given that max_depth is no more than a scale factor? Also, when fine-tuning on my own dataset, how do I choose the correct value? If I set max_depth per sample by computing each sample's maximum depth on the fly, what value should I use at inference?

You should pre-define a single global max_depth rather than adjusting it dynamically for each sample. Our output in metric depth estimation is in meters, which has a physical meaning. If you adjust max_depth for each sample, then self.depth_head(features, patch_h, patch_w) becomes a relative depth within that specific sample.
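To make the point above concrete, here is a minimal sketch in plain Python, not the repository's code. It assumes the head output is bounded to [0, 1], so that multiplying by max_depth yields meters. The constant MAX_DEPTH, the helper names, the toy depth values, and the "largest depth in the training set" heuristic are all hypothetical and only for illustration.

```python
# Illustrative sketch only: names and values are hypothetical,
# not taken from the Depth-Anything-V2 code base.

MAX_DEPTH = 20.0  # assumed global constant, in meters


def normalize_per_sample(depth_m):
    """Divide by this sample's own maximum: absolute scale is lost."""
    sample_max = max(depth_m)
    return [d / sample_max for d in depth_m]


def normalize_global(depth_m, max_depth=MAX_DEPTH):
    """Divide by one fixed constant: absolute scale is preserved."""
    return [d / max_depth for d in depth_m]


def to_meters(head_output, max_depth=MAX_DEPTH):
    """Mirror of `head_output * max_depth`, assuming outputs lie in [0, 1]."""
    return [v * max_depth for v in head_output]


def choose_global_max_depth(dataset_depths):
    """One possible heuristic (an assumption): largest depth in the training set."""
    return max(max(sample) for sample in dataset_depths)


# Two toy scenes with the same layout but different absolute distances (meters).
near_scene = [1.0, 2.0, 4.0]
far_scene = [5.0, 10.0, 20.0]

# Per-sample scaling: both scenes collapse to the same training target.
rel_near = normalize_per_sample(near_scene)
rel_far = normalize_per_sample(far_scene)

# Global scaling: the targets differ, so the scale can be learned.
glob_near = normalize_global(near_scene)
glob_far = normalize_global(far_scene)

# At inference the same constant maps head outputs back to meters.
recovered_far = to_meters(glob_far)

# The constant is picked once for the whole dataset, not per image.
dataset_max = choose_global_max_depth([near_scene, far_scene])
```

With per-sample scaling, rel_near and rel_far are identical, so the network has no signal telling it which scene is farther away. With the fixed constant, the targets differ, and the same constant can be reused at inference. If the head output really is bounded, depths beyond max_depth cannot be represented, so the constant should cover the range you expect in your data.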

Hi @LiheYoung,

I’d like more information about the max_depth parameter. I also have a question about interpreting depth values from relative versus metric depth models. Specifically, how can I verify the depth values from a metric depth model if I don’t have ground truth data?

Could you also provide guidance on what steps to follow if I want to train a model on a custom dataset (outdoor data)?

I found this guide useful; maybe it can help you: https://huggingface.co/blog/Isayoften/monocular-depth-estimation-guide