cc-ai/climategan

Improve depth learning + evaluation

vict0rsch opened this issue · 3 comments

https://arxiv.org/pdf/2003.06620.pdf

Let's use this issue's comments to debrief: interesting stuff, questions, etc.

What we can keep in mind from some of the models presented in this overview:
The authors explicitly discourage using the depth prediction model at a higher resolution than the one it was trained on.
Also, it seems a lot of them just benchmark on the NYU Depth dataset or the KITTI dataset, but in our case it's very important to have a model that's robust across different types of images.
Hence, we'll focus on MiDaS for the moment.

We implemented a loss with gradient-matching (GM) + scale-invariant MSE terms inspired by the MiDaS paper.
Actually, MiDaS uses 4 scale levels for the scale-invariant GM term, halving the image resolution at each level. Maybe I should implement that, @vict0rsch? A rough sketch is below.
The thing is that the original ground-truth depth maps were computed on a resized input image and then upscaled; I wonder whether that could have any impact on a multi-scale GM loss.
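For reference, here is a minimal sketch of what that multi-scale version could look like (function names and the `num_scales` / `gm_weight` defaults are placeholders, not what's in the repo; MiDaS also aligns scale/shift on disparity before computing these terms, which is skipped here):

```python
import torch
import torch.nn.functional as F


def scale_invariant_mse(pred, target):
    # Scale-invariant MSE in log space (Eigen-style):
    # MSE of the log-difference minus the squared mean of the log-difference.
    d = torch.log(pred.clamp(min=1e-6)) - torch.log(target.clamp(min=1e-6))
    return (d ** 2).mean() - d.mean() ** 2


def gradient_matching(pred, target):
    # L1 penalty on the spatial gradients of the prediction/target difference.
    diff = pred - target
    grad_x = (diff[..., :, 1:] - diff[..., :, :-1]).abs().mean()
    grad_y = (diff[..., 1:, :] - diff[..., :-1, :]).abs().mean()
    return grad_x + grad_y


def multiscale_depth_loss(pred, target, num_scales=4, gm_weight=0.5):
    # pred, target: (B, 1, H, W) depth maps.
    loss = scale_invariant_mse(pred, target)
    gm = 0.0
    for s in range(num_scales):
        if s > 0:
            # Halve the resolution at each scale level, as in MiDaS.
            pred = F.interpolate(pred, scale_factor=0.5, mode="bilinear", align_corners=False)
            target = F.interpolate(target, scale_factor=0.5, mode="bilinear", align_corners=False)
        gm = gm + gradient_matching(pred, target)
    return loss + gm_weight * gm
```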

Just to keep this reference somewhere: DADA uses the reverse Huber (berHu) loss.
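For the record, berHu is L1 below a threshold c and quadratic above it, with c usually set to a fraction of the maximum absolute error in the batch. A minimal sketch (the 0.2 ratio is the common convention, not necessarily DADA's exact setting):

```python
import torch


def berhu_loss(pred, target, threshold_ratio=0.2):
    # Reverse Huber (berHu): L1 below threshold c, quadratic above it.
    diff = (pred - target).abs()
    c = threshold_ratio * diff.max().detach()
    quadratic = (diff ** 2 + c ** 2) / (2 * c + 1e-8)
    return torch.where(diff <= c, diff, quadratic).mean()
```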

Architectures:
MegaDepth: hourglass network from this paper, composed of modified Inception modules
MiDaS: ResNet-based architecture from this paper
DADA: residual auxiliary block, where encoded features (before the depth pooling) are decoded by a convolutional layer and fused with the backbone features (rough sketch below)
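To make the DADA fusion idea concrete, here is a rough sketch of such a block (class name, channel handling, and additive fusion are my assumptions, not the paper's exact configuration):

```python
import torch
import torch.nn as nn


class ResidualAuxiliaryBlock(nn.Module):
    # Sketch of a DADA-style residual auxiliary block: encoder features are
    # passed through an auxiliary conv "decoder" whose output is fused back
    # into the backbone features, with a 1x1 head for the depth prediction.
    def __init__(self, in_channels, backbone_channels):
        super().__init__()
        self.aux_decoder = nn.Sequential(
            nn.Conv2d(in_channels, backbone_channels, kernel_size=3, padding=1),
            nn.ReLU(inplace=True),
        )
        self.depth_head = nn.Conv2d(backbone_channels, 1, kernel_size=1)

    def forward(self, encoder_feats, backbone_feats):
        aux = self.aux_decoder(encoder_feats)
        depth = self.depth_head(aux)   # auxiliary depth prediction
        fused = backbone_feats + aux   # residual fusion with the backbone
        return fused, depth
```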