Inconsistent results between online evaluation and test (might be caused by the benchmark crop operation)

Question

xuanlongORZ opened this issue 3 years ago · 3 comments

Hi there, I really like this project; it is very complete and concrete.
However, I found that online eval and test handle the benchmark_crop operation differently, which makes the online eval result differ from the result produced by the test code. The online eval result is much better.

Online eval: the benchmark crop is applied to both the image and the gt in dataloader.py, so the effective areas of depth_pred and gt are consistent. The additional benchmark crop operation in, for example, bts_main.py is therefore redundant.

Test: the benchmark crop is applied only to the image in dataloader.py. In test.py, you place depth_pred on a zero map as large as gt, so the effective areas of depth_pred and gt are inconsistent (although their sizes are the same).

I notice that the AdaBins GitHub repo, which uses and refers to your code, removes this ambiguity. To be honest, I think your Eigen-split result is underestimated.

Looking forward to your response.

Answer 1 · 2022-01-11T12:05:08.000Z

Hi Xuanlong Yu,

Thanks for your kind and thoughtful comment. We had already noticed the issue and have just started looking into it because of your comment. I will let you know if we have any progress to share. BTW, you are so kind. Thanks again.

Best regards,
Jin Han

Answer 2 · 2022-01-11T16:22:36.000Z

Thank you for your quick reply!
To clarify this point:

Test: Only a benchmark crop for the image in dataloader.py, in test.py, you put depth_pred on a zero map as large as gt, so the effective areas of depth_pred and gt are inconsistent (although the size is the same).

What I meant was:
First, to cite the steps you provide: 1. run bts_test.py 2. run eval_with_pngs.py.
In bts_test.py, depth_pred has the same size as the cropped image (the crop is done in bts_dataloader.py because the mode is 'test'). In eval_with_pngs.py, you place depth_pred on a zero map as large as gt (the original gt, without any cropping), so the effective areas of depth_pred and gt are inconsistent (although their sizes are the same).

By comparison, the evaluation code in the AdaBins repo, evaluate.py, runs the evaluation directly with mode = 'online_eval' (see line 209). So in dataloader.py, both the gt and the RGB image (and therefore pred_depth) are cropped with benchmark_crop, as you do in the online_eval path of bts_main.py.
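
To make the size of the mismatch concrete, here is a rough sketch that only does the arithmetic; it does not touch the repository code. The 352 x 1216 crop size is the one used in this thread, while the 375 x 1242 frame size and the helper names crop_margins and uncovered_pixels are assumptions made for illustration.

```python
# Sketch only: 352 x 1216 is the crop size discussed in this thread; the
# 375 x 1242 frame below is an assumed, typical KITTI resolution.
CROP_H, CROP_W = 352, 1216


def crop_margins(height, width):
    """Top-left corner of the bottom-aligned, horizontally centred crop."""
    return height - CROP_H, (width - CROP_W) // 2


def uncovered_pixels(height, width):
    """Pixels of the full-size gt that the pasted prediction never covers."""
    return height * width - CROP_H * CROP_W


height, width = 375, 1242
top, left = crop_margins(height, width)
print(f"prediction occupies rows {top}:{top + CROP_H}, cols {left}:{left + CROP_W}")
print(f"{uncovered_pixels(height, width)} of {height * width} gt pixels lie outside it")
```

Under these assumptions, the full-size gt keeps a border of pixels where the pasted prediction is only the zero background, which is not the case in the online_eval path where the gt is cropped as well.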

I hope I didn't make any mistakes.

Answer 3 · 2022-01-19T08:23:20.000Z

Hi, I also ran into this problem and found a solution. With "Online eval", the ground truth maps are cropped by kb_crop at line 175 of bts_dataloader.py, but in testing mode they are not cropped. To keep the Online eval and testing results consistent, we can add cropping code to eval_with_pngs.py between lines 121 and 122, like this:

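        # Crop the ground truth to the same 352 x 1216 window as the prediction
        # (bottom-aligned, horizontally centred).
        height, width = depth.shape
        top_margin = int(height - 352)
        left_margin = int((width - 1216) / 2)
        depth = depth[top_margin:top_margin + 352, left_margin:left_margin + 1216]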

Thus, the result of running eval_with_pngs.py will be the same as the Online eval result.
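
If the same crop is needed in more than one place, it could be wrapped in a small helper along these lines. This is only a sketch: kb_crop_slices is a hypothetical name that does not exist in the repository, and the size check is an added assumption so that frames smaller than the crop fail loudly instead of producing a wrongly sized array.

```python
def kb_crop_slices(height, width, crop_h=352, crop_w=1216):
    """Return (rows, cols) slices for the bottom-aligned, centred crop.

    Hypothetical helper for illustration, not part of the repository.
    """
    if height < crop_h or width < crop_w:
        raise ValueError(
            f"{height}x{width} is smaller than the {crop_h}x{crop_w} crop"
        )
    top_margin = height - crop_h
    left_margin = (width - crop_w) // 2
    return (
        slice(top_margin, top_margin + crop_h),
        slice(left_margin, left_margin + crop_w),
    )


# Example with an assumed 375 x 1242 frame.
rows, cols = kb_crop_slices(375, 1242)
# With a NumPy array this would be applied as: depth = depth[rows, cols]
```

Returning slices keeps the helper independent of the array library, so the same call could serve both the ground truth and any other map that has to share the crop window.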