aliyun/NeWCRFs

KITTI Crops

serizba opened this issue · 2 comments

Thanks for releasing the code of your work!

I've realized that when you evaluate on the KITTI dataset you first perform the kb_crop and then the garg_crop. I don't think that's the expected behavior.

During evaluation, the dataloader does:

if self.args.do_kb_crop is True:
    height = image.shape[0]
    width = image.shape[1]
    top_margin = int(height - 352)
    left_margin = int((width - 1216) / 2)
    image = image[top_margin:top_margin + 352, left_margin:left_margin + 1216, :]
    if self.mode == 'online_eval' and has_valid_depth:
        depth_gt = depth_gt[top_margin:top_margin + 352, left_margin:left_margin + 1216, :]

So you are cropping both the image and the ground truth. Then, in eval.py, you do:

if args.do_kb_crop:
    height, width = gt_depth.shape
    top_margin = int(height - 352)
    left_margin = int((width - 1216) / 2)
    pred_depth_uncropped = np.zeros((height, width), dtype=np.float32)
    pred_depth_uncropped[top_margin:top_margin + 352, left_margin:left_margin + 1216] = pred_depth
    pred_depth = pred_depth_uncropped

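But this actually does nothing, as the ground truth depth has already been cropped: gt_depth is already 352 x 1216, so top_margin and left_margin are both 0 and pred_depth_uncropped ends up identical to pred_depth.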
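To make the point concrete, here is a minimal sketch of that "uncrop" step run on two ground-truth shapes. The helper name `uncrop_prediction` is hypothetical (it just wraps the eval.py lines quoted above), and 375 x 1242 is only an assumed example of an uncropped KITTI frame size.

```python
import numpy as np

def uncrop_prediction(pred_depth, gt_depth):
    """Hypothetical wrapper around the eval.py snippet quoted above."""
    height, width = gt_depth.shape
    top_margin = int(height - 352)
    left_margin = int((width - 1216) / 2)
    pred_depth_uncropped = np.zeros((height, width), dtype=np.float32)
    pred_depth_uncropped[top_margin:top_margin + 352,
                         left_margin:left_margin + 1216] = pred_depth
    return pred_depth_uncropped, (top_margin, left_margin)

# Prediction made on a kb-cropped input image.
pred = np.random.rand(352, 1216).astype(np.float32)

# Case 1: ground truth already kb-cropped by the dataloader.
gt_cropped = np.ones((352, 1216), dtype=np.float32)
out_cropped, margins_cropped = uncrop_prediction(pred, gt_cropped)
print(margins_cropped, np.array_equal(out_cropped, pred))

# Case 2: ground truth at an assumed full resolution of 375 x 1242.
gt_full = np.ones((375, 1242), dtype=np.float32)
out_full, margins_full = uncrop_prediction(pred, gt_full)
print(margins_full, out_full.shape)
```

In the first case both margins are zero and the output equals the input, so the block is a no-op. Only in the second case does the prediction get pasted into a larger zero-filled canvas.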

Your evaluation code is based on the work of BTS, but in their code they do not kb_crop the ground truth when evaluating. They crop the input image with both kb_crop and garg_crop, but the ground truth only with garg_crop. That means that, as I understand the code, your evaluation code and the BTS code apply different crops to the ground truth, so the metrics are computed over different pixel regions.
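A rough sketch of why the two orderings cover different pixels. The fractional bounds below are the Garg crop constants as I recall them from BTS-style evaluation scripts, so treat them as an assumption; `garg_crop_bounds` is a hypothetical helper, and 375 x 1242 is again just an assumed full frame size.

```python
def garg_crop_bounds(gt_height, gt_width):
    """Return (top, bottom, left, right) of the Garg crop for a given GT size.

    The fractions are assumed from BTS-style eval code, not taken from this thread.
    """
    top = int(0.40810811 * gt_height)
    bottom = int(0.99189189 * gt_height)
    left = int(0.03594771 * gt_width)
    right = int(0.96405229 * gt_width)
    return top, bottom, left, right

# Garg crop applied directly to an uncropped ground truth (BTS-style, as described above).
full = garg_crop_bounds(375, 1242)

# Garg crop applied to a ground truth that was already kb-cropped to 352 x 1216.
cropped = garg_crop_bounds(352, 1216)

# Express the second region in full-image coordinates by adding the kb_crop margins.
top_margin = 375 - 352
left_margin = (1242 - 1216) // 2
cropped_in_full = (cropped[0] + top_margin, cropped[1] + top_margin,
                   cropped[2] + left_margin, cropped[3] + left_margin)

print("garg on full GT:      ", full)
print("kb_crop then garg_crop:", cropped_in_full)
```

Under these assumptions the two rectangles do not coincide, which is consistent with the metrics changing when the ground truth is only garg-cropped.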

Evaluating on the ground truth only with the garg_crop (and inputs with both crops as in BTS) worsens the results.

Am I missing something?

Thanks a lot!

We use the online evaluation code from BTS without any change.

Hi @weihaosky, thanks for your answer!

If I understood the code correctly, BTS loads the ground truth again when evaluating (instead of using the one provided by the dataloader). Specifically, in this line, they load the GT and do not crop it. But I think that when NeWCRFs, and also AdaBins, obtain the ground truth, they get it from the dataloader, so it has already been kb-cropped.

Thanks for your time!