cuiziteng/ECCV_AERIS

About the details of the experiments


I read your paper and found it very helpful. I will cite it if I manage to complete my own paper.
I have some questions about the experiments and would like to consult you.

  • For the pre-processing methods in Table 1, such as DBPN, I would like to know whether you directly use the pre-trained DBPN and CenterNet models without fine-tuning, train both from scratch on the COCO-d dataset, or use some other implementation.
  • AERIS's result in Table 3 (d) is 13.0 AP. Was this value obtained by testing directly on low-resolution (x4) images with a model trained on COCO-d? In my reproduction experiments with your provided pre-trained models, I could not reach this value, so I would like to know how the AERIS and DBPN models in this table were trained.

Looking forward to your response.
Thank you!

Sorry for the late reply, and many thanks for your interest in our paper~

(1) For the results in Table 1, the CenterNet is the original CenterNet trained on the clean COCO dataset, and the SR methods are used without any fine-tuning to directly pre-process the LR degraded images. In addition, Table B1 in the supplementary material reports another version of the results, where the SR methods are fine-tuned on the COCO-d degradation.
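To make that setup concrete, here is a minimal sketch of the Table 1 evaluation flow, assuming a frozen pre-trained SR network (e.g., DBPN) and a CenterNet trained on clean COCO; the function and argument names are placeholders, not the repo's actual code.

```python
import torch

@torch.no_grad()
def detect_with_sr_preprocess(lr_image, sr_model, detector):
    """lr_image: float tensor in [0, 1] with shape (1, 3, H, W)."""
    sr_model.eval()
    detector.eval()
    # Step 1: the frozen, off-the-shelf SR model restores the LR degraded image
    # (no fine-tuning on COCO-d in the Table 1 setting).
    sr_image = sr_model(lr_image)
    # Step 2: the clean-COCO CenterNet runs directly on the restored image.
    return detector(sr_image)
```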

(2) If you want to reproduce the results in Table 3, you should train the model without the noise and blur effects, as in this config, which means using only low resolution as the degradation factor.
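As a rough illustration of that "low-resolution only" setting, the sketch below applies just a random bicubic down-sampling and leaves out noise and blur; the actual config keys and degradation code in the repo may look different.

```python
import random
import torch.nn.functional as F

def degrade_lr_only(image, max_scale=4.0):
    """image: float tensor (1, 3, H, W). Returns a randomly down-sampled copy."""
    scale = random.uniform(1.0, max_scale)   # COCO-d style: rate drawn from 1~4
    h, w = image.shape[-2:]
    lr_size = (max(1, int(h / scale)), max(1, int(w / scale)))
    # Only bicubic down-sampling is applied; the noise and blur branches are disabled.
    return F.interpolate(image, size=lr_size, mode='bicubic', align_corners=False)
```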

Thanks for your reply, it is really helpful! I trained the model with the config you suggested, and it does reach the value published in your paper. However, I still have some confusion about Table 1 and look forward to your response.

  • Why is bicubic (x2) better than most CNN-based methods (x2)? I ran into the same situation in my reproduction experiments.
  • For the pre-processing part (x2), I want to know whether you first normalize the LR image to the 0-1 range to produce the SR image with the pre-processing method, then normalize the SR image with ‘mean=[123.675, 116.28, 103.53], std=[58.395, 57.12, 57.375]’, and finally feed it into the CenterNet model.
  • Why would bicubic (x2) be better than bicubic (x4)?

Thank you!

Regarding the second point above: normalizing the LR image to the 0-1 range simply means dividing by 255.

Hello, thanks again for your interesting questions:

(1) Bicubic (x2) is better than many CNN methods: this may be because current CNN-based SR methods are restricted to their training domain, and their generalization still needs to improve. As shown in the supplementary material of our paper, if you fine-tune the CNN methods on the same degradation domain, the performance improves somewhat.

(2) Either normalization in the detection model is fine; it depends on how your detector was trained. The default training setting of CenterNet uses ‘mean=[123.675, 116.28, 103.53], std=[58.395, 57.12, 57.375]’; if you train another CenterNet with ‘mean=[0, 0, 0], std=[255, 255, 255]’ (i.e., no normalization), the performance is not much affected.
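A minimal sketch of the first pipeline (divide by 255 for the SR model, then CenterNet's default mean/std before detection), assuming the SR model maps 0-1 inputs to 0-1 outputs; the model object and function name are placeholders, not the repo's actual classes.

```python
import numpy as np

CENTERNET_MEAN = np.array([123.675, 116.28, 103.53], dtype=np.float32)
CENTERNET_STD = np.array([58.395, 57.12, 57.375], dtype=np.float32)

def preprocess_for_detector(lr_image_uint8, sr_model):
    """lr_image_uint8: H x W x 3 uint8 RGB array."""
    lr = lr_image_uint8.astype(np.float32) / 255.0     # step 1: 0-1 range for the SR model
    sr = sr_model(lr)                                   # step 2: restored image, still in 0-1
    sr_255 = np.clip(sr, 0.0, 1.0) * 255.0              # back to the 0-255 pixel range
    return (sr_255 - CENTERNET_MEAN) / CENTERNET_STD    # step 3: CenterNet's default mean/std
```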

(3) Bicubic (x2) is better than bicubic (x4): this is because an excessive up-sampling ratio can hurt object detection. Table 1 reports results on the COCO-d dataset, whose down-sampling rate is drawn uniformly from 1~4. I think over-sampling by x4 up-scales the large and medium objects into super-large objects, which is harmful to detection, while the originally small objects are also enlarged, so small-object detection performance improves.
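A small numeric illustration of this size-shift effect, using the standard COCO area thresholds (32^2 and 96^2 pixels) and treating the SR output as a plain x4 enlargement of each box:

```python
def coco_size_category(side_px):
    """COCO size buckets by box area: small < 32^2, medium < 96^2, large otherwise."""
    area = side_px * side_px
    if area < 32 ** 2:
        return 'small'
    if area < 96 ** 2:
        return 'medium'
    return 'large'

for side in (20, 60, 150):                 # example square-object side lengths in pixels
    up = side * 4                          # x4 up-sampling multiplies box sides by 4
    print(f'{side}px {coco_size_category(side)} -> {up}px {coco_size_category(up)}')
# 20px small -> 80px medium     (small objects benefit from enlargement)
# 60px medium -> 240px large    (medium objects become very large)
# 150px large -> 600px large    (large objects become even larger)
```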