CVI-SZU/CLIMS

About The quality of initial CAMs

Big-Brother-Pikachu opened this issue · 5 comments

Hi, thanks for sharing this great work. I have some detailed questions about the results in https://github.com/CVI-SZU/CLIMS#the-quality-of-initial-cams-and-pseudo-masks-on-pascal-voc2012. First, I think these results (56.6 / 58.6) are evaluated on the train set. But which one: the original set with 1464 images or the augmented set with 10582 images? Second, are these results (56.6 / 58.6) obtained after dCRF or not? If not, is dCRF used anywhere in your pipeline? As far as I understand, the following code:

pred = imutils.crf_inference_label(img, fg_conf_cam, n_labels=keys.shape[0])

means that dCRF does not contribute to the (56.6 / 58.6) results, but does contribute to the (70.5 / 73) results. Am I right?
Looking forward to your reply. Thanks!

Hi, dCRF is not used in the latest code to refine the generated activation maps. Note that cam_to_ir_label.py is the code that generates labels for training IRNet. We then use IRNet to refine the generated activation maps into the pseudo semantic masks. The 58.6 and 73 mIoU results are for train_set.

BTW, welcome to star our project!
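For readers comparing against the numbers in this thread, here is a minimal sketch of how an mIoU figure such as 56.6 or 58.6 is typically computed from predicted label maps and ground-truth masks. It is not the CLIMS evaluation script: the function names `confusion_matrix` and `mean_iou`, the 21-class default, and the 255 ignore label are assumptions based on the usual PASCAL VOC convention.

```python
import numpy as np

def confusion_matrix(pred, gt, num_classes=21, ignore_index=255):
    """Accumulate a (num_classes x num_classes) matrix; rows = ground truth, cols = prediction.

    Assumes every value in `pred` is in [0, num_classes) and that `gt` uses
    `ignore_index` for pixels that should not be scored (hypothetical defaults).
    """
    pred = np.asarray(pred)
    gt = np.asarray(gt)
    mask = gt != ignore_index  # drop ignored pixels
    idx = num_classes * gt[mask].astype(np.int64) + pred[mask].astype(np.int64)
    counts = np.bincount(idx, minlength=num_classes ** 2)
    return counts.reshape(num_classes, num_classes)

def mean_iou(conf):
    """Mean of per-class IoU = TP / (TP + FP + FN), skipping classes that never occur."""
    tp = np.diag(conf)
    denom = conf.sum(axis=0) + conf.sum(axis=1) - tp
    valid = denom > 0
    return float((tp[valid] / denom[valid]).mean())

# Toy example with 2 classes: three scored pixels, one ignored pixel.
pred = np.array([[0, 1], [1, 1]])
gt = np.array([[0, 1], [0, 255]])
conf = confusion_matrix(pred, gt, num_classes=2)
miou = mean_iou(conf)
```

In practice the confusion matrix is summed over every image in the split (here, the 1464-image train set) before `mean_iou` is called once, so the choice of split and whether dCRF was applied to `pred` both change the reported number.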

Thank you for your quick reply! Just to confirm: the train_set is the original one with 1464 images, and the 58.6 mIoU result is obtained without dCRF post-processing. Am I correct? Thanks!
(Since I need to compare your results in my recent work, I need to carefully confirm the details so as to make a fair comparison.)

Exactly. That said, you should probably compare against the result in our CVPR version, i.e., 56.6.

Got it! Thanks for your patience and clear explanation. I will close this issue now.