The result may not be correct.
Ha0Tang opened this issue · 3 comments
Thanks for sharing this amazing paper. However, I think the accu of Cityscapes may not be correct (94.4 in Table 2 of your paper), according to xh-liu/CC-FPSE#4. Can you double-check it? Thanks.
@Ha0Tang
Thank you so much for pointing this out! I have carefully checked the evaluation code and the related issues again. The official DRN code only provides an implementation of the "mIoU" metric, not "accu", which causes the confusion.
However, I don't think "not correct" is quite so absolute. Why does the gap exist? It comes from the different handling of 255-labeled pixels (xh-liu/CC-FPSE#4 (comment)) when calculating accu.
- Previous works, SPADE and CC-FPSE, obtain accu values around 80% by including 255-labeled pixels. For a consistent comparison, I tested the model again under this setting, and our method achieves 82.7% (still the best among the compared methods).
- It seems the official mIoU implementation does not use the 255-labeled pixels, i.e., it is computed only from the 19-class histogram (https://github.com/fyu/drn/blob/master/segment.py#L459), and the model ignores 255 labels during training (https://github.com/fyu/drn/blob/master/segment.py#L351). The reported accu of 94.4% also omits 255-labeled pixels, which may be more reasonable because it is consistent with the training of the segmentation model and with the calculation of mIoU (see the sketch below).
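For reference, here is a minimal sketch (not the official DRN code; the function names are hypothetical) of how the two accu conventions can differ, assuming the 19 Cityscapes training classes and 255 as the ignore label:

```python
import numpy as np

NUM_CLASSES = 19   # Cityscapes training classes (assumption for this sketch)
IGNORE_LABEL = 255  # void/ignore label

def fast_hist(label, pred, num_classes=NUM_CLASSES):
    """Confusion matrix over valid labels (0..num_classes-1) only."""
    mask = (label >= 0) & (label < num_classes)
    return np.bincount(
        num_classes * label[mask].astype(int) + pred[mask],
        minlength=num_classes ** 2,
    ).reshape(num_classes, num_classes)

def accu_ignore_255(label, pred):
    # Accuracy from the 19-class histogram only: 255-labeled pixels
    # are excluded, matching the mIoU computation (the ~94.4% setting).
    hist = fast_hist(label, pred)
    return np.diag(hist).sum() / hist.sum()

def accu_include_255(label, pred):
    # Accuracy over all pixels: 255-labeled pixels can never match a
    # 0..18 prediction, so they count as errors (one way the ~80%
    # numbers could arise; the exact SPADE/CC-FPSE scripts may differ).
    return (label == pred).sum() / label.size
```

The only difference between the two functions is whether 255-labeled pixels enter the denominator, which is enough to explain a gap of the size discussed here.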
All in all, although there are several reasons why our implementation might be more reasonable, a consistent comparison is really a good idea! We will update the arXiv version very soon (reporting the 82.7% accu our model achieves and providing more explanation).
Thanks for your detailed explanations, which make sense to me. I will cite your paper in my work EdgeGAN, since 82.7 is more comparable than 94.4; that's also why I pointed this out.
@EndlessSora So how do I include 255-labeled pixels when testing with the DRN code? Thanks in advance!