dvlab-research/PFENet

Performance inconsistency between paper and reproduced results.

zhijiew opened this issue · 2 comments

Thank you for your great work. I learned a lot from your paper.

I tested the pre-trained models you provided (ResNet50-based, Pascal VOC, 1-shot), and I actually get slightly better performance than reported in your paper.

| Method | Split0 | Split1 | Split2 | Split3 | Mean |
|---|---|---|---|---|---|
| Reproduced | 61.8 | 69.9 | 56.3 | 56.6 | 61.2 |
| Paper | 61.7 | 69.5 | 55.4 | 56.3 | 60.8 |

Is this performance fluctuation within the normal range? I used the same code and settings from your GitHub repo.

I also trained another baseline experiment (ResNet50-based, Pascal VOC, 5-shot) myself using your configs.

| Method | Split0 | Split1 | Split2 | Split3 | Mean |
|---|---|---|---|---|---|
| Reproduced | 64.7 | 71.5 | 55.5 | 60.6 | 63.1 |
| Paper | 63.1 | 70.7 | 55.8 | 57.9 | 61.9 |

Here the fluctuation seems larger.

@zhijiew
Hi, thanks for your attention.

This repo is a cleaned-up reproduction that removes some redundant parameter definitions/functions from the original code we used to obtain the paper's results. I am not sure why this repo sometimes achieves better performance than the reported numbers; differences in the running environment might cause the fluctuation.

You can find more details in this issue: #6.

The 1-shot performance variance of your reproduction is acceptable, according to the issue above.

As for the 5-shot results: in the paper, we directly evaluated the model trained in the 1-shot setting using 5 support samples, without retraining for 5-shot. That is likely not the optimal setup, which could explain why your 5-shot result is noticeably better than the one we reported.
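To make the 1-shot-vs-5-shot evaluation difference concrete, here is a minimal sketch of the common way a model trained with one support sample can be reused with k supports: compute a masked-average-pooled prototype per support image and average them. This is a hypothetical illustration (function names and the NumPy setup are mine, not from the PFENet codebase, whose actual pipeline also involves prior masks and feature enrichment):

```python
import numpy as np

def masked_average_pool(feat, mask):
    """Average a (C, H, W) feature map over the foreground pixels of a (H, W) binary mask."""
    fg = mask.astype(feat.dtype)
    denom = fg.sum() + 1e-8  # avoid division by zero for empty masks
    return (feat * fg[None]).sum(axis=(1, 2)) / denom

def kshot_prototype(support_feats, support_masks):
    """k-shot class prototype: mean of the per-shot prototypes.

    This mirrors reusing a 1-shot model with k support samples at test time,
    without any k-shot retraining (hypothetical helper for illustration).
    """
    protos = [masked_average_pool(f, m) for f, m in zip(support_feats, support_masks)]
    return np.mean(protos, axis=0)

# Toy check: 5 support feature maps, each constant-valued, with full foreground masks.
feats = [np.full((3, 4, 4), float(i)) for i in range(5)]
masks = [np.ones((4, 4)) for _ in range(5)]
proto = kshot_prototype(feats, masks)
# each per-shot prototype is [i, i, i], so the 5-shot prototype is [2, 2, 2]
```

Because the averaging happens only at test time, a model tuned end-to-end for the 5-shot setting (as in the reproduction above) can plausibly outperform this simple reuse of the 1-shot model.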

Thank you.

Got it! Thank you very much for your reply!