dvlab-research/PFENet

question about resnet

Saralyliu opened this issue · 3 comments

Good work on FSS.

  1. I have a question about resnet-v2: the code loads it from your own path. Where did you download it, or did you pre-train it yourselves? Since resnet-v2 is different from the original resnet, have you noticed any effect of using it? Previous work such as CANet uses the original resnet.
  2. The support feature from layer3 is multiplied with the mask:
    supp_feat_4 = self.layer4(supp_feat_3 * mask)
    final_supp_list.append(supp_feat_4)
    for i, tmp_supp_feat in enumerate(final_supp_list):
        tmp_supp_feat_4 = tmp_supp_feat * tmp_mask
    I notice the mask operation is applied twice here, but it appears only once in your paper. Maybe I missed some detail. Thanks!

@Saralyliu
Hi,

Thanks for your attention.

The pre-trained weights of resnet-v2 are obtained from the official repo of PSPNet (https://github.com/hszhao/semseg). The difference from the original resnet lies only in layer0, where the v2 version applies the deep-stem strategy. We used resnet-v2 to reproduce CANet and got results comparable to those reported in the CANet paper.
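
For reference, a minimal sketch of what a deep-stem layer0 looks like, assuming PyTorch; the 64 -> 64 -> 128 channel widths follow the usual deep-stem convention and are an assumption here, not copied from the PFENet code:

    import torch.nn as nn

    # Deep-stem layer0: the single 7x7 conv of the original resnet is replaced
    # by three 3x3 convs before the max-pooling. Channel widths are assumed.
    def make_deep_stem():
        return nn.Sequential(
            nn.Conv2d(3, 64, kernel_size=3, stride=2, padding=1, bias=False),
            nn.BatchNorm2d(64),
            nn.ReLU(inplace=True),
            nn.Conv2d(64, 64, kernel_size=3, stride=1, padding=1, bias=False),
            nn.BatchNorm2d(64),
            nn.ReLU(inplace=True),
            nn.Conv2d(64, 128, kernel_size=3, stride=1, padding=1, bias=False),
            nn.BatchNorm2d(128),
            nn.ReLU(inplace=True),
            nn.MaxPool2d(kernel_size=3, stride=2, padding=1),
        )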

The mask in "supp_feat_4 = self.layer4(supp_feat_3*mask)" screens out the redundant background region. As I recall, it does not affect the performance much; you can verify this by sending supp_feat_3 to layer4 without the masking operation.
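
If you want to try that ablation, a minimal sketch, assuming the tensors and modules from the PFENet forward pass (with the mask already interpolated to the spatial size of supp_feat_3):

    # Original: zero out the background before layer4.
    supp_feat_4 = self.layer4(supp_feat_3 * mask)
    # Ablation: send the unmasked layer-3 feature to layer4 instead.
    supp_feat_4_nomask = self.layer4(supp_feat_3)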

The other mask, used in "tmp_supp_feat_4 = tmp_supp_feat * tmp_mask", is more important, since it is used for the prior calculation.
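
For clarity, a hedged sketch of how such a prior can be computed from the masked support feature, following the description in the paper (pixel-wise cosine similarity between query and support features, max over support locations, then min-max normalization); the tensor names and exact normalization below are assumptions, not copied from the repo:

    import torch

    def prior_mask(query_feat_4, supp_feat_4, mask, eps=1e-7):
        # query_feat_4, supp_feat_4: (b, c, h, w) high-level features from layer4.
        # mask: (b, 1, h, w) support mask resized to the feature resolution.
        b, c, h, w = query_feat_4.shape
        supp_feat_4 = supp_feat_4 * mask                 # keep foreground support features only
        q = query_feat_4.view(b, c, -1)                  # (b, c, hw_q)
        s = supp_feat_4.view(b, c, -1)                   # (b, c, hw_s)
        q = q / (q.norm(dim=1, keepdim=True) + eps)
        s = s / (s.norm(dim=1, keepdim=True) + eps)
        sim = torch.bmm(q.transpose(1, 2), s)            # (b, hw_q, hw_s) cosine similarities
        corr = sim.max(dim=2)[0]                         # best-matching support pixel per query pixel
        corr_min = corr.min(dim=1, keepdim=True)[0]
        corr_max = corr.max(dim=1, keepdim=True)[0]
        corr = (corr - corr_min) / (corr_max - corr_min + eps)
        return corr.view(b, 1, h, w)                     # prior in [0, 1]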

Thank you for your reply. If I understand correctly, resnet-v2 and the original resnet-50 are interchangeable as the feature extractor? Recently we ran VOC group 0 with your code (5955 training images, 1449 val images), and the best mIoU we got is 58.57 at epoch 124 without any modifications. We can't reach your 61.7 mIoU in the 1-shot case. Waiting for your suggestions, thank you!
