Lower results using pretrained model from torchvision

Question

gasvn opened this issue 6 years ago · 1 comments

❓ Questions and Help

I used the pretrained ResNet50 model from torchvision to initialize the mask_rcnn_R_50_FPN model.
But the performance is about 1% lower than the results obtained with the Caffe2 pretrained weights.
Is this normal?
Or is there a way to fix it?
PyTorch pretrained:
[('bbox', OrderedDict([('AP', 0.36520976202958577), ('AP50', 0.5761235657158805), ('AP75', 0.3946176735076137), ('APs', 0.2018171617661631), ('APm', 0.3904857258613571), ('APl', 0.4803073309949115)])), ('segm', OrderedDict([('AP', 0.33394594208086187), ('AP50', 0.5461901740014163), ('AP75', 0.3530760040683633), ('APs', 0.14512204511383409), ('APm', 0.35617566558529673), ('APl', 0.4932441063072069)]))])

Caffe2 pretrained, from the Model Zoo:
R-50-FPN | Mask | 1x | 2 | 5.2 | 0.4536 | 11.3 | 0.12966 + 0.034 | 37.8 | 34.2 | 6358792

https://github.com/facebookresearch/maskrcnn-benchmark/blob/master/MODEL_ZOO.md#end-to-end-faster-and-mask-r-cnn-baselines
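To put a number on the "about 1%" gap, here is a small sketch that compares the two sets of results. It assumes that 37.8 and 34.2 in the Model Zoo row are the box AP and mask AP in percent, while the evaluation output above reports AP as a fraction; the function name `ap_gap` is only illustrative.

```python
# AP reported by the run initialized from torchvision weights (fractions).
pytorch_ap = {"bbox": 0.36520976202958577, "segm": 0.33394594208086187}

# AP taken from the Model Zoo row (assumed to be box AP and mask AP, in percent).
caffe2_ap = {"bbox": 37.8, "segm": 34.2}


def ap_gap(pytorch, caffe2):
    """Return the Caffe2-minus-PyTorch AP difference per task, in AP points."""
    return {task: round(caffe2[task] - 100.0 * pytorch[task], 2) for task in caffe2}


gaps = ap_gap(pytorch_ap, caffe2_ap)
for task, gap in gaps.items():
    print(f"{task}: {gap} AP points lower with torchvision weights")
```

Under that reading, the gap is a bit over one point for boxes and a bit under one point for masks.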

My configuration follows 'e2e_mask_rcnn_R_50_FPN_1x.yaml' except for these changes:

INPUT:
  PIXEL_MEAN: [0.485, 0.456, 0.406]
  PIXEL_STD: [0.229, 0.224, 0.225]
  TO_BGR255: False
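For context, these three settings change how each image is preprocessed before it reaches the backbone. The sketch below illustrates the difference using NumPy only. It is not the maskrcnn-benchmark code: the `normalize` helper is hypothetical, and the Caffe2-style values (BGR order, 0-255 range, mean `[102.9801, 115.9465, 122.7717]`, std of 1) are my recollection of the repository defaults, so please verify them against your config.

```python
import numpy as np


def normalize(image_rgb01, mean, std, to_bgr255):
    """Normalize a (3, H, W) float image given in RGB order with values in [0, 1].

    Hypothetical helper: optionally convert to BGR in the 0-255 range,
    then subtract the per-channel mean and divide by the per-channel std.
    """
    image = np.asarray(image_rgb01, dtype=np.float64)
    if to_bgr255:
        # Reverse the channel order (RGB -> BGR) and rescale to 0-255.
        image = image[[2, 1, 0]] * 255.0
    mean = np.asarray(mean, dtype=np.float64).reshape(3, 1, 1)
    std = np.asarray(std, dtype=np.float64).reshape(3, 1, 1)
    return (image - mean) / std


# Settings from the modified config (torchvision-style preprocessing).
TORCHVISION = dict(mean=[0.485, 0.456, 0.406],
                   std=[0.229, 0.224, 0.225],
                   to_bgr255=False)

# Assumed Caffe2-style defaults (not taken from this thread).
CAFFE2 = dict(mean=[102.9801, 115.9465, 122.7717],
              std=[1.0, 1.0, 1.0],
              to_bgr255=True)

# A single mid-gray pixel, shape (3, 1, 1).
pixel = np.full((3, 1, 1), 0.5)
tv = normalize(pixel, **TORCHVISION)
c2 = normalize(pixel, **CAFFE2)
print("torchvision-style:", tv.ravel())
print("Caffe2-style:     ", c2.ravel())
```

The two conventions produce inputs on very different scales, so the weights and the preprocessing have to match: torchvision weights expect the first form, Caffe2 weights the second.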

Training was done on 8 2080 Ti GPUs.

Answer 1 · 2019-02-28T09:42:38.000Z

@gasvn I believe this is expected; similar behavior was observed by mmdetection.
They observed that with the 1x schedule, using the model from torchvision gives slightly worse results, but with the 2x schedule it works better.