Lower results using pretrained model from torchvision
gasvn opened this issue · 1 comments
❓ Questions and Help
I used the pretrained ResNet50 model from torchvision to initialize the mask_rcnn_R_50_FPN model.
But the performance is about 1% lower than the results with the Caffe2 pretrained weights.
Is this normal?
Or is there a way to fix it?
PyTorch pretrained:
[('bbox', OrderedDict([('AP', 0.36520976202958577), ('AP50', 0.5761235657158805), ('AP75', 0.3946176735076137), ('APs', 0.2018171617661631), ('APm', 0.3904857258613571), ('APl', 0.4803073309949115)])), ('segm', OrderedDict([('AP', 0.33394594208086187), ('AP50', 0.5461901740014163), ('AP75', 0.3530760040683633), ('APs', 0.14512204511383409), ('APm', 0.35617566558529673), ('APl', 0.4932441063072069)]))])
Caffe2 pretrained (from Model_zoo):
R-50-FPN | Mask | 1x | 2 | 5.2 | 0.4536 | 11.3 | 0.12966 + 0.034 | 37.8 | 34.2 | 6358792
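To put the gap in numbers, here is a small sketch comparing the two results. It assumes that 37.8 and 34.2 in the Model_zoo row are box AP and mask AP in percent; the post does not state what the columns mean, so treat that reading as an assumption.

```python
# AP values from the torchvision-initialized run reported above (fractions).
torchvision_ap = {"bbox": 0.36520976202958577, "segm": 0.33394594208086187}

# Assumed to be box AP and mask AP (in percent) from the Model_zoo row.
caffe2_ap_percent = {"bbox": 37.8, "segm": 34.2}


def ap_gap(task):
    """Return the Caffe2 minus torchvision AP difference, in percentage points."""
    return round(caffe2_ap_percent[task] - 100.0 * torchvision_ap[task], 2)


for task in ("bbox", "segm"):
    print(task, ap_gap(task))
```

Under that assumption, both gaps are in the range the question describes as "about 1%".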
All my configurations follow 'e2e_mask_rcnn_R_50_FPN_1x.yaml' except for these changes:
INPUT:
  PIXEL_MEAN: [0.485, 0.456, 0.406]
  PIXEL_STD: [0.229, 0.224, 0.225]
  TO_BGR255: False
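For readers unfamiliar with why these three settings change together, here is a minimal sketch of the two input conventions on a single pixel. The torchvision mean and std are the values from the config above. The Caffe2-style mean (`CAFFE2_MEAN_BGR`) and the function names are illustrative assumptions that do not appear in this issue. The sketch also assumes that `TO_BGR255: False` means the input stays in RGB order, scaled to [0, 1].

```python
# Values from the config changes in this issue (torchvision convention).
TORCHVISION_MEAN = (0.485, 0.456, 0.406)
TORCHVISION_STD = (0.229, 0.224, 0.225)

# Assumed Caffe2-style mean (BGR order, 0-255 range); not stated in this issue.
CAFFE2_MEAN_BGR = (102.9801, 115.9465, 122.7717)


def normalize_torchvision(rgb):
    """Scale an 8-bit RGB pixel to [0, 1], then subtract the mean and divide by the std."""
    return tuple(
        (c / 255.0 - m) / s
        for c, m, s in zip(rgb, TORCHVISION_MEAN, TORCHVISION_STD)
    )


def normalize_caffe2(rgb):
    """Reorder an 8-bit RGB pixel to BGR, keep the 0-255 range, subtract the mean."""
    bgr = rgb[::-1]
    return tuple(c - m for c, m in zip(bgr, CAFFE2_MEAN_BGR))


pixel = (124, 116, 104)  # an arbitrary RGB pixel
print(normalize_torchvision(pixel))
print(normalize_caffe2(pixel))
```

The same pixel ends up in a different channel order and on a very different scale under each convention, so a backbone should be fed inputs in the convention it was pretrained with.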
Training was run on 8 2080 Ti GPUs.
@gasvn I believe this is expected; mmdetection observed similar behavior.
They found that with the 1x schedule, using the torchvision model gives slightly worse results, but with the 2x schedule it works better.