Pretrained weights transferred to d2: evaluation results are not the same
luohao123 opened this issue · 3 comments
Hi, I found that the pretrained weights can be loaded into the d2 (detectron2) version, but the evaluation results are not the same:
Average Precision (AP) @[ IoU=0.50:0.95 | area= all | maxDets=100 ] = 0.391
Average Precision (AP) @[ IoU=0.50 | area= all | maxDets=100 ] = 0.579
Average Precision (AP) @[ IoU=0.75 | area= all | maxDets=100 ] = 0.419
Average Precision (AP) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.204
Average Precision (AP) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.438
Average Precision (AP) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.555
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets= 1 ] = 0.326
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets= 10 ] = 0.534
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets=100 ] = 0.601
Average Recall (AR) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.343
Average Recall (AR) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.665
Average Recall (AR) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.833
[11/22 15:46:16 d2.evaluation.coco_evaluation]: Evaluation results for bbox:
| AP | AP50 | AP75 | APs | APm | APl |
|:------:|:------:|:------:|:------:|:------:|:------:|
| 39.102 | 57.949 | 41.949 | 20.385 | 43.796 | 55.472 |
[11/22 15:46:16 d2.evaluation.coco_evaluation]: Per-category bbox AP:
| category | AP | category | AP | category | AP |
|:--------------|:-------|:-------------|:-------|:---------------|:-------|
| person | 51.545 | bicycle | 30.120 | car | 37.343 |
| motorcycle | 41.354 | airplane | 64.191 | bus | 64.393 |
| train | 66.267 | truck | 30.056 | boat | 24.152 |
| traffic light | 21.103 | fire hydrant | 64.068 | stop sign | 59.821 |
| parking meter | 44.773 | bench | 23.255 | bird | 30.513 |
| cat | 72.177 | dog | 66.586 | horse | 57.632 |
| sheep | 51.321 | cow | 55.309 | elephant | 63.624 |
| bear | 72.234 | zebra | 66.314 | giraffe | 68.215 |
| backpack | 9.502 | umbrella | 38.094 | handbag | 10.717 |
| tie | 28.595 | suitcase | 37.492 | frisbee | 62.511 |
| skis | 22.858 | snowboard | 32.119 | sports ball | 35.770 |
| kite | 36.131 | baseball bat | 30.962 | baseball glove | 32.334 |
| skateboard | 50.383 | surfboard | 34.870 | tennis racket | 46.793 |
| bottle | 29.972 | wine glass | 31.864 | cup | 38.972 |
| fork | 35.053 | knife | 14.528 | spoon | 13.116 |
| bowl | 36.399 | banana | 19.322 | apple | 19.359 |
| sandwich | 34.859 | orange | 27.864 | broccoli | 19.962 |
| carrot | 16.043 | hot dog | 37.709 | pizza | 49.721 |
| donut | 45.727 | cake | 37.040 | chair | 25.170 |
| couch | 42.012 | potted plant | 24.735 | bed | 44.549 |
| dining table | 28.283 | toilet | 59.902 | tv | 55.940 |
| laptop | 59.224 | mouse | 53.997 | remote | 22.803 |
| keyboard | 47.817 | cell phone | 30.583 | microwave | 54.825 |
| oven | 33.776 | toaster | 27.688 | sink | 31.062 |
| refrigerator | 56.144 | book | 7.453 | clock | 45.249 |
| vase | 30.723 | scissors | 26.979 | teddy bear | 46.190 |
| hair drier | 11.617 | toothbrush | 18.464 | | |
[11/22 15:46:17 d2.engine.defaults]: Evaluation results for coco_2017_val in csv format:
[11/22 15:46:17 d2.evaluation.testing]: copypaste: Task: bbox
[11/22 15:46:17 d2.evaluation.testing]: copypaste: AP,AP50,AP75,APs,APm,APl
[11/22 15:46:17 d2.evaluation.testing]: copypaste: 39.1024,57.9486,41.9489,20.3847,43.7956,55.4719
I wonder why there is a gap between them?
Also, the single-scale model should use the res5 output as the transformer input, but when I forced res2 to be the transformer input, I got almost the same AP...
Do you know why? This is very weird.
@luohao123 Hi, there are three things you should check.
- Does the weight match the model? R-50 or R-101? C5 or DC5?
- The PostProcess follows Deformable DETR, not DETR.
- You should evaluate with 1 image per card, as the models are trained without image padding. If you want to evaluate with multiple images per card, you can randomly pad the images by a few pixels during training.
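For the first point, one way to check whether a checkpoint matches the model is to compare parameter names and shapes on both sides. The following is only a minimal sketch: `compare_state_shapes` and the toy parameter names are hypothetical and are not part of this repository or of detectron2. With PyTorch, the two dictionaries could be built with `{k: tuple(v.shape) for k, v in state_dict.items()}`.

```python
from typing import Dict, List, Tuple

Shape = Tuple[int, ...]


def compare_state_shapes(
    model_shapes: Dict[str, Shape], ckpt_shapes: Dict[str, Shape]
) -> Tuple[List[str], List[str], List[Tuple[str, Shape, Shape]]]:
    """Compare parameter names and shapes of a model and a checkpoint.

    Returns (missing, unexpected, mismatched):
      missing    - names the model expects but the checkpoint lacks
      unexpected - names in the checkpoint that the model does not have
      mismatched - (name, checkpoint_shape, model_shape) for shared names
                   whose shapes differ
    """
    missing = sorted(k for k in model_shapes if k not in ckpt_shapes)
    unexpected = sorted(k for k in ckpt_shapes if k not in model_shapes)
    mismatched = sorted(
        (k, ckpt_shapes[k], model_shapes[k])
        for k in model_shapes
        if k in ckpt_shapes and ckpt_shapes[k] != model_shapes[k]
    )
    return missing, unexpected, mismatched


# Toy shapes with hypothetical parameter names, for illustration only.
model_shapes = {
    "backbone.res5.conv.weight": (2048, 1024, 1, 1),
    "input_proj.weight": (256, 2048, 1, 1),
}
ckpt_shapes = {
    "backbone.res5.conv.weight": (2048, 1024, 1, 1),
    "input_proj.weight": (256, 256, 1, 1),
    "extra.bias": (10,),
}

missing, unexpected, mismatched = compare_state_shapes(model_shapes, ckpt_shapes)
print("missing:", missing)
print("unexpected:", unexpected)
print("mismatched:", mismatched)
```

If any of the three lists is non-empty, the checkpoint and the model definition probably do not correspond (for example R-50 weights loaded into an R-101 model, or a C5 checkpoint loaded into a DC5 config).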
As for the res2 result, I do not know what happened in your code. You can print the shape of the feature in the transformer; I guess it may still be the previous (res5) feature.
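To interpret the printed shape, it helps to know what to expect for each backbone stage. The sketch below assumes a standard, non-dilated ResNet-50 (C5, not DC5), where res2 to res5 have strides 4, 8, 16, 32 and 256, 512, 1024, 2048 channels. The 800x1216 input size and the helper names are chosen only for illustration.

```python
import math

# (stride, channels) per stage of a standard, non-dilated ResNet-50.
RESNET50_STAGES = {
    "res2": (4, 256),
    "res3": (8, 512),
    "res4": (16, 1024),
    "res5": (32, 2048),
}


def expected_feature_shape(stage: str, height: int, width: int):
    """Return (channels, H, W) of the feature map for an input of height x width."""
    stride, channels = RESNET50_STAGES[stage]
    return channels, math.ceil(height / stride), math.ceil(width / stride)


def num_tokens(stage: str, height: int, width: int) -> int:
    """Number of spatial positions the transformer would see for this stage."""
    _, h, w = expected_feature_shape(stage, height, width)
    return h * w


# Illustrative input size (an assumption, not taken from the issue).
for stage in ("res2", "res5"):
    shape = expected_feature_shape(stage, 800, 1216)
    print(stage, shape, num_tokens(stage, 800, 1216))
```

If the shape printed inside the transformer still looks like the res5 row (2048 channels before the input projection, or about 25x38 positions for this input size), the config change to res2 did not take effect.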
@tangjiuqi097 Now I get 41.7 mAP, almost the same as C5.
This issue has not been active for a long time and will be closed in 5 days. Feel free to re-open it if you have further concerns.