hustvl/MIMDet

does it work without finetuning directly using pretrained model for detection?

lucasjinreal opened this issue · 4 comments

The task layer for object detection (i.e., Mask R-CNN & FPN) is not pre-trained and is therefore randomly initialized, so I don't think it will work if you don't update the parameters of the task layer.

Similarly, if you don't allow the 1000-class classifier to be trained, I believe the MAE representation cannot achieve a reasonable top-1 accuracy on the ImageNet-1k val set.
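
The linear-probing setup mentioned above could be sketched like this in PyTorch. The modules here are tiny hypothetical stand-ins (a `nn.Linear` in place of the actual MAE encoder); only the new, randomly initialized 1000-way classifier receives gradient updates:

```python
import torch
import torch.nn as nn

# Hypothetical stand-ins for illustration only: the real setup would use the
# MAE-pretrained ViT encoder as the frozen feature extractor.
feature_extractor = nn.Linear(16, 32)  # stand-in for the pretrained encoder
classifier = nn.Linear(32, 1000)       # new, randomly initialized 1000-cls head

# Freeze the pretrained representation.
for p in feature_extractor.parameters():
    p.requires_grad = False

optimizer = torch.optim.SGD(classifier.parameters(), lr=0.1)

x = torch.randn(4, 16)
with torch.no_grad():
    feats = feature_extractor(x)       # features computed without gradients
logits = classifier(feats)
loss = nn.functional.cross_entropy(logits, torch.randint(0, 1000, (4,)))
loss.backward()
optimizer.step()
```

Whether the frozen MAE features give a competitive top-1 under this protocol is exactly the question being discussed.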

@Yuxin-CV thanks for your reply. What if we freeze the MIM backbone and just finetune the task layer and the new classifier? Just wondering how good it can be directly using the pretrained weights without any modification.

We didn't try that, but to my knowledge, the MIM representation isn't very good at linear probing on ImageNet-1k, so I guess the same thing will happen on COCO if you freeze the pretrained representation.

We will try this in the future.
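
For reference, freezing the pretrained backbone while training only the detection task layer could be set up as follows in PyTorch. The `backbone` and `head` here are tiny hypothetical placeholders, not MIMDet's actual classes; in the real setting they would be the MAE-pretrained ViT and the Mask R-CNN / FPN head:

```python
import torch
import torch.nn as nn

# Tiny stand-in modules for illustration only.
backbone = nn.Sequential(nn.Conv2d(3, 8, 3, padding=1), nn.ReLU())
head = nn.Conv2d(8, 4, 1)  # randomly initialized task layer

# Freeze the pretrained backbone so only the head is updated.
for p in backbone.parameters():
    p.requires_grad = False
backbone.eval()  # keep any normalization layers in inference mode

# The optimizer only receives the trainable head parameters.
optimizer = torch.optim.AdamW(head.parameters(), lr=1e-4)

x = torch.randn(2, 3, 16, 16)
with torch.no_grad():
    feats = backbone(x)     # frozen features, no gradients stored
out = head(feats)
out.mean().backward()       # gradients flow into the head only
optimizer.step()
```

This matches the "frozen backbone + trainable task layer" experiment discussed above; how well it would actually perform on COCO is the open question.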

Thank you. I'm just asking out of curiosity, since MIM is good at learning general representations, from classification to object detection.