Is this idea also effective for a ResNet backbone?

Question

Closed this issue a year ago · 5 comments

Is this idea also effective for a ResNet backbone? If so, it would greatly improve the latency of real-time detection.

Answer 1 · 2023-10-28T03:31:31.000Z

@sdreamforchen I enjoy this impressive work, but I think performance would decline sharply with a ResNet backbone, because the ResNet series does not support Masked Image Modeling (MIM) pretraining, which is one of the key factors behind PlainDETR's significant performance improvement. From the data in the paper, we can infer that PlainDETR-R50 would perform significantly worse than DINO-R50 under the 1x training configuration. Perhaps we should first work out how to implement MIM on CNN architectures to revitalize ResNet. Once that is solved, your concern may be well addressed.
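
To make the MIM point concrete, here is a minimal sketch of the patch-wise random masking step that MIM pretraining typically starts from. It is only an illustration, not code from Plain-DETR or any specific MIM method; the function name `random_patch_mask`, the 32-pixel patch size, and the 0.6 mask ratio are all assumptions chosen for the example.

```python
import random


def random_patch_mask(height, width, patch=32, mask_ratio=0.6, seed=0):
    """Return a height x width grid of booleans; True marks a masked pixel.

    Masking is decided per patch, not per pixel. All names and default
    values here are illustrative assumptions, not taken from any paper.
    """
    if height % patch or width % patch:
        raise ValueError("height and width must be multiples of patch")

    rows, cols = height // patch, width // patch
    num_patches = rows * cols
    num_masked = int(round(mask_ratio * num_patches))

    # Pick which patches to hide, reproducibly.
    rng = random.Random(seed)
    masked_ids = set(rng.sample(range(num_patches), num_masked))

    # Expand the patch-level decision to pixel resolution.
    return [
        [((y // patch) * cols + (x // patch)) in masked_ids for x in range(width)]
        for y in range(height)
    ]


mask = random_patch_mask(224, 224, patch=32, mask_ratio=0.6)
masked_fraction = sum(map(sum, mask)) / (224 * 224)
print(f"masked fraction: {masked_fraction:.3f}")
```

A ViT can simply drop the masked patches from its token sequence, whereas a dense convolution still slides over the masked regions, which is roughly why MIM does not carry over to ResNet out of the box.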

Answer 2 · 2023-11-16T04:54:49.000Z

Hi, @sdreamforchen

I agree with @yjh0410's opinion. The key here would be a good MIM pretraining for CNNs.

Answer 3 · 2023-11-16T05:06:14.000Z

I don't know, but I think it may be valuable to test this paper's method on other CNN-style backbones, for example by using a depth CNN or a transformer in the last stage of the backbone, to get different feature maps along the channel dimension.

Answer 4 · 2023-11-16T07:37:16.000Z

> Hi, @sdreamforchen
>
> I agree with @yjh0410's opinion. The key here would be a good MIM pretraining for CNNs.

How about SparK, a MIM method designed for CNN-style backbones?

Answer 5 · 2023-11-16T07:39:55.000Z

Here is the URL of the SparK paper:
https://arxiv.org/abs/2301.03580