Megvii-BaseDetection/DisAlign

Can someone kindly share their code for the classification task on ImageNet-LT?

smallcube opened this issue · 4 comments

I tried to train the proposed method on ImageNet-LT, but I can only reach an average test accuracy of about 49%, which is far from the 52.9% reported in the paper. Some details of my implementation are given below:
(1) The feature extractor is ResNeXt-50 and the head is a linear classifier. The test accuracy after Stage 1 is 43.9%, which seems reasonable.

(2) The test accuracy when adopting the cRT method in Stage 2 is 49.6%, which matches the numbers reported in other papers.
(3) When fine-tuning the model in Stage 2, both the feature extractor and the head classifier are frozen, and a DisAlignLinear layer (as implemented in cvpods) is trained on top. The test accuracy only reaches 48.8%, which is well below the number reported in your paper.
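
For concreteness, here is a rough sketch of how I understand the calibration layer from the paper's adaptive calibration equation. This is my own reconstruction for discussion, not the cvpods code, and the class name is made up:

```python
import torch
import torch.nn as nn


class DisAlignCalibration(nn.Module):
    """Sketch of DisAlign-style adaptive logit calibration (my reading of
    the paper, not the cvpods DisAlignLinear implementation).

    Given frozen logits z and the frozen feature x, it learns a per-class
    scale alpha and bias beta plus an instance-level confidence sigma(x),
    and outputs:
        z_cal = sigma(x) * (alpha * z + beta) + (1 - sigma(x)) * z
    """

    def __init__(self, feat_dim: int, num_classes: int):
        super().__init__()
        self.scale = nn.Parameter(torch.ones(num_classes))   # alpha, per class
        self.bias = nn.Parameter(torch.zeros(num_classes))   # beta, per class
        self.confidence = nn.Linear(feat_dim, 1)             # produces sigma(x)

    def forward(self, feat: torch.Tensor, logits: torch.Tensor) -> torch.Tensor:
        sigma = torch.sigmoid(self.confidence(feat))         # (B, 1)
        calibrated = self.scale * logits + self.bias         # (B, C)
        return sigma * calibrated + (1.0 - sigma) * logits
```

In Stage 2 only these calibration parameters receive gradients; the backbone and the original classifier stay frozen, as described above.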

If you get a bad stage-1 model, you cannot get a good stage-2 model.

In fact, the stage-1 model DisAlign uses is much stronger than the ones in other repos (e.g., Decoupling Representation and Classifier for Long-Tailed Recognition).

So the comparison is not entirely fair.

Thanks for the feedback. For now, I think the performance of my Stage-1 model is OK. I was wondering if you could kindly share more details about the implementation of DisAlign, especially the part that computes the KL divergence. I would appreciate it.

@smallcube Hi, the implementation of the KL divergence is here:
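
For readers without access to the linked code: as described in the paper, the Stage-2 objective is the KL divergence between the calibrated prediction and a balanced reference distribution built from the class frequencies, which reduces to a class-re-weighted cross-entropy. Below is a minimal sketch of that reading; the function name and the `rho` exponent label are mine, not the repo's:

```python
import torch
import torch.nn.functional as F


def generalized_reweight_loss(logits: torch.Tensor,
                              targets: torch.Tensor,
                              class_counts: torch.Tensor,
                              rho: float = 1.0) -> torch.Tensor:
    """Sketch of the KL objective as a re-weighted cross-entropy.

    The reference distribution puts weight proportional to (1 / n_c)**rho
    on class c, where n_c is the training count of class c. Minimizing
    KL(reference || model) per sample is then equivalent to cross-entropy
    scaled by the weight of the ground-truth class.
    """
    counts = class_counts.float()
    weights = (1.0 / counts) ** rho
    weights = weights / weights.sum() * len(counts)  # normalize to mean 1
    per_sample_ce = F.cross_entropy(logits, targets, reduction="none")
    return (weights[targets] * per_sample_ce).mean()
```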

Thanks for the feedback.