google-research/l2p

Question about the paper's comparison.

wzlk655 opened this issue · 2 comments

I've read the paper and have a few small questions about its comparison to other methods like EWC, ER, and DER++, because I couldn't find (or may have missed) the details of how these methods were run with the pretrained ViT model.
Which parts of the ViT are trained or fine-tuned for the Upper-bound and those methods, and which parts use pretrained weights? My guess is that only the classifier is trained, but I'd like to confirm.

Same here. I'm wondering which backbone you used for the comparison methods: the original ones from their papers, or ViT-B/16 as in L2P?

Great questions! We implemented these baselines using the same ViT-B/16 backbone. For these methods, we also tried freezing the backbone and training only the classifier; however, the results were not as good as training the whole model. So for the baselines we report results from fine-tuning the full model starting from the pre-trained checkpoint.
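To make the two setups concrete, here is a minimal PyTorch/timm sketch of the difference (illustrative only, not the repo's actual JAX implementation; the class count and learning rate are placeholders):

```python
import timm
import torch

# Pretrained ViT-B/16, the same backbone used for L2P and the baselines.
# num_classes=100 is a placeholder for the benchmark's class count.
model = timm.create_model('vit_base_patch16_224', pretrained=True, num_classes=100)

# Setup (a): frozen backbone -- train only the classifier head.
# This is the variant that was tried but performed worse for the baselines.
for param in model.parameters():
    param.requires_grad = False
for param in model.head.parameters():
    param.requires_grad = True

# Setup (b): full fine-tuning from the pre-trained checkpoint --
# all parameters trainable (what the reported baseline numbers use).
for param in model.parameters():
    param.requires_grad = True

# Either way, only trainable parameters go to the optimizer.
optimizer = torch.optim.SGD(
    (p for p in model.parameters() if p.requires_grad), lr=1e-3)
```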