Question about the choice of base-learner
Sebastian-X opened this issue · 4 comments
You used ResNet-12 as the base-learner, and it's also a common choice in recent works. Does that mean ResNet-12 is an especially efficient model for few-shot learning? Is there any paper that discusses this? I went through your paper's citations but couldn't really find any information about it.
Also, I see you deployed a ResNet version of MAML in your experiments whose performance overtook the original's. Did you just change the base-learner of MAML and keep the other parts the same?
Thanks for your interest in our work.
Answer to Q1: ResNet-12 is an example of a deeper network compared to the common 4CONV backbone; it is not an especially efficient architecture in itself. I use ResNet-12 in my paper for fair comparison with related works. If you'd like to read more about network architectures for few-shot learning, I suggest this paper: A Closer Look at Few-shot Classification.
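For reference, here is a minimal PyTorch sketch of the kind of residual block that "ResNet-12" usually denotes. The exact channel widths, activations, and pooling details vary across papers, so treat this as an illustration rather than the exact backbone from our code:

```python
import torch
import torch.nn as nn

class ResBlock(nn.Module):
    """One residual block of ResNet-12: three 3x3 conv layers plus a
    1x1 shortcut, followed by 2x2 max pooling (details vary by paper)."""
    def __init__(self, in_ch, out_ch):
        super().__init__()
        self.convs = nn.Sequential(
            nn.Conv2d(in_ch, out_ch, 3, padding=1, bias=False),
            nn.BatchNorm2d(out_ch), nn.LeakyReLU(0.1),
            nn.Conv2d(out_ch, out_ch, 3, padding=1, bias=False),
            nn.BatchNorm2d(out_ch), nn.LeakyReLU(0.1),
            nn.Conv2d(out_ch, out_ch, 3, padding=1, bias=False),
            nn.BatchNorm2d(out_ch),
        )
        self.shortcut = nn.Sequential(
            nn.Conv2d(in_ch, out_ch, 1, bias=False), nn.BatchNorm2d(out_ch))
        self.pool = nn.MaxPool2d(2)

    def forward(self, x):
        return self.pool(torch.relu(self.convs(x) + self.shortcut(x)))

# 4 blocks x 3 conv layers each = 12 conv layers, hence "ResNet-12".
# Channel widths differ across papers (e.g. 64-128-256-512 or 64-160-320-640).
encoder = nn.Sequential(ResBlock(3, 64), ResBlock(64, 128),
                        ResBlock(128, 256), ResBlock(256, 512))
```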
Answer to Q2: We have provided ablative results for MAML on ResNet-12 in the paper. However, it is not the result in the image you attached. The result in the image is for the "MAML+HT" setting, where HT meta-batch is applied. You may find the details in the paper.
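Roughly, the HT meta-batch re-samples extra tasks from the classes the learner performed worst on. Here is a schematic sketch of that loop (not the actual code in this repository; `sample_task` and `meta_step` are placeholder helpers, and `meta_step` is assumed to return per-class accuracies):

```python
import random

def ht_meta_batch(sample_task, meta_step, n_tasks=10, n_hard=4):
    """Schematic HT (hard task) meta-batch loop.  `sample_task` and
    `meta_step` are hypothetical stand-ins for the real episode sampler
    and meta-update; `meta_step` returns a {class: accuracy} dict."""
    failure_classes = []
    # Phase 1: ordinary tasks; remember each task's worst class.
    for _ in range(n_tasks):
        task = sample_task()                    # N-way K-shot episode
        per_class_acc = meta_step(task)
        failure_classes.append(min(per_class_acc, key=per_class_acc.get))
    # Phase 2: extra "hard" tasks re-sampled from the failure classes.
    for _ in range(n_hard):
        hard = random.sample(failure_classes, k=min(5, len(failure_classes)))
        meta_step(sample_task(classes=hard))
```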
If you have any further questions, feel free to add comments.
Thanks for your response!
I saw the ablation experiments in your paper, but there's still a point I don't understand. If I didn't get it wrong, during the meta-transfer learning phase the parameters of the feature extractor are fixed while the parameters of the FC and SS layers are updated. However, the last 2 rows of the table show results for SS[Θ4;θ] and SS[Θ;θ], whose notation seems to indicate that the feature extractor parameters Θ4/Θ are also fine-tuned. I'm a little confused about this; if I misunderstood the table, could you please tell me the difference between SS[Θ4;θ] and SS[Θ;θ]?
SS[Θ;θ] means that we update SS weights for all convolutional layers Θ and the last fully-connected layer θ;
SS[Θ4;θ] means that we update SS weights for the 4th residual block Θ4 of ResNet-12 and the last fully-connected layer θ.
The details are available in the extended version: https://arxiv.org/pdf/1910.03648.pdf
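To make the distinction concrete, here is a minimal sketch of how an SS-wrapped conv layer could look (my own simplified illustration, not code from this repository; see the paper for the exact formulation). The pre-trained weights stay frozen, and only the per-filter scale and shift are meta-learned:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SSConv2d(nn.Module):
    """Simplified Scaling-and-Shifting (SS) conv layer: the pre-trained
    weight W and bias b are frozen; only a per-filter scale (phi1) and
    shift (phi2) are learned, i.e. y = conv(x, W * phi1) + (b + phi2)."""
    def __init__(self, conv: nn.Conv2d):
        super().__init__()
        out_ch = conv.out_channels
        # Frozen pre-trained parameters (the Θ part stays fixed).
        self.weight = nn.Parameter(conv.weight.data.clone(), requires_grad=False)
        bias = conv.bias.data.clone() if conv.bias is not None else torch.zeros(out_ch)
        self.bias = nn.Parameter(bias, requires_grad=False)
        # Learnable SS parameters, initialized to identity (scale 1, shift 0).
        self.phi1 = nn.Parameter(torch.ones(out_ch, 1, 1, 1))
        self.phi2 = nn.Parameter(torch.zeros(out_ch))
        self.stride, self.padding = conv.stride, conv.padding

    def forward(self, x):
        return F.conv2d(x, self.weight * self.phi1, self.bias + self.phi2,
                        stride=self.stride, padding=self.padding)
```

In these terms, SS[Θ;θ] corresponds to wrapping every conv layer of ResNet-12 this way, while SS[Θ4;θ] wraps only the conv layers of the 4th residual block, leaving blocks 1-3 fully frozen; the FC layer θ is updated directly in both cases.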
I see. Thank you very much!