wyharveychen/CloserLookFewShot

Some question about resnet

shihaobai opened this issue · 6 comments

Very nice job!!! I tried to use ResNet (10, 12, 18) as the backbone of my model, but it didn't improve performance as we expected. On the contrary, in our experiments the deeper ResNets were much more prone to overfitting than the 4-conv backbone, which confused us. I wonder if you ever encountered a similar problem during your experiments. Thank you very much.

I've encountered the same problem.
I was training RelationNet + ResNet18 today and found the model overfitting at around epoch 130 (200 episodes per epoch), i.e., the validation accuracy started to drop from ~64% to ~60%.
I wonder if you solved your problem later? Thank you very much.
@shihaobai
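The symptom described above (validation accuracy peaking around epoch 130 and then degrading) is the classic trigger for early stopping on validation accuracy. A minimal, self-contained sketch of that check, with a hypothetical `should_stop` helper and `patience` parameter (neither is from this repo):

```python
def should_stop(val_acc_history, patience=10):
    """Stop training once the best validation accuracy is more than
    `patience` epochs in the past (i.e. val accuracy has stopped improving)."""
    if len(val_acc_history) <= patience:
        return False  # not enough history to judge yet
    # index of the epoch with the best validation accuracy so far
    best_epoch = max(range(len(val_acc_history)),
                     key=val_acc_history.__getitem__)
    return len(val_acc_history) - 1 - best_epoch >= patience
```

With accuracies that climb to ~64% and then sit around ~60%, `should_stop` fires `patience` epochs after the peak, so the checkpoint saved at the peak epoch is the one to keep.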

Yes, the overfitting happened even with ResNet10 and ProtoNet! Have you solved it, shihaobai?

Pretraining the backbone on the training set can effectively prevent overfitting.
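For context, "pretraining" here usually means training the backbone as an ordinary classifier with cross-entropy on all base classes, then reusing (and optionally freezing) the backbone for episodic few-shot training. A minimal PyTorch sketch, assuming this standard recipe; the tiny stand-in backbone, `n_base_classes`, and the dummy batch are all illustrative, not the repo's actual code:

```python
import torch
import torch.nn as nn

n_base_classes = 64               # e.g. the miniImageNet base split (assumption)
backbone = nn.Sequential(         # tiny stand-in for ResNet10/18
    nn.Conv2d(3, 8, 3, padding=1), nn.ReLU(),
    nn.AdaptiveAvgPool2d(1), nn.Flatten())
classifier = nn.Linear(8, n_base_classes)  # discarded after pretraining
opt = torch.optim.SGD(
    list(backbone.parameters()) + list(classifier.parameters()), lr=0.1)
loss_fn = nn.CrossEntropyLoss()

# one pretraining step on a dummy batch of base-class images
x = torch.randn(4, 3, 32, 32)
y = torch.randint(0, n_base_classes, (4,))
loss = loss_fn(classifier(backbone(x)), y)
opt.zero_grad()
loss.backward()
opt.step()
# afterwards, keep `backbone` and train the few-shot head
# (e.g. ProtoNet / RelationNet) episodically on top of it
```

The point of the extra classification head is that the backbone sees many more gradient updates per image than episodic training provides, which is why it helps against overfitting.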

pretraining is everything!

@shihaobai Is pretraining the backbone mentioned in the paper? I only see that Baseline (and Baseline++) has the keyword "pretraining".

Can you tell me how many epochs you pretrained for? And did you use cross-entropy loss to pretrain it?