wyharveychen/CloserLookFewShot

Some question about resnet

shihaobai opened this issue · 6 comments

Very nice job!!! I tried to use ResNet (10, 12, 18) as the backbone of my model, but it didn't improve performance as we expected. On the contrary, in our experiments the deeper ResNets were much more prone to overfitting than the 4-conv backbone, which confused us. I wonder if you ever encountered a similar problem during your experiments. Thank you very much.

I've encountered the same problem.
I was training RelationNet + ResNet18 today and found the model overfitting at around epoch 130 (200 episodes per epoch), i.e., the validation accuracy started to drop from ~64% to ~60%.
I wonder if you solved your problem later? Thank you very much.
@shihaobai
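The symptom described above (validation accuracy peaking around epoch 130 and then degrading) is the classic trigger for early stopping on validation accuracy. A minimal, self-contained sketch of that check, with a hypothetical `should_stop` helper and `patience` parameter (neither is from this repo):

```python
def should_stop(val_acc_history, patience=10):
    """Stop training once the best validation accuracy is more than
    `patience` epochs in the past (i.e. val accuracy has stopped improving)."""
    if len(val_acc_history) <= patience:
        return False  # not enough history to judge yet
    # index of the epoch with the best validation accuracy so far
    best_epoch = max(range(len(val_acc_history)),
                     key=val_acc_history.__getitem__)
    return len(val_acc_history) - 1 - best_epoch >= patience
```

With accuracies that climb to ~64% and then sit around ~60%, `should_stop` fires `patience` epochs after the peak, so the checkpoint saved at the peak epoch is the one to keep.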

Yes, the overfitting happened even with ResNet10 and ProtoNet! Have you solved it, shihaobai?

Pretraining the backbone on the training set can effectively prevent overfitting.
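For context, "pretraining" here usually means training the backbone as an ordinary classifier with cross-entropy on all base classes, then reusing (and optionally freezing) the backbone for episodic few-shot training. A minimal PyTorch sketch, assuming this standard recipe; the tiny stand-in backbone, `n_base_classes`, and the dummy batch are all illustrative, not the repo's actual code:

```python
import torch
import torch.nn as nn

n_base_classes = 64               # e.g. the miniImageNet base split (assumption)
backbone = nn.Sequential(         # tiny stand-in for ResNet10/18
    nn.Conv2d(3, 8, 3, padding=1), nn.ReLU(),
    nn.AdaptiveAvgPool2d(1), nn.Flatten())
classifier = nn.Linear(8, n_base_classes)  # discarded after pretraining
opt = torch.optim.SGD(
    list(backbone.parameters()) + list(classifier.parameters()), lr=0.1)
loss_fn = nn.CrossEntropyLoss()

# one pretraining step on a dummy batch of base-class images
x = torch.randn(4, 3, 32, 32)
y = torch.randint(0, n_base_classes, (4,))
loss = loss_fn(classifier(backbone(x)), y)
opt.zero_grad()
loss.backward()
opt.step()
# afterwards, keep `backbone` and train the few-shot head
# (e.g. ProtoNet / RelationNet) episodically on top of it
```

The point of the extra classification head is that the backbone sees many more gradient updates per image than episodic training provides, which is why it helps against overfitting.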

pretraining is everything!

@shihaobai Is pretraining the backbone mentioned in the paper? I only see that Baseline (and Baseline++) has the keyword "pretraining".

Can you tell me how many epochs you pretrained for? And did you use cross-entropy loss to pretrain it?