About the checkpoint
InugYoon opened this issue · 2 comments
Hello, first of all, thank you for sharing this great project.
I am wondering if you could share the checkpoints for the ResNet-50 pre-trained on ImageNet, especially the one for CIFAR-100?
There are lots of ResNet-50 checkpoints pre-trained on ImageNet, but as far as I can find, none of them has kernel size 3 for conv1.
I would be really grateful if you could share the weights, since my local machine cannot train on ImageNet due to insufficient GPUs.
Thank you for reading.
Hi @InugYoon, thanks for your interest in the project!
In terms of providing checkpoints, it's something I'd like to do, but I'm afraid I won't be able to do it for at least a month. I'll keep the issue open for now as a reminder.
For CIFAR-100, I actually pre-trained on CIFAR-100 rather than ImageNet, trained the downstream classifier on CIFAR-100 too, and tested on the CIFAR-100 test set. Accuracy is decent (>70%), though I can't remember the exact figure. Pretraining on CIFAR-100 is much easier than pretraining on ImageNet: it only needs 1 GPU and the dataset fits in memory. If this is an option, you could create your own checkpoint. If you really want to pretrain the smaller encoder on 32x32 crops from ImageNet and then test the downstream classifier on CIFAR-100 (i.e. transfer), that will take more resources.
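As a rough back-of-the-envelope check of the "fits in memory" point, here is a small sketch. It assumes images are held as uncompressed uint8 RGB arrays; the 224x224 ImageNet size is an assumed typical crop size rather than anything fixed by this project, and the helper name `raw_size_bytes` is hypothetical.

```python
def raw_size_bytes(num_images: int, height: int, width: int, channels: int = 3) -> int:
    """Bytes needed to hold the images as uncompressed uint8 arrays."""
    return num_images * height * width * channels


# CIFAR-100 training split: 50,000 RGB images of 32x32 pixels.
cifar100_train = raw_size_bytes(50_000, 32, 32)

# ImageNet-1k training split: 1,281,167 images; 224x224 is an assumed crop size.
imagenet_train = raw_size_bytes(1_281_167, 224, 224)

print(f"CIFAR-100 train: {cifar100_train / 2**20:.1f} MiB")
print(f"ImageNet train at 224x224: {imagenet_train / 2**30:.1f} GiB")
```

Under these assumptions the CIFAR-100 training set is on the order of 150 MB, while ImageNet at the assumed crop size is more than a thousand times larger.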
Finally, if you have a pretrained architecture from ImageNet from another project (there are quite a few SimCLR projects out there and some share checkpoints), then you may be able to retrain only the "stem" of the network so that it takes 32x32 inputs and keep the rest the same, thereby accelerating training significantly. I cannot guarantee how this method would perform, though.
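To illustrate why the stem matters for 32x32 inputs, here is a minimal sketch that only tracks spatial resolution through a ResNet-50. It assumes the usual ImageNet stem (7x7 conv with stride 2 followed by a 3x3 max-pool with stride 2) and a CIFAR-style stem consisting of a 3x3 conv with stride 1 and no max-pool; dropping the max-pool is an assumption here, since the thread only mentions kernel size 3 for conv1. The function names are hypothetical.

```python
def conv_out(size: int, kernel: int, stride: int, padding: int) -> int:
    """Output length of a convolution or pooling layer along one spatial axis."""
    return (size + 2 * padding - kernel) // stride + 1


def final_feature_size(input_size: int, cifar_stem: bool) -> int:
    """Spatial size of the last feature map, before global average pooling."""
    if cifar_stem:
        # Assumed CIFAR-style stem: 3x3 conv, stride 1, padding 1, no max-pool.
        size = conv_out(input_size, kernel=3, stride=1, padding=1)
    else:
        # ImageNet stem: 7x7 conv (stride 2, padding 3), then 3x3 max-pool (stride 2, padding 1).
        size = conv_out(input_size, kernel=7, stride=2, padding=3)
        size = conv_out(size, kernel=3, stride=2, padding=1)
    # The first residual stage keeps the resolution; each of the
    # remaining three stages halves it with a stride-2 3x3 conv (padding 1).
    for _ in range(3):
        size = conv_out(size, kernel=3, stride=2, padding=1)
    return size


print(final_feature_size(224, cifar_stem=False))  # 7
print(final_feature_size(32, cifar_stem=False))   # 1
print(final_feature_size(32, cifar_stem=True))    # 4
```

With the ImageNet stem, a 32x32 image is already reduced to a single spatial position by the last stage, whereas the assumed CIFAR-style stem leaves a 4x4 feature map. Only the stem differs between the two cases, which is why the later stages could in principle be reused.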
Hello, thank you for your kind reply :).
I considered your recommendations.
- For the first point, unfortunately the purpose of my project is to test the transfer ability of a network pre-trained on ImageNet, so I need to pre-train on ImageNet and test on CIFAR-100.
- For the second point, I got shared ImageNet-pretrained ResNet-50 weights from another project, but I am afraid that re-training the stem of that backbone will result in different characteristics from those of a 'self-supervised' learned backbone.
So I am planning to keep searching for projects that share weights with a kernel-size-3 stem.
Anyway thank you!