The loss of the classification network doesn't decrease
kevinlee9 opened this issue · 9 comments
When I trained the classification network (using both the pretrained VGG and ResNet weights), the loss did not decrease with the given hyperparameters. For example, the loss of the VGG network oscillated around 0.24 after 1k iterations. I also tried a learning rate of 0.01, and that failed too. Could you give me some suggestions? Thanks~
Hi kevenlee,
I'm not sure what causes the problem, but I recommend double-checking that you have loaded the pretrained weights properly. Instead of using vgg16_20M.caffemodel, you may try this file to initialize the VGG network.
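One way to double-check the loading step is to compare the parameter names and shapes in the checkpoint against those the model expects. The following is only a minimal sketch: `compare_shapes` is a hypothetical helper, and the layer names and shapes are made up for illustration, not taken from this repository.

```python
def compare_shapes(model_shapes, ckpt_shapes):
    """Report entries that would not load cleanly from a checkpoint into a model."""
    model_keys, ckpt_keys = set(model_shapes), set(ckpt_shapes)
    return {
        # Parameters the model has but the checkpoint does not provide.
        "missing_in_checkpoint": sorted(model_keys - ckpt_keys),
        # Parameters in the checkpoint that the model does not know about.
        "unexpected_in_checkpoint": sorted(ckpt_keys - model_keys),
        # Parameters present in both, but with different shapes.
        "shape_mismatch": sorted(
            k for k in model_keys & ckpt_keys
            if tuple(model_shapes[k]) != tuple(ckpt_shapes[k])
        ),
    }

# With PyTorch, and assuming the checkpoint file is a plain state dict,
# the two shape maps could be built like this:
#   model_shapes = {k: tuple(v.shape) for k, v in model.state_dict().items()}
#   ckpt_shapes = {k: tuple(v.shape) for k, v in torch.load(path).items()}

# Hypothetical names and shapes, for illustration only.
model_shapes = {
    "conv1_1.weight": (64, 3, 3, 3),
    "conv1_1.bias": (64,),
    "fc8.weight": (20, 1024, 1, 1),
}
ckpt_shapes = {
    "conv1_1.weight": (64, 3, 3, 3),
    "conv1_1.bias": (1, 1, 1, 64),
    "fc8_voc.weight": (20, 1024, 1, 1),
}

report = compare_shapes(model_shapes, ckpt_shapes)
for problem, keys in report.items():
    print(problem, keys)
```

If all three lists come back empty, the names and shapes line up. Any non-empty list points at layers that may have been left at their random initialization.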
Hi @jiwoon-ahn
I'm just wondering what the difference is between vgg16_20M.caffemodel and the model you provided in this issue. And where is this model originally from?
I just converted vgg16_20M.caffemodel to PyTorch format. The weights are exactly the same.
Thanks for the reply. I've tried both pretrained models, and the loss decreases normally in both cases. This code really works well.
I tried with two GPUs, a learning rate of 0.01, and a batch size of 6, and the code works fine. May I ask how many GPUs you are using? Also, could you provide the parameters for ResNet?
Hi @jiwoon-ahn
I have run into the same problem: the loss of the VGG network oscillates around 0.24 after 1k iterations. I then tried to download the model you provided in this issue, but I don't have permission to download it.
Could you give me some suggestions? Thanks!
Hi @kevinlee9
How did you finally solve the problem?
I tried the weights file provided by ahn, and then the loss decreased.
To save the next guy some time:
If you don't want to set up Caffe, you can do this:
- Get the Caffe weights from https://www.cs.jhu.edu/~alanlab/ccvl/init_models/
- Convert the weights to PyTorch format with https://github.com/vadimkantorov/caffemodel2pytorch
- Change the loading script to load the PyTorch weights:
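import torch

# Load the weights converted from vgg16_20M.caffemodel.
weights = torch.load('vgg16_20M.caffemodel.pt')
weights_dict = {}
for key in weights.keys():
    if '.bias' in key:
        # Drop size-1 dimensions from the bias tensors.
        weights_dict[key] = torch.squeeze(weights[key])
    else:
        weights_dict[key] = weights[key]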
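The snippet above stops once `weights_dict` is built; the remaining step is to hand it to the model. Here is a rough, self-contained sketch of that step. It uses a tiny stand-in model and fake tensors rather than the real VGG network, and it assumes the converted biases carry extra singleton dimensions (the `(1, 1, 1, N)` layout here is an assumption for illustration). `squeeze_biases` is a hypothetical helper name.

```python
import torch
import torch.nn as nn

def squeeze_biases(weights):
    """Return a copy of the dict with size-1 dimensions removed from bias tensors."""
    return {
        key: torch.squeeze(value) if ".bias" in key else value
        for key, value in weights.items()
    }

# Stand-in for the real network: a single conv layer (illustration only).
model = nn.Sequential(nn.Conv2d(3, 4, kernel_size=3, padding=1))

# Fake "converted" weights; the bias shape (1, 1, 1, 4) is an assumed layout.
converted = {
    "0.weight": torch.randn(4, 3, 3, 3),
    "0.bias": torch.randn(1, 1, 1, 4),
}

weights_dict = squeeze_biases(converted)

# strict=True raises an error if any key is missing, unexpected, or mis-shaped,
# so a silent partial load cannot go unnoticed.
result = model.load_state_dict(weights_dict, strict=True)
print(result)
```

With the real model, layers that are intentionally absent from the checkpoint would need `strict=False`. In that case it is worth printing `result.missing_keys` and `result.unexpected_keys` to confirm that only the expected layers were skipped.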