Initialization for premodel of different nets
Opened this issue · 9 comments
Traceback (most recent call last):
File "./tools/trainval_net.py", line 150, in <module>
max_iters=args.max_iters)
File "/home/nieqinqin/liuxiaoyu/CLWS/tools/../lib/model/train_val.py", line 365, in train_net
sw.train_model(max_iters)
File "/home/nieqinqin/liuxiaoyu/CLWS/tools/../lib/model/train_val.py", line 239, in train_model
lr, last_snapshot_iter, stepsizes, np_paths, ss_paths = self.initialize()
File "/home/nieqinqin/liuxiaoyu/CLWS/tools/../lib/model/train_val.py", line 179, in initialize
self.net.load_state_dict(model_dict)
File "/home/nieqinqin/liuxiaoyu/CLWS/tools/../lib/nets/network.py", line 609, in load_state_dict
nn.Module.load_state_dict(self, {k: state_dict[k] for k in list(self.state_dict())})
File "/home/nieqinqin/.local/lib/python3.6/site-packages/torch/nn/modules/module.py", line 487, in load_state_dict
.format(name, own_state[name].size(), param.size()))
RuntimeError: While copying the parameter named cls_score_net.weight, whose dimensions in the model are torch.Size([20, 2048]) and whose dimensions in the checkpoint are torch.Size([20, 4096]).
Command exited with non-zero status 1
7.33user 5.80system 0:13.57elapsed 96%CPU (0avgtext+0avgdata 2171288maxresident)k
0inputs+8outputs (0major+709527minor)pagefaults 0swaps
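For anyone debugging the same traceback: the error means the model and the checkpoint disagree on the shape of one parameter. Below is a minimal sketch of how one might list every such disagreement before calling load_state_dict. The helper name find_shape_mismatches and the FakeTensor stand-in are hypothetical and not part of the repo. The stand-in exists only so the snippet runs without PyTorch, and the sketch assumes real tensors expose .shape, as they do in current PyTorch.

```python
from collections import namedtuple


def find_shape_mismatches(model_state, checkpoint_state):
    """Return {name: (model_shape, checkpoint_shape)} for every parameter
    that exists in both dicts but has a different shape."""
    mismatches = {}
    for name, param in model_state.items():
        if name not in checkpoint_state:
            continue  # missing keys are a separate problem; only shapes are compared here
        model_shape = tuple(param.shape)
        ckpt_shape = tuple(checkpoint_state[name].shape)
        if model_shape != ckpt_shape:
            mismatches[name] = (model_shape, ckpt_shape)
    return mismatches


# Hypothetical stand-in for tensors (assumption: real tensors also have .shape).
FakeTensor = namedtuple("FakeTensor", ["shape"])

# Shapes taken from the error message above.
model_state = {
    "cls_score_net.weight": FakeTensor((20, 2048)),
    "cls_score_net.bias": FakeTensor((20,)),
}
checkpoint_state = {
    "cls_score_net.weight": FakeTensor((20, 4096)),
    "cls_score_net.bias": FakeTensor((20,)),
}

print(find_shape_mismatches(model_state, checkpoint_state))
# {'cls_score_net.weight': ((20, 2048), (20, 4096))}
```

With a real network, the two arguments would presumably be self.net.state_dict() and the dict loaded from the checkpoint file.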
We checked the WSDDN pretrained model again and confirmed that nothing is wrong with it: cls_score_net.weight really is a [20, 4096] tensor. You should check
Collaborative-Learning-for-Weakly-Supervised-Object-Detection/lib/nets/vgg16.py
Line 26 in 4bd0df7
self._fc7_channels = 4096
and
Collaborative-Learning-for-Weakly-Supervised-Object-Detection/lib/nets/network.py
Line 398 in 4bd0df7
self.cls_score_net = nn.Linear(self._fc7_channels, self._num_classes) # between class
in the original code.
Thanks for your help, and sorry for the mistaken issue. The actual problem is that I used ResNet, whose cls_score_net.weight is a [20, 2048] tensor, as set here:
Collaborative-Learning-for-Weakly-Supervised-Object-Detection/lib/nets/resnet_v1.py
Line 215 in 4bd0df7
self._fc7_channels = 2048
I'm trying to download the VGG pretrained model, but is there any way to use ResNet?
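One possible workaround, offered only as an untested sketch: copy from the checkpoint only those parameters whose shapes match the ResNet model, and keep the model's own initial values for everything else. The helper name merge_compatible, the FakeTensor stand-in, and the key "extra_layer.weight" are hypothetical and not part of the repo. Layers that are skipped (here cls_score_net.weight) stay at their initial values, so this is not equivalent to having a WSDDN model pretrained on ResNet.

```python
from collections import namedtuple


def merge_compatible(model_state, checkpoint_state):
    """Start from the model's own state and overwrite only the entries whose
    name and shape both match the checkpoint. Returns (merged, skipped)."""
    merged = dict(model_state)  # every model key stays present
    skipped = []
    for name, value in checkpoint_state.items():
        if name in model_state and tuple(value.shape) == tuple(model_state[name].shape):
            merged[name] = value
        else:
            skipped.append(name)  # unknown key or incompatible shape
    return merged, skipped


# Hypothetical stand-in for tensors; "source" only marks where a value came from.
FakeTensor = namedtuple("FakeTensor", ["shape", "source"])

model_state = {
    "cls_score_net.weight": FakeTensor((20, 2048), "init"),
    "cls_score_net.bias": FakeTensor((20,), "init"),
}
checkpoint_state = {
    "cls_score_net.weight": FakeTensor((20, 4096), "ckpt"),
    "cls_score_net.bias": FakeTensor((20,), "ckpt"),
    "extra_layer.weight": FakeTensor((4096, 4096), "ckpt"),  # hypothetical key absent from the model
}

merged, skipped = merge_compatible(model_state, checkpoint_state)
print(sorted(skipped))
# ['cls_score_net.weight', 'extra_layer.weight']
```

Because the merged dict still contains every key of the model, it should also pass through the load_state_dict override shown in the traceback, which looks up each model key in the dict it is given. Whether training then converges with a randomly initialized cls_score_net is a separate question.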
Hello, shawnLiu.
I encountered the same error as you when I ran ./experiments/scripts/train.sh 0 pascal_voc res101 voc07_wsddn_pre:
While copying the parameter named cls_score_net.weight, whose dimensions in the model are torch.Size([20, 2048]) and whose dimensions in the checkpoint are torch.Size([20, 4096]), ...
After that, I set
Collaborative-Learning-for-Weakly-Supervised-Object-Detection/lib/nets/resnet_v1.py
self._fc7_channels = 4096
but then I encountered the RuntimeError "cuda runtime error (2) : out of memory".
Did you find any solution for using ResNet? Thank you.
@Jngwl I guess it's because of your GPU memory. ResNet may need more than 16 GB. Sorry, I didn't try ResNet after that.
Hello! I ran into this as well. Which file do I edit to adjust the batch size? I can't find it. Thank you!
You can adjust the batch size by modifying the file ./experiments/cfgs/res101.yml
Hello! Thank you for your answer! My GPU has 12 GB of memory. I changed batch_size in ./experiments/cfgs/res101.yml to 2, but I still run out of memory. Which values did you change to get it running?
@Jngwl So how did you solve this problem? I ran into it as well, but in my case it may not be caused by GPU memory. I can't solve it on my own.