Sunarker/Collaborative-Learning-for-Weakly-Supervised-Object-Detection

Initialization for premodel of different nets

Opened this issue · 9 comments

Traceback (most recent call last):
File "./tools/trainval_net.py", line 150, in <module>
max_iters=args.max_iters)
File "/home/nieqinqin/liuxiaoyu/CLWS/tools/../lib/model/train_val.py", line 365, in train_net
sw.train_model(max_iters)
File "/home/nieqinqin/liuxiaoyu/CLWS/tools/../lib/model/train_val.py", line 239, in train_model
lr, last_snapshot_iter, stepsizes, np_paths, ss_paths = self.initialize()
File "/home/nieqinqin/liuxiaoyu/CLWS/tools/../lib/model/train_val.py", line 179, in initialize
self.net.load_state_dict(model_dict)
File "/home/nieqinqin/liuxiaoyu/CLWS/tools/../lib/nets/network.py", line 609, in load_state_dict
nn.Module.load_state_dict(self, {k: state_dict[k] for k in list(self.state_dict())})
File "/home/nieqinqin/.local/lib/python3.6/site-packages/torch/nn/modules/module.py", line 487, in load_state_dict
.format(name, own_state[name].size(), param.size()))
RuntimeError: While copying the parameter named cls_score_net.weight, whose dimensions in the model are torch.Size([20, 2048]) and whose dimensions in the checkpoint are torch.Size([20, 4096]).
Command exited with non-zero status 1
7.33user 5.80system 0:13.57elapsed 96%CPU (0avgtext+0avgdata 2171288maxresident)k
0inputs+8outputs (0major+709527minor)pagefaults 0swaps

We checked the WSDDN pretrained model again and confirmed it is correct: cls_score_net.weight is indeed a [20, 4096] tensor. You should check

Collaborative-Learning-for-Weakly-Supervised-Object-Detection/lib/nets/vgg16.py

Line 26 in 4bd0df7

self._fc7_channels = 4096
and
Collaborative-Learning-for-Weakly-Supervised-Object-Detection/lib/nets/network.py

Line 398 in 4bd0df7

self.cls_score_net = nn.Linear(self._fc7_channels, self._num_classes) # between class
in the original code.
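
To see which parameters disagree before calling load_state_dict, the two state dicts can be compared by shape. The following is only a minimal sketch, not code from this repository: find_shape_mismatches is a hypothetical helper, and it assumes the checkpoint is a flat mapping from parameter names to tensors.

```python
def find_shape_mismatches(model_state, checkpoint_state):
    """Return (name, model_shape, checkpoint_shape) for shared keys whose shapes differ."""
    mismatches = []
    for name, param in model_state.items():
        if name not in checkpoint_state:
            continue  # keys missing from the checkpoint are a separate problem
        model_shape = tuple(param.shape)
        ckpt_shape = tuple(checkpoint_state[name].shape)
        if model_shape != ckpt_shape:
            mismatches.append((name, model_shape, ckpt_shape))
    return mismatches

# Hypothetical usage with PyTorch (`net` and the path are placeholders):
#   import torch
#   checkpoint = torch.load("path/to/pretrained.pth", map_location="cpu")
#   for name, m, c in find_shape_mismatches(net.state_dict(), checkpoint):
#       print(name, "model:", m, "checkpoint:", c)
```

For the error above, this would report cls_score_net.weight with a model shape of (20, 2048) and a checkpoint shape of (20, 4096).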

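Thanks for your help, and sorry for the mistaken issue. I found the problem: I was using ResNet, whose cls_score_net.weight is a [20, 2048] tensor, as set here:

Collaborative-Learning-for-Weakly-Supervised-Object-Detection/lib/nets/resnet_v1.py

Line 215 in 4bd0df7

self._fc7_channels = 2048

I'm trying to download the VGG pretrained model, but is there any way to use ResNet?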
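
One general PyTorch workaround for a shape mismatch like this is to load only the checkpoint entries that fit the model and leave the rest at their initial values. The sketch below is an assumption-laden illustration, not the authors' method: keep_compatible is a hypothetical helper, and it assumes the checkpoint is a flat mapping from parameter names to tensors. A VGG-based checkpoint probably shares few parameters with a ResNet model, so this mainly avoids the crash rather than transferring useful weights.

```python
def keep_compatible(model_state, checkpoint_state):
    """Split checkpoint entries into those that fit the model and those that do not."""
    compatible, skipped = {}, []
    for name, value in checkpoint_state.items():
        same_shape = (
            name in model_state
            and tuple(model_state[name].shape) == tuple(value.shape)
        )
        if same_shape:
            compatible[name] = value
        else:
            skipped.append(name)  # unknown key or mismatched shape
    return compatible, skipped

# Hypothetical usage with PyTorch (`net` and the path are placeholders):
#   import torch
#   checkpoint = torch.load("path/to/pretrained.pth", map_location="cpu")
#   compatible, skipped = keep_compatible(net.state_dict(), checkpoint)
#   state = net.state_dict()
#   state.update(compatible)   # parameters listed in `skipped` keep their initial values
#   net.load_state_dict(state)
```

Printing skipped after the call shows how much of the checkpoint was actually usable.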

Jngwl commented

Hello, shawnLiu.
I encountered the same error as you when I ran ./experiments/scripts/train.sh 0 pascal_voc res101 voc07_wsddn_pre:

While copying the parameter named cls_score_net.weight, whose dimensions in the model are torch.Size([20, 2048]) and whose dimensions in the checkpoint are torch.Size([20, 4096]), ...

After that, I set the following in

Collaborative-Learning-for-Weakly-Supervised-Object-Detection/lib/nets/resnet_v1.py

self._fc7_channels = 4096

but then I got RuntimeError: cuda runtime error (2) : out of memory.
Have you found any solution for using ResNet? Thank you.

@Jngwl I guess it's a GPU memory limit. ResNet may need more than 16 GB. Sorry, I didn't try ResNet after that.

Hello! I've run into this as well. Which file should I edit to change the batch size? I can't find it. Thank you!

Jngwl commented

You can adjust the batch size by modifying the file ./experiments/cfgs/res101.yml

Hello! Thank you for your answer! My GPU has 12 GB of memory. I changed batch_size in ./experiments/cfgs/res101.yml to 2, but it still runs out of memory. Which values did you change to get it running?

@Jngwl So, how did you solve this problem? I ran into it too, but in my case it may not be caused by GPU memory. I can't solve it on my own.