A few things I don't understand
Closed this issue · 4 comments
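There are a few things I don't understand, and I would appreciate your help.
1. The README reports 78.7 mAP, but the Baidu Cloud link actually provides a 77.8 mAP checkpoint.
2. When I run detect.py with voc_77.8.pth, the weights do not match the model and it will not run. Can voc_77.8.pth only be used for evaluation and not for inference?
3. I downloaded VectXmy's VOC inference weights. They load and run, but the images come out with no predictions at all.
I'm a beginner and there is a lot I don't understand, so I hope you can help.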
Same problem here: after training finishes, detect.py will not run, so I cannot test on images. Has the poster above solved this?
@Kuuuo @Quanyin-li Could you share the error messages?
Thank you, I have already solved this problem. Here are the two errors I ran into:
First error:
share@-System-Product-Name:~/RetinaNet-Pytorch-36.4AP-master$ python detect.py
Traceback (most recent call last):
  File "detect.py", line 87, in <module>
    model.load_state_dict(torch.load("./checkpoint/model_1.pth",map_location=torch.device('cuda')))
  File "/home/share/.local/lib/python3.7/site-packages/torch/nn/modules/module.py", line 839, in load_state_dict
    self.__class__.__name__, "\n\t".join(error_msgs)))
RuntimeError: Error(s) in loading state_dict for DataParallel:
	size mismatch for module.body.head.cls_out.weight: copying a param with shape torch.Size([720, 256, 3, 3]) from checkpoint, the shape in current model is torch.Size([45, 256, 3, 3]).
	size mismatch for module.body.head.cls_out.bias: copying a param with shape torch.Size([720]) from checkpoint, the shape in current model is torch.Size([45]).
The cause was that the number of classes set in detect.py did not match the number of classes in config. Making the two the same fixed it.
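A quick way to confirm this kind of mismatch before editing anything is to compare tensor shapes and back out the class count from the classification head. The sketch below is only illustrative: `infer_class_num` and `find_shape_mismatches` are hypothetical helpers, not part of the repo, and the figure of 9 anchors per location is an assumption (the common RetinaNet default), not something confirmed in this thread. The shapes are copied from the traceback.

```python
def infer_class_num(cls_out_channels, anchors_per_location=9):
    """Back out the class count from the cls_out channel count.

    Assumes cls_out has anchors_per_location * class_num output channels;
    the default of 9 anchors is an assumption, not read from the repo.
    """
    if cls_out_channels % anchors_per_location != 0:
        raise ValueError("channel count is not a multiple of the anchor count")
    return cls_out_channels // anchors_per_location


def find_shape_mismatches(checkpoint_shapes, model_shapes):
    """Return {name: (checkpoint_shape, model_shape)} for differing shapes."""
    return {
        name: (shape, model_shapes[name])
        for name, shape in checkpoint_shapes.items()
        if name in model_shapes and model_shapes[name] != shape
    }


# Shapes taken from the error message in this thread.
checkpoint_shapes = {
    "module.body.head.cls_out.weight": (720, 256, 3, 3),
    "module.body.head.cls_out.bias": (720,),
}
model_shapes = {
    "module.body.head.cls_out.weight": (45, 256, 3, 3),
    "module.body.head.cls_out.bias": (45,),
}

mismatches = find_shape_mismatches(checkpoint_shapes, model_shapes)
for name, (ckpt_shape, cur_shape) in mismatches.items():
    print(name,
          "checkpoint classes:", infer_class_num(ckpt_shape[0]),
          "model classes:", infer_class_num(cur_shape[0]))
```

With a real checkpoint, the shape dictionaries could be built with something like `{k: tuple(v.shape) for k, v in torch.load(path, map_location="cpu").items()}` and the same comprehension over `model.state_dict()`. Under the 9-anchor assumption, the checkpoint here looks like an 80-class model being loaded into a 5-class one.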
Second error:
share@-System-Product-Name:~/RetinaNet-Pytorch-36.4AP-master$ python detect.py
info====>success freeze bn
info=====> success freeze stage 1
===>success loading model
Traceback (most recent call last):
  File "detect.py", line 108, in <module>
    out=model(img1.unsqueeze_(dim=0))
  File "/home/share/.local/lib/python3.7/site-packages/torch/nn/modules/module.py", line 541, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/share/.local/lib/python3.7/site-packages/torch/nn/parallel/data_parallel.py", line 146, in forward
    "them on device: {}".format(self.src_device_obj, t.device))
RuntimeError: module must have its parameters and buffers on device cuda:0 (device_ids[0]) but found one of them on device: cpu
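The second error occurred because no GPU was specified. It went away after I added code that selects the GPU and moves the model onto CUDA.
The code I added: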
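# Use the first GPU when CUDA is available, otherwise fall back to the CPU.
USE_CUDA = torch.cuda.is_available()
device = torch.device("cuda:0" if USE_CUDA else "cpu")
# device_ids=[0, 1] assumes two visible GPUs; adjust it to match your machine.
model = torch.nn.DataParallel(model, device_ids=[0, 1])
# model = torch.nn.DataParallel(model)
# Move the parameters and buffers onto the selected device.
model.to(device)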
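Since `device_ids=[0, 1]` only works on a machine with at least two visible GPUs, one option is to derive the list from the number of GPUs actually present. The following is a minimal sketch under that assumption; `select_device_ids` and `device_string` are hypothetical helper names, and the commented wiring at the end is a guess at how it might slot into detect.py, not the repo's actual code.

```python
def select_device_ids(num_visible_gpus, max_gpus=2):
    """Return the GPU indices to pass to DataParallel, or [] for CPU-only."""
    if num_visible_gpus < 0:
        raise ValueError("num_visible_gpus must be non-negative")
    return list(range(min(num_visible_gpus, max_gpus)))


def device_string(device_ids):
    """Return the device holding the parameters: first GPU, else CPU."""
    return "cuda:{}".format(device_ids[0]) if device_ids else "cpu"


# Hypothetical wiring inside detect.py (needs torch, so it is left as comments):
#   ids = select_device_ids(torch.cuda.device_count())
#   device = torch.device(device_string(ids))
#   model = torch.nn.DataParallel(model, device_ids=ids or None)
#   model.to(device)
#   out = model(img1.unsqueeze_(dim=0).to(device))

print(select_device_ids(1), device_string(select_device_ids(1)))
print(select_device_ids(0), device_string(select_device_ids(0)))
```

Moving the input tensor with `.to(device)` as well keeps the image and the model on the same device, which avoids a related device-mismatch error at the forward call.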
@Quanyin-li Got it. I'll fix cls_num in the config.