A few things I don't understand

Question

Closed this issue 4 years ago · 4 comments

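There are a few things I don't understand, and I would appreciate your advice.
1. The README reports 78.7 mAP, but the checkpoint behind the Baidu Cloud link is actually 77.8 mAP.
2. When I run detect.py with voc_77.8.pth, the weights do not match the model and it will not run. Can voc_77.8.pth only be used for evaluation, not for inference?
3. I downloaded VectXmy's VOC inference weights. They run, but no predictions appear on the images.
I'm a beginner and there is a lot I don't understand, so I hope you can help.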

Answer 1 · 2021-04-05T08:01:07.000Z

Same problem here: after training finishes, detect.py will not run, so I cannot test on images. Has the original poster solved this?

Answer 2 · 2021-04-05T09:56:49.000Z

@Kuuuo @Quanyin-li Could you share the error messages?

Answer 3 · 2021-04-05T10:12:49.000Z

Thank you, I have already solved the problem. Here are the two errors I ran into:

First error:
share@-System-Product-Name:~/RetinaNet-Pytorch-36.4AP-master$ python detect.py
Traceback (most recent call last):
  File "detect.py", line 87, in <module>
    model.load_state_dict(torch.load("./checkpoint/model_1.pth",map_location=torch.device('cuda')))
  File "/home/share/.local/lib/python3.7/site-packages/torch/nn/modules/module.py", line 839, in load_state_dict
    self.__class__.__name__, "\n\t".join(error_msgs)))
RuntimeError: Error(s) in loading state_dict for DataParallel:
    size mismatch for module.body.head.cls_out.weight: copying a param with shape torch.Size([720, 256, 3, 3]) from checkpoint, the shape in current model is torch.Size([45, 256, 3, 3]).
    size mismatch for module.body.head.cls_out.bias: copying a param with shape torch.Size([720]) from checkpoint, the shape in current model is torch.Size([45]).

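In my case the cause was that the number of classes in detect.py did not match the number of classes in config. Setting them to the same value fixed it.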
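The two shapes in the size-mismatch message can be turned back into class counts, which makes the inconsistency easy to see. The sketch below assumes that the first dimension of `cls_out.weight` equals anchors per location times the number of classes, with 9 anchors per location as in the usual RetinaNet setup. That layout is not confirmed for this repository, and `infer_num_classes` is a hypothetical helper, not part of the codebase.

```python
def infer_num_classes(cls_out_channels, anchors_per_location=9):
    """Recover the class count from the first dimension of cls_out.weight.

    Assumes out_channels = anchors_per_location * num_classes
    (an assumption about the head layout, not taken from the repo).
    """
    if cls_out_channels % anchors_per_location != 0:
        raise ValueError(
            f"{cls_out_channels} is not a multiple of {anchors_per_location}"
        )
    return cls_out_channels // anchors_per_location


# Shapes taken from the error message above.
checkpoint_classes = infer_num_classes(720)  # shape stored in the checkpoint
model_classes = infer_num_classes(45)        # shape built by detect.py

print(checkpoint_classes, model_classes)
```

Under that assumption the checkpoint was saved with 80 classes while detect.py built a model with 5, which is consistent with the class count being set differently in the two places.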

Second error:
share@-System-Product-Name:~/RetinaNet-Pytorch-36.4AP-master$ python detect.py
info====>success freeze bn
info=====> success freeze stage 1
===>success loading model
Traceback (most recent call last):
  File "detect.py", line 108, in <module>
    out=model(img1.unsqueeze_(dim=0))
  File "/home/share/.local/lib/python3.7/site-packages/torch/nn/modules/module.py", line 541, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/share/.local/lib/python3.7/site-packages/torch/nn/parallel/data_parallel.py", line 146, in forward
    "them on device: {}".format(self.src_device_obj, t.device))
RuntimeError: module must have its parameters and buffers on device cuda:0 (device_ids[0]) but found one of them on device: cpu
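The second error happened because no GPU was specified, so the model's parameters stayed on the CPU. I added some code to use the GPU and CUDA, and it worked.
The added code: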
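import torch

# Use the GPU when one is available; otherwise fall back to the CPU.
USE_CUDA = torch.cuda.is_available()
device = torch.device("cuda:0" if USE_CUDA else "cpu")
# device_ids=[0, 1] assumes two GPUs; omit device_ids to use every visible GPU.
model = torch.nn.DataParallel(model, device_ids=[0, 1])
# model = torch.nn.DataParallel(model)
model.to(device)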

Answer 4 · 2021-04-05T10:28:49.000Z

@Quanyin-li Understood. I will fix cls_num in config.