GengDavid/pytorch-cpn

About line 119 of the train.py file

gag1223 opened this issue · 7 comments

for global_output, label in zip(global_outputs, targets):
    num_points = global_output.size()[1]
    global_label = label * (valid > 1.1).type(torch.FloatTensor).view(-1, num_points, 1, 1)
    global_loss = criterion1(global_output,
                             torch.autograd.Variable(global_label.cuda(async=True))) / 2.0
    loss += global_loss
    global_loss_record += global_loss.data.item()
Is there a mistake in the code above? Shouldn't global_outputs be reversed?

Thanks for your attention! Yep, the global_outputs should be reversed to match the label maps. Sorry for the mistake.
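For anyone else hitting this, the fix amounts to reversing one of the two lists before zipping. A minimal, torch-free sketch of the idea (the names and resolution orderings below are assumptions for illustration; the real lists come from the network and the data loader):

```python
# Stand-ins for the real tensors; the orderings are assumed for illustration.
global_outputs = ["out_lowres", "out_midres", "out_highres"]        # network order (assumed)
targets        = ["label_highres", "label_midres", "label_lowres"]  # loader order (assumed)

# The original pairing mismatches resolutions:
mismatched = list(zip(global_outputs, targets))

# Reversing global_outputs aligns each output with its label map:
aligned = list(zip(reversed(global_outputs), targets))
print(aligned[0])  # ('out_highres', 'label_highres')
```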

It seems that we need to fine-tune the pre-trained model again.

@mingloo @GengDavid Why should global_outputs be reversed? I think the low-resolution feature map should have its loss computed against the ground-truth heat map with the large sigma (e.g. label15 with the lowest-resolution feature map).

Hi @moontsar @GengDavid @gag1223

I'll check this issue and report back here.

The input_size is 256x192 and the backbone is resnet50. I reversed global_outputs so that they match the targets; all other settings are the same as in the original code, and I got the following results. I need three days to train the models.
Epoch LR Train Loss
1.000000 0.000500 280.276247
2.000000 0.000500 211.198144
3.000000 0.000500 199.642633
4.000000 0.000500 193.063355
5.000000 0.000500 188.156280
6.000000 0.000500 184.408313
7.000000 0.000250 175.035787
8.000000 0.000250 171.783389
9.000000 0.000250 169.809099
10.000000 0.000250 168.132656
11.000000 0.000250 166.636110
12.000000 0.000250 165.332796
13.000000 0.000125 160.551377
14.000000 0.000125 158.958218
15.000000 0.000125 157.665254
16.000000 0.000125 156.639992
17.000000 0.000125 155.647048
18.000000 0.000125 154.823640
19.000000 0.000063 152.151177
20.000000 0.000063 151.089816
21.000000 0.000063 150.475921
22.000000 0.000063 149.871324

Epoch 22
Average Precision (AP) @[ IoU=0.50:0.95 | area= all | maxDets= 20 ] = 0.713
Average Precision (AP) @[ IoU=0.50 | area= all | maxDets= 20 ] = 0.915
Average Precision (AP) @[ IoU=0.75 | area= all | maxDets= 20 ] = 0.792
Average Precision (AP) @[ IoU=0.50:0.95 | area=medium | maxDets= 20 ] = 0.685
Average Precision (AP) @[ IoU=0.50:0.95 | area= large | maxDets= 20 ] = 0.758
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets= 20 ] = 0.745
Average Recall (AR) @[ IoU=0.50 | area= all | maxDets= 20 ] = 0.924
Average Recall (AR) @[ IoU=0.75 | area= all | maxDets= 20 ] = 0.811
Average Recall (AR) @[ IoU=0.50:0.95 | area=medium | maxDets= 20 ] = 0.713
Average Recall (AR) @[ IoU=0.50:0.95 | area= large | maxDets= 20 ] = 0.794

The input_size is 256x192 and the backbone is resnet50. I reversed global_outputs so that they match the targets; all other settings are the same as in the original code. I got the following results with ground-truth labels on the COCO2017 val set.

Epoch LR Train Loss
1.000000 0.000500 280.276247
2.000000 0.000500 211.198144
3.000000 0.000500 199.642633
4.000000 0.000500 193.063355
5.000000 0.000500 188.156280
6.000000 0.000500 184.408313
7.000000 0.000250 175.035787
8.000000 0.000250 171.783389
9.000000 0.000250 169.809099
10.000000 0.000250 168.132656
11.000000 0.000250 166.636110
12.000000 0.000250 165.332796
13.000000 0.000125 160.551377
14.000000 0.000125 158.958218
15.000000 0.000125 157.665254
16.000000 0.000125 156.639992
17.000000 0.000125 155.647048
18.000000 0.000125 154.823640
19.000000 0.000063 152.151177
20.000000 0.000063 151.089816
21.000000 0.000063 150.475921
22.000000 0.000063 149.871324
23.000000 0.000063 149.287184
24.000000 0.000063 148.672204
25.000000 0.000031 147.222937
26.000000 0.000031 146.709794
27.000000 0.000031 146.274089
28.000000 0.000031 145.856368
29.000000 0.000031 145.591264 (model_best)
30.000000 0.000031 145.212220
31.000000 0.000016 144.505038
32.000000 0.000016 144.212379

Epoch 29
Average Precision (AP) @[ IoU=0.50:0.95 | area= all | maxDets= 20 ] = 0.714 (0.7143865620050341)
Average Precision (AP) @[ IoU=0.50 | area= all | maxDets= 20 ] = 0.914
Average Precision (AP) @[ IoU=0.75 | area= all | maxDets= 20 ] = 0.791
Average Precision (AP) @[ IoU=0.50:0.95 | area=medium | maxDets= 20 ] = 0.686
Average Precision (AP) @[ IoU=0.50:0.95 | area= large | maxDets= 20 ] = 0.756
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets= 20 ] = 0.746
Average Recall (AR) @[ IoU=0.50 | area= all | maxDets= 20 ] = 0.925
Average Recall (AR) @[ IoU=0.75 | area= all | maxDets= 20 ] = 0.811
Average Recall (AR) @[ IoU=0.50:0.95 | area=medium | maxDets= 20 ] = 0.714
Average Recall (AR) @[ IoU=0.50:0.95 | area= large | maxDets= 20 ] = 0.794

I have uploaded the trained models to https://pan.baidu.com/s/1w4prqCMV2AjORks2AvCR3w

for global_output, label in zip(global_outputs, targets):
    num_points = global_output.size()[1]
    global_label = label * (valid > 1.1).type(torch.FloatTensor).view(-1, num_points, 1, 1)
    global_loss = criterion1(global_output,
                             torch.autograd.Variable(global_label.cuda(async=True))) / 2.0
    loss += global_loss
    global_loss_record += global_loss.data.item()
Is there a mistake in the code above? Shouldn't global_outputs be reversed?

How exactly should they be reversed?
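One way (a sketch, not an official patch) is to change the loop header to iterate over `reversed(global_outputs)`, leaving the body unchanged. A runnable toy with scalar stand-ins for the tensors (`criterion1` here is a hypothetical MSE stand-in; the real code uses a torch criterion on CUDA tensors, and on Python 3.7+ `async=True` must also become `non_blocking=True` since `async` is now a keyword):

```python
# Scalar stand-ins for the real output/label tensors (orderings assumed).
global_outputs = [1.0, 2.0, 3.0]  # low -> high resolution (assumed)
targets        = [3.1, 2.1, 1.1]  # high -> low resolution (assumed)

def criterion1(pred, label):
    # Stand-in for the real torch criterion (an MSE-style loss).
    return (pred - label) ** 2

loss = 0.0
# The suggested fix: reverse global_outputs before zipping with targets.
for global_output, label in zip(reversed(global_outputs), targets):
    loss += criterion1(global_output, label) / 2.0

print(round(loss, 3))  # 0.015 -- each pair now differs by only 0.1
```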