GengDavid/pytorch-cpn

About line 119 of the train.py file

gag1223 opened this issue · 7 comments

for global_output, label in zip(global_outputs, targets):
    num_points = global_output.size()[1]
    global_label = label * (valid > 1.1).type(torch.FloatTensor).view(-1, num_points, 1, 1)
    global_loss = criterion1(global_output,
                             torch.autograd.Variable(global_label.cuda(async=True))) / 2.0
    loss += global_loss
    global_loss_record += global_loss.data.item()
Is there a mistake in the code above? Shouldn't global_outputs be reversed?

Thanks for your attention! Yep, the global_outputs should be reversed to match the label maps. Sorry for the mistake.
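For anyone else hitting this, the fix amounts to reversing one of the two lists before zipping. A minimal, torch-free sketch of the idea (the names and resolution orderings below are assumptions for illustration; the real lists come from the network and the data loader):

```python
# Stand-ins for the real tensors; the orderings are assumed for illustration.
global_outputs = ["out_lowres", "out_midres", "out_highres"]        # network order (assumed)
targets        = ["label_highres", "label_midres", "label_lowres"]  # loader order (assumed)

# The original pairing mismatches resolutions:
mismatched = list(zip(global_outputs, targets))

# Reversing global_outputs aligns each output with its label map:
aligned = list(zip(reversed(global_outputs), targets))
print(aligned[0])  # ('out_highres', 'label_highres')
```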

It seems that we need to fine-tune the pre-trained model again.

@mingloo @GengDavid Why should global_outputs be reversed? I think the low-resolution feature map should have its loss computed against the ground-truth heat map with the large sigma (e.g. label15 with the lowest-resolution feature map).

Hi @moontsar @GengDavid @gag1223

I'll check this issue and report back here.

The input_size is 256x192 and the backbone is resnet50. I reversed global_outputs so that they match the targets; all other settings are the same as in the original code, and I got the following results. I need three days to train the models.
Epoch LR Train Loss
1.000000 0.000500 280.276247
2.000000 0.000500 211.198144
3.000000 0.000500 199.642633
4.000000 0.000500 193.063355
5.000000 0.000500 188.156280
6.000000 0.000500 184.408313
7.000000 0.000250 175.035787
8.000000 0.000250 171.783389
9.000000 0.000250 169.809099
10.000000 0.000250 168.132656
11.000000 0.000250 166.636110
12.000000 0.000250 165.332796
13.000000 0.000125 160.551377
14.000000 0.000125 158.958218
15.000000 0.000125 157.665254
16.000000 0.000125 156.639992
17.000000 0.000125 155.647048
18.000000 0.000125 154.823640
19.000000 0.000063 152.151177
20.000000 0.000063 151.089816
21.000000 0.000063 150.475921
22.000000 0.000063 149.871324

Epoch 22
Average Precision (AP) @[ IoU=0.50:0.95 | area= all | maxDets= 20 ] = 0.713
Average Precision (AP) @[ IoU=0.50 | area= all | maxDets= 20 ] = 0.915
Average Precision (AP) @[ IoU=0.75 | area= all | maxDets= 20 ] = 0.792
Average Precision (AP) @[ IoU=0.50:0.95 | area=medium | maxDets= 20 ] = 0.685
Average Precision (AP) @[ IoU=0.50:0.95 | area= large | maxDets= 20 ] = 0.758
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets= 20 ] = 0.745
Average Recall (AR) @[ IoU=0.50 | area= all | maxDets= 20 ] = 0.924
Average Recall (AR) @[ IoU=0.75 | area= all | maxDets= 20 ] = 0.811
Average Recall (AR) @[ IoU=0.50:0.95 | area=medium | maxDets= 20 ] = 0.713
Average Recall (AR) @[ IoU=0.50:0.95 | area= large | maxDets= 20 ] = 0.794

The input_size is 256x192 and the backbone is resnet50. I reversed global_outputs so that they match the targets; all other settings are the same as in the original code. I got the following results with ground-truth labels on the COCO2017 val set.

Epoch LR Train Loss
1.000000 0.000500 280.276247
2.000000 0.000500 211.198144
3.000000 0.000500 199.642633
4.000000 0.000500 193.063355
5.000000 0.000500 188.156280
6.000000 0.000500 184.408313
7.000000 0.000250 175.035787
8.000000 0.000250 171.783389
9.000000 0.000250 169.809099
10.000000 0.000250 168.132656
11.000000 0.000250 166.636110
12.000000 0.000250 165.332796
13.000000 0.000125 160.551377
14.000000 0.000125 158.958218
15.000000 0.000125 157.665254
16.000000 0.000125 156.639992
17.000000 0.000125 155.647048
18.000000 0.000125 154.823640
19.000000 0.000063 152.151177
20.000000 0.000063 151.089816
21.000000 0.000063 150.475921
22.000000 0.000063 149.871324
23.000000 0.000063 149.287184
24.000000 0.000063 148.672204
25.000000 0.000031 147.222937
26.000000 0.000031 146.709794
27.000000 0.000031 146.274089
28.000000 0.000031 145.856368
29.000000 0.000031 145.591264 (model_best)
30.000000 0.000031 145.212220
31.000000 0.000016 144.505038
32.000000 0.000016 144.212379

Epoch 29
Average Precision (AP) @[ IoU=0.50:0.95 | area= all | maxDets= 20 ] = 0.714 (0.7143865620050341)
Average Precision (AP) @[ IoU=0.50 | area= all | maxDets= 20 ] = 0.914
Average Precision (AP) @[ IoU=0.75 | area= all | maxDets= 20 ] = 0.791
Average Precision (AP) @[ IoU=0.50:0.95 | area=medium | maxDets= 20 ] = 0.686
Average Precision (AP) @[ IoU=0.50:0.95 | area= large | maxDets= 20 ] = 0.756
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets= 20 ] = 0.746
Average Recall (AR) @[ IoU=0.50 | area= all | maxDets= 20 ] = 0.925
Average Recall (AR) @[ IoU=0.75 | area= all | maxDets= 20 ] = 0.811
Average Recall (AR) @[ IoU=0.50:0.95 | area=medium | maxDets= 20 ] = 0.714
Average Recall (AR) @[ IoU=0.50:0.95 | area= large | maxDets= 20 ] = 0.794

I have uploaded the trained models to https://pan.baidu.com/s/1w4prqCMV2AjORks2AvCR3w

for global_output, label in zip(global_outputs, targets):
    num_points = global_output.size()[1]
    global_label = label * (valid > 1.1).type(torch.FloatTensor).view(-1, num_points, 1, 1)
    global_loss = criterion1(global_output,
                             torch.autograd.Variable(global_label.cuda(async=True))) / 2.0
    loss += global_loss
    global_loss_record += global_loss.data.item()
Is there a mistake in the code above? Shouldn't global_outputs be reversed?

How exactly should they be reversed?
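One way (a sketch, not an official patch) is to change the loop header to iterate over `reversed(global_outputs)`, leaving the body unchanged. A runnable toy with scalar stand-ins for the tensors (`criterion1` here is a hypothetical MSE stand-in; the real code uses a torch criterion on CUDA tensors, and on Python 3.7+ `async=True` must also become `non_blocking=True` since `async` is now a keyword):

```python
# Scalar stand-ins for the real output/label tensors (orderings assumed).
global_outputs = [1.0, 2.0, 3.0]  # low -> high resolution (assumed)
targets        = [3.1, 2.1, 1.1]  # high -> low resolution (assumed)

def criterion1(pred, label):
    # Stand-in for the real torch criterion (an MSE-style loss).
    return (pred - label) ** 2

loss = 0.0
# The suggested fix: reverse global_outputs before zipping with targets.
for global_output, label in zip(reversed(global_outputs), targets):
    loss += criterion1(global_output, label) / 2.0

print(round(loss, 3))  # 0.015 -- each pair now differs by only 0.1
```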