D-X-Y/landmark-detection

Why don't you normalize the network outputs to (0, 1)?

Closed this issue · 6 comments

I have recently been studying your code, and I find the network outputs quite confusing.
In face detection, for example, the network output (the box coordinates) is normalized.
Why, in the regression model and the SBR model, is the output not passed through a normalizing activation, given that the real landmark and heatmap values lie in (0, 1)?

D-X-Y commented

Could you please indicate which line of code that you are referring to?

My description may not have been precise enough.
For example, in SBR/lib/models/cpm_vgg16.py you use batch_cpms to calculate the loss. Why don't you normalize batch_cpms to (0, 1), since the ground-truth heatmaps have values in (0, 1)?
The regression model is the same: its output is the final predicted keypoint position. Why not normalize it? The output may fall outside (0, 1), which could make the network harder to train.
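To make the question concrete, here is a minimal NumPy sketch (the function names are illustrative, not from the repo) comparing the L2 loss computed directly on a raw, unbounded prediction against the same loss after a sigmoid squashes the prediction into (0, 1):

```python
import numpy as np

def gaussian_heatmap(h, w, cx, cy, sigma=1.5):
    """Ground-truth heatmap: a 2-D Gaussian peaked at (cx, cy); values lie in (0, 1]."""
    ys, xs = np.mgrid[0:h, 0:w]
    return np.exp(-((xs - cx) ** 2 + (ys - cy) ** 2) / (2.0 * sigma ** 2))

def mse(pred, target):
    return float(np.mean((pred - target) ** 2))

target = gaussian_heatmap(8, 8, cx=4, cy=4)

# A stand-in for a raw network output: unbounded values that roughly track the target.
rng = np.random.default_rng(0)
raw = 1.2 * target + 0.1 * rng.standard_normal(target.shape)

loss_raw = mse(raw, target)                          # L2 on the unnormalized prediction
loss_sig = mse(1.0 / (1.0 + np.exp(-raw)), target)   # L2 after a sigmoid squash
```

Either quantity is a valid training signal; the question is whether the squashing step helps or hurts optimization.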

In SRT/lib/models/ProCPM.py, line 137, I see that you use sigmoid(cpm) to normalize the heatmaps.
I'm confused about whether it should be used or not. Reading other face-landmark repositories, I found that their regression models don't use it.
Is it important, or simply not needed?

D-X-Y commented

The sigmoid is a hyperparameter (https://github.com/D-X-Y/landmark-detection/blob/master/SRT/lib/models/ProCPM.py#L137), and we did not use it in our experiments.

We do not normalize the output values, following the Convolutional Pose Machines paper (https://arxiv.org/pdf/1602.00134.pdf); L2 loss on the unnormalized prediction works well.
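For reference, the CPM-style training objective the linked paper describes is a sum of per-stage L2 losses, each computed directly on the raw stage output. A minimal sketch (shapes and names are illustrative):

```python
import numpy as np

def stagewise_l2(stage_heatmaps, target):
    """Sum of per-stage L2 losses against the same ground-truth heatmaps,
    computed directly on the raw (unnormalized) stage outputs."""
    return sum(float(np.sum((s - target) ** 2)) for s in stage_heatmaps)

# Toy shapes: 3 stages, 5 keypoints, 8x8 heatmaps.
rng = np.random.default_rng(1)
target = rng.random((5, 8, 8))
stages = [target + 0.05 * rng.standard_normal(target.shape) for _ in range(3)]
loss = stagewise_l2(stages, target)
```

Because each stage is supervised separately, intermediate stages get a direct gradient signal even without any output squashing.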

Thank you very much for your answer. Can you explain why you don't use the sigmoid on the prediction? Is it because it has no effect on the results?

D-X-Y commented

My intuition is that this is a regression problem, which does not need a sigmoid.
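In code terms, the choice discussed above reduces to an optional toggle on the output head; a hedged sketch (the function name is hypothetical, only the `use_sigmoid` flag mirrors the hyperparameter at ProCPM.py#L137):

```python
import numpy as np

def output_head(cpm, use_sigmoid=False):
    """Optionally squash raw heatmap scores into (0, 1).
    The default (False) matches what the experiments reportedly used."""
    if use_sigmoid:
        return 1.0 / (1.0 + np.exp(-cpm))  # sigmoid normalization, unused in practice
    return cpm                             # raw regression output, trained with L2
```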