NVlabs/PWC-Net

Single loss is better?

JohnnieXDU opened this issue · 1 comment

Hi, I am trying to train PWC-Net from scratch on FlyingChairs. However, during training, I found that:

  1. The model converges faster if only the loss from the bottom of the pyramid is used. That is, training with a single photometric loss computed from level 2 seems to perform better than training with the losses from levels 2/3/4/5/6 weighted as described in the original paper.

  2. When I train the model with the losses from levels 2, 3, 4, 5, and 6 and their respective weights (as described in the paper), it converges extremely slowly. What I have done is sum these weighted losses and then backpropagate (see the sketch after the screenshot). Is that right? Why does the training loss keep going up and down? The curve is as follows.

[Screenshot: training-loss curve, 2020-01-05]
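For reference, here is roughly what I am doing (a minimal PyTorch-style sketch; `photometric_loss` stands in for my own warp-and-compare helper, and the weights are the ones from the paper):

```python
# Per-level weights from the PWC-Net paper (level 6 is the coarsest).
LEVEL_WEIGHTS = {6: 0.32, 5: 0.08, 4: 0.02, 3: 0.01, 2: 0.005}

def total_loss(flows, img1, img2, photometric_loss):
    """Sum the weighted per-level photometric losses.

    flows: dict mapping pyramid level -> flow predicted at that level.
    photometric_loss: warps img2 towards img1 with the given flow and
    compares the result against img1 (my own helper, not shown).
    """
    total = 0.0
    for level, flow in flows.items():
        total = total + LEVEL_WEIGHTS[level] * photometric_loss(flow, img1, img2)
    return total

# loss = total_loss(flows, img1, img2, photometric_loss)
# loss.backward()  # single backward pass over the summed loss
```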

It seems that you are doing unsupervised training, i.e., using a photometric loss. The loss described in the paper is for the supervised setting.
To compute the photometric loss, you need to scale the output flow at each level to the proper scale. Please check the paper for details regarding scaling the flow before warping.
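Concretely, the flow predicted at level l lives at 1/2^l of the input resolution, and its values must be in the same pixel units as whatever you warp. Here is a minimal PyTorch sketch under PWC-Net's convention (ground-truth flow divided by 20 and not further rescaled per level; the helper names are my own):

```python
import torch
import torch.nn.functional as F

def warp(img, flow):
    """Warp img with a pixel-displacement flow via bilinear sampling.

    Assumes flow channel 0 is horizontal (x) displacement and channel 1
    is vertical (y) displacement, in pixels at img's resolution.
    """
    _, _, h, w = img.shape
    ys, xs = torch.meshgrid(
        torch.arange(h, device=img.device),
        torch.arange(w, device=img.device),
        indexing="ij",
    )
    base = torch.stack((xs, ys), dim=0).float()    # (2, H, W) pixel grid
    coords = base.unsqueeze(0) + flow              # (B, 2, H, W) sample points
    # Normalize coordinates to [-1, 1] as expected by grid_sample.
    gx = 2.0 * coords[:, 0] / (w - 1) - 1.0
    gy = 2.0 * coords[:, 1] / (h - 1) - 1.0
    grid = torch.stack((gx, gy), dim=3)            # (B, H, W, 2)
    return F.grid_sample(img, grid, align_corners=True)

def photometric_loss_at_level(flow_l, img1, img2, level, flow_scale=20.0):
    """L1 photometric loss for the raw prediction at pyramid `level`.

    The images are downsampled to the level's resolution, and the raw
    flow is converted to pixels at that resolution. Under PWC-Net's
    convention (ground truth divided by flow_scale and not rescaled per
    level), the conversion factor is flow_scale / 2**level.
    """
    size = flow_l.shape[-2:]
    im1 = F.interpolate(img1, size=size, mode="bilinear", align_corners=True)
    im2 = F.interpolate(img2, size=size, mode="bilinear", align_corners=True)
    flow_px = flow_l * (flow_scale / 2 ** level)
    return (im1 - warp(im2, flow_px)).abs().mean()
```

Equivalently, you can upsample the flow to full resolution and multiply its values by 20 there; either way, the displacement values have to match the resolution of the images being warped.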