codeslake/RefVSR

Questions about L1-loss models.

sunlustar opened this issue · 3 comments

Thanks for your great work. From your results in Table 2, it seems that the model using the l1 loss (Ours-l1) outperforms the model trained with the proposed two-stage training strategy (Ours) by over 3 dB, and your training code appears to use a one-stage training process.

So,

  1. Why does the model “Ours-l1” perform better than the model “Ours”? It seems that you don't have the ground truth of real-world HR_UW.

  2. How does the one-stage training process work?

Hi, @sunlustar.

  1. Note that the compared models in Table 2 are the models trained with only the pre-training stage (Sec 4.1).
    As explained in the paper (the paragraph under Table 2), pixel-based losses are known to have an advantage over perceptual-based losses in terms of PSNR.
    In Table 2, the model Ours-l1 outperforms the model Ours in terms of PSNR, as the former is trained with a pixel-based loss (l1 loss only), while the latter is trained with the perceptual-based loss (the contextual loss terms in Eq. 10).

  2. Please refer to Sec. 4 of the main paper, especially Sec. 4.1 for the pre-training stage.
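As a minimal sketch of why a pixel-based loss favors PSNR: the l1 loss is the mean absolute error over pixels, so minimizing it directly reduces the per-pixel deviations that PSNR is computed from, whereas a perceptual-based loss optimizes feature-space similarity instead. The function below is illustrative, not taken from the RefVSR codebase.

```python
import numpy as np

def l1_loss(pred, target):
    """Pixel-based l1 loss: mean absolute error over all pixels."""
    return np.mean(np.abs(pred - target))

def psnr(pred, target, max_val=1.0):
    """PSNR is a monotone function of per-pixel MSE, so pixel-based
    losses (l1/l2) align closely with it; perceptual losses do not."""
    mse = np.mean((pred - target) ** 2)
    return 10.0 * np.log10(max_val ** 2 / mse)
```

Lowering the pixel-wise error directly raises PSNR, which is why Ours-l1 scores higher on that metric even though the perceptual-based model may look sharper.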

Thanks, and I have two more questions:

  1. According to the paper, in the Ours-l1 model, λ_rec = 0.01, λ_pre = 0.05, λ_l1 = 1, which means the l1 loss is still predominant. But the performance gap between Ours and Ours-l1 is over 3 dB, which is drastic.

  2. Is the model Ours-l1 in Table 3 trained with the proposed two-stage strategy, or with only the pre-training stage?

  1. According to the paper, in the Ours-l1 model, λ_rec = 0.01, λ_pre = 0.05, λ_l1 = 1, which means the l1 loss is still predominant. But the performance gap between Ours and Ours-l1 is over 3 dB, which is drastic.

The model Ours-l1 is trained with only the l1 loss. I guess you are talking about the model Ours, which uses λ_rec = 0.01, λ_pre = 0.05, λ_l1 = 1. As I mentioned before, the perceptual-based loss lowers PSNR, which is well established in previous studies.
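To illustrate the weighting point: even though λ_l1 = 1 dominates the other coefficients numerically, adding any weight on the perceptual terms shifts the optimum away from the purely MAE-optimal solution, which is enough to cost several dB of PSNR. A hypothetical sketch of the weighted sum (the real L_rec and L_pre in Eq. 10 are contextual losses; the names here are placeholders):

```python
# Coefficients quoted in this thread for the model Ours (Eq. 10).
LAMBDA_REC, LAMBDA_PRE, LAMBDA_L1 = 0.01, 0.05, 1.0

def total_loss(loss_rec, loss_pre, loss_l1):
    """Weighted combination of the loss terms. In Ours-l1 only the
    l1 term is used; in Ours all three terms contribute, so the
    gradient is no longer aligned with minimizing pixel error alone."""
    return LAMBDA_REC * loss_rec + LAMBDA_PRE * loss_pre + LAMBDA_L1 * loss_l1
```

Dropping the two perceptual terms (setting their coefficients to zero) recovers the Ours-l1 objective, whose minimizer is the PSNR-friendly one.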

  1. Is the model Ours-l1 in Table 3 trained with the proposed two-stage strategy, or with only the pre-training stage?

The models in Table 3 are trained with only the pre-training stage.