IVRL/w2s

network performance


Hi,

As part of a university assignment, I wish to improve the 'ours' model, and I have a couple of questions:

  1. I used the 'epoch_49' weights to test the 'ours' model and got worse mean PSNR and SSIM results than those mentioned in your paper (PSNR = 24.46, SSIM = 0.711 instead of PSNR = 25.17, SSIM = 0.713). Am I doing something wrong?
    I have to say that the code results are very close to the paper results; however, they are worse than the RDN and ESRGAN results reported in the paper.

  2. I would like to improve the RRDB baseline model and I read that you have used RDN and ESRGAN for comparison. Have you tried using RDN with texture loss?

  3. Have you tried weighting the perceptual loss? By weighting I mean summing up all of the perceptual loss parts with different factors (see the sketch after this list).

  4. I have noticed a large variance in the PSNR results (very low PSNR for some images and very high PSNR for others), which suggests the model should adapt better to different images. Have you thought about using a hyper-network (or meta-learning) architecture in order to make the model more dynamic and responsive to the input image?
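To make question 3 concrete, here is a minimal PyTorch sketch of what I mean by weighting: a sum of VGG-19 feature distances taken at several layers, each scaled by its own factor. The layer indices and factors below are made up for illustration, not taken from your code:

```python
import torch.nn as nn
from torchvision.models import vgg19

class WeightedPerceptualLoss(nn.Module):
    """Sum of VGG-19 feature distances at several layers, each with
    its own factor. Layer indices and weights are illustrative only."""
    def __init__(self, layer_weights=None):
        super().__init__()
        # indices into vgg19().features; values are hypothetical factors
        self.layer_weights = layer_weights or {3: 0.25, 8: 0.5, 17: 1.0}
        # newer torchvision versions use weights=... instead of pretrained=
        self.vgg = vgg19(pretrained=True).features.eval()
        for p in self.vgg.parameters():
            p.requires_grad = False
        self.criterion = nn.L1Loss()

    def forward(self, sr, hr):
        # sr, hr: (B, 3, H, W); grayscale W2S images would need to be
        # repeated to 3 channels before calling this
        loss = 0.0
        x, y = sr, hr
        last = max(self.layer_weights)
        for idx, layer in enumerate(self.vgg):
            x, y = layer(x), layer(y)
            if idx in self.layer_weights:
                loss = loss + self.layer_weights[idx] * self.criterion(x, y)
            if idx == last:
                break
        return loss
```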

Of course I'll be happy to contribute my code if I manage to get some improvements :)

Thanks so much!
Ofir

In addition, if you have any other suggestions on how to improve the model, I'll be happy to hear them :)

Hi,

Thank you for your interest in our work. Please find below our answers:

  1. Indeed, we had accidentally uploaded a model from an old experiment; that network was trained using only MSE loss and perceptual loss! The new model is now uploaded; please check this commit and let us know if you still have any issues with the PSNR/SSIM performance. We will add a note to the readme.
  2. We did not try training RDN with texture loss (a generic sketch of such a loss is given after this list).
  3. We only did some factor tuning. Note that increasing the factors for perceptual/texture loss might result in worse PSNR/SSIM performance but also in images with better visual quality.
  4. One of our observations is that images captured at different wavelengths have different levels of reconstruction difficulty. This is probably due to limitations of the capturing device (some wavelengths receive weaker real signals, so the noise level of the image is higher). One potential way to improve performance is therefore to model the different wavelengths in the system; we did not do that, as it was not the focus of our contribution. In our experiments, we targeted a general solution that is not specific to W2S. We did not try the architectures you mentioned, but they could be an interesting direction for future research.
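To make the texture loss in points 2 and 3 concrete: a typical texture loss matches Gram matrices of pretrained-network features, in the style of Gatys et al. The sketch below is a simplified generic version, not our exact implementation:

```python
import torch

def gram_matrix(feat):
    # feat: (B, C, H, W) feature map from a pretrained network
    b, c, h, w = feat.shape
    f = feat.view(b, c, h * w)
    # (B, C, C) Gram matrix, normalized by the number of elements
    return f @ f.transpose(1, 2) / (c * h * w)

def texture_loss(sr_feat, hr_feat):
    # MSE between Gram matrices of SR and ground-truth features
    return torch.mean((gram_matrix(sr_feat) - gram_matrix(hr_feat)) ** 2)

# The total objective is then a weighted sum, where the weights are the
# "factors" discussed in answer 3 (values below are purely illustrative):
# total = 1.0 * pixel_loss + 0.05 * perceptual_loss + 0.01 * texture_loss
```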

Regarding 2 & 3: there is a follow-up project on our W2S paper (report, code) that did, among other things, some parameter tuning on ESRGAN. It might be useful for your project.

We lastly want to note that a key idea behind W2S was to address a gap in the literature, namely to show the importance of joint denoising and SR (JDSR). Applying SR networks to noisy data, or even the sequential application of a denoiser followed by SR, can be very detrimental to the results; the last 3 rows of Table 4 support this claim. The W2S dataset is meant to serve as a benchmark for evaluating future JDSR methods.

We hope this clarifies things, feel free to reach out if you have other questions.

Thanks so much for your detailed answers! I'm still doing some research :)

Regarding the first question: I used the new epoch_49.pth and it does give the expected results at test time :)
However, I'm trying to replicate your results by training the network myself, so far unsuccessfully...
I guess it has something to do with the training batch size I chose (batch_size = 64, ngpus = 8), which is different from the default (batch_size = 16, ngpus = 3). But maybe it is something else, because I get much worse results.
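One thing I still need to check is the learning rate: a common heuristic when increasing the batch size is to scale the learning rate linearly (Goyal et al., "Accurate, Large Minibatch SGD"). The base_lr below is an assumption, not the repository's actual default:

```python
# Linear learning-rate scaling heuristic when the batch size changes.
# base_lr is assumed here; the actual default is in the repo's options.
base_batch_size, base_lr = 16, 1e-4
new_batch_size = 64
new_lr = base_lr * new_batch_size / base_batch_size  # -> 4e-4
```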

What are the training parameters that you used to generate epoch_49.pth?

Thanks again,
Ofir

Great that the test results now match!
Regarding the training, our parameters are the default ones in our code.
PS: we are updating all pre-trained models to use raw data, without the normalization we had previously applied. The update will be pushed ASAP in the upcoming weeks.

Thanks for the update!

After training with the default parameters on the avg1 data (49 epochs) and testing, I get:
mean PSNR = 24.819
mean SSIM = 0.705

However, when just testing your epoch_49.pth on avg1, I get the same results as in the paper:
mean PSNR = 25.17
mean SSIM = 0.713
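For reference, here is a minimal scikit-image sketch of how such means can be computed (assuming paired float images in [0, 1], so data_range=1.0):

```python
import numpy as np
from skimage.metrics import peak_signal_noise_ratio, structural_similarity

def mean_metrics(sr_images, hr_images):
    # sr_images, hr_images: lists of float arrays in [0, 1]
    psnrs = [peak_signal_noise_ratio(hr, sr, data_range=1.0)
             for sr, hr in zip(sr_images, hr_images)]
    ssims = [structural_similarity(hr, sr, data_range=1.0)
             for sr, hr in zip(sr_images, hr_images)]
    return np.mean(psnrs), np.mean(ssims)
```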

Any ideas?