dslisleedh/PLKSR

Making PLKSR stable for real-world SISR

neosr-project opened this issue · 3 comments

Hi. First of all, thanks to everyone who participated in this research. The analysis in the paper is very thorough.

As reported by others in issue 3, PLKSR seems to be unstable for real-world SISR. GAN training is notoriously unstable, and it causes issues even at lower learning rates.
So in an attempt to make it more stable, I have released a simple modification to PLKSR, named RealPLKSR:

  • Normalization was missing, as pointed out by @dslisleedh. From my understanding, layer norm was avoided because of its impact on inference latency. I tested multiple methods, including Instance norm, Layer norm, Batch norm, Group norm and RMSNorm. Because we usually train at small batch sizes (<16), Group Normalization performed best of those tested in my experiments. The impact on inference latency was minimal (~5% max). I also tested the number of groups: increasing it leads to better regularization but slows convergence. A value of 4 offered a good balance in all tests.
  • Replaced GELU with Mish in the channel mixer. Mish showed better, more stable convergence than GELU.
  • Added nn.Dropout2d to the last conv, as proposed in "Reflash Dropout in Image Super-Resolution". Although not ideal, dropout is a simple method to increase generalization on real-world SISR.
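To make the three changes above concrete, here is a minimal PyTorch sketch. It is not the actual RealPLKSR code: the class names, layer widths, the placement of the norm at the end of each block, and the dropout probability are all assumptions for illustration, and the partial large kernel convolution of PLKSR is left out entirely.

```python
import torch
import torch.nn as nn


class ChannelMixer(nn.Module):
    """Simplified conv channel mixer using Mish instead of GELU (hypothetical layout)."""

    def __init__(self, dim: int):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(dim, dim * 2, 3, padding=1),
            nn.Mish(),  # swapped in for nn.GELU()
            nn.Conv2d(dim * 2, dim, 3, padding=1),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.body(x)


class Block(nn.Module):
    """Residual block with GroupNorm (4 groups). The large kernel conv is omitted here."""

    def __init__(self, dim: int, norm_groups: int = 4):
        super().__init__()
        self.mixer = ChannelMixer(dim)
        # GroupNorm statistics are computed per sample, so they do not depend on batch size.
        self.norm = nn.GroupNorm(norm_groups, dim)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return x + self.norm(self.mixer(x))


class RealPLKSRSketch(nn.Module):
    """Toy network: head conv, blocks, Dropout2d, last conv, pixel shuffle."""

    def __init__(self, dim: int = 64, n_blocks: int = 4, scale: int = 4, dropout: float = 0.1):
        super().__init__()
        self.feats = nn.Sequential(
            nn.Conv2d(3, dim, 3, padding=1),
            *[Block(dim) for _ in range(n_blocks)],
            nn.Dropout2d(dropout),  # placeholder probability; assumed to sit right before the last conv
            nn.Conv2d(dim, 3 * scale**2, 3, padding=1),
        )
        self.to_img = nn.PixelShuffle(scale)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.to_img(self.feats(x))


model = RealPLKSRSketch(dim=16, n_blocks=2, scale=4).eval()  # eval() disables dropout
with torch.no_grad():
    out = model(torch.randn(1, 3, 8, 8))
print(out.shape)  # torch.Size([1, 3, 32, 32])
```

Note that `dim` must be divisible by the number of groups for `nn.GroupNorm`, and that dropout is only active in training mode.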

Pretrained models:

scale    download
4x GAN   GDrive
4x       GDrive
2x       GDrive

Training can be done in neosr using either of the following configurations: paired dataset or the realesrgan degradation pipeline.
Credits are acknowledged inside the code, which is released under the same license as PLKSR (MIT). I hope this makes PLKSR more widely used under real-world degradations. It's a really impressive network. Thanks again for your research 👍

Thank you for your interest in this work. We are impressed with RealPLKSR's ability to stably learn real-world SISR tasks while maintaining low latency. We will add this issue and implementation to the README so that many people can use your work!

Just to add to this thread, I trained and released a RealPLKSR model on a dataset I degraded with a bit of lens blur, a bit of realistic noise, and a bit of jpg and webp (re)compression for photography.

The model and all related info can be found in its GitHub release here
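For anyone wanting to try something similar, the kind of degradation described above (blur, noise, then JPEG and WebP recompression) could be roughly sketched as below. This is not the script used for the released model: the `degrade` function, its parameter values, and the use of Gaussian blur as a stand-in for lens blur and Gaussian noise as a stand-in for realistic noise are all assumptions, and no downscaling step is included. It needs Pillow (built with WebP support) and NumPy.

```python
import io

import numpy as np
from PIL import Image, ImageFilter


def degrade(img, blur_radius=1.0, noise_sigma=3.0, jpeg_quality=75, webp_quality=75, seed=0):
    """Apply mild blur, additive noise, and JPEG + WebP recompression (illustrative values)."""
    rng = np.random.default_rng(seed)

    # 1. Blur: Gaussian blur used here as a simple stand-in for lens blur.
    img = img.convert("RGB").filter(ImageFilter.GaussianBlur(blur_radius))

    # 2. Noise: additive Gaussian noise, clipped back to the valid 8-bit range.
    arr = np.asarray(img, dtype=np.float32) + rng.normal(0.0, noise_sigma, (img.height, img.width, 3))
    img = Image.fromarray(np.clip(arr, 0, 255).astype(np.uint8))

    # 3. Recompression: encode and decode in memory, first as JPEG, then as WebP.
    for fmt, quality in (("JPEG", jpeg_quality), ("WEBP", webp_quality)):
        buf = io.BytesIO()
        img.save(buf, format=fmt, quality=quality)
        buf.seek(0)
        img = Image.open(buf).convert("RGB")

    return img


hr = Image.new("RGB", (64, 64), (120, 80, 200))  # dummy flat-color "high quality" image
lq = degrade(hr)
print(lq.size, lq.mode)
```

In practice the strengths would be randomized per image so the model sees a range of degradations instead of one fixed setting.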

Some examples of my RealPLKSR model for visualization:
(six example images)