Zj-BinXia/DiffIR

Question about space of IPR

Closed this issue · 2 comments

Thank you for open-sourcing your code; it is a valuable contribution to the community!

I have a question about the IPRs. The DIRformer you propose in your paper can be viewed as an auto-encoder that takes the IPRs and LR images as latent embeddings. During inference, the IPRs are actually generated by your DM conditioned on the LR images.

However, it is generally understood that the latent spaces of auto-encoders are discrete, whereas the IPRs (as one part of the latent embeddings) are generated by a DM and are therefore continuous. A common way to address a discrete latent space is to impose a KL-divergence constraint on the latent embeddings so that the latent space becomes continuous, as VAEs do. However, it seems that you did not constrain the space of the IPRs. So how can the continuous IPRs generated by the DM be used as latent embeddings for the DIRformer, whose latent space is discrete? How does this work?
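
For concreteness, here is a minimal sketch of the kind of KL constraint I mean (the names `mu`/`logvar` and the dimensions are placeholders, not anything from your code):

```python
import torch

# Hypothetical encoder outputs: mean and log-variance of the latent
# distribution q(z|x), as in a standard VAE.
mu = torch.randn(8, 256)      # batch of latent means
logvar = torch.randn(8, 256)  # batch of latent log-variances

# Reparameterization: sample a continuous latent z ~ N(mu, sigma^2).
z = mu + torch.exp(0.5 * logvar) * torch.randn_like(mu)

# KL(q(z|x) || N(0, I)) per sample, averaged over the batch; adding this
# term to the reconstruction loss keeps the latent space smooth/continuous.
kl = 0.5 * torch.sum(mu.pow(2) + logvar.exp() - 1.0 - logvar, dim=1).mean()
```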

Looking forward to your reply! Thank you again for your excellent work.

Thank you for your interest. As mentioned in the paper, I found that the previous LDM trained its VAE and U-Net separately, using a KL loss as a constraint. However, this was not conducive to accurate restoration. Therefore, I chose to jointly train the DIRformer and the DM, which eliminates the need for a KL loss in a more natural manner.
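
Roughly, the joint training can be sketched as below. The module names, architectures, and signatures are toy placeholders rather than this repo's actual API, and the DM's iterative denoising is collapsed into a single call for brevity; the point is that one end-to-end loss shapes DIRformer's latent space around the IPRs the DM can actually produce, so no KL term is needed:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Toy stand-ins for the paper's components (placeholders only).
class ToyCPEN(nn.Module):       # encodes the (HQ, LQ) pair into a target IPR
    def __init__(self, dim=64):
        super().__init__()
        self.net = nn.Conv2d(6, dim, 3, padding=1)
    def forward(self, hq, lq):
        return self.net(torch.cat([hq, lq], dim=1)).mean(dim=(2, 3))

class ToyDM(nn.Module):         # predicts the IPR from the LQ image alone
    def __init__(self, dim=64): # (iterative sampling collapsed into one call)
        super().__init__()
        self.net = nn.Conv2d(3, dim, 3, padding=1)
    def forward(self, lq):
        return self.net(lq).mean(dim=(2, 3))

class ToyDIRformer(nn.Module):  # restores the LQ image, conditioned on an IPR
    def __init__(self, dim=64):
        super().__init__()
        self.proj = nn.Linear(dim, 3)
        self.body = nn.Conv2d(3, 3, 3, padding=1)
    def forward(self, lq, ipr):
        scale = self.proj(ipr)[:, :, None, None]  # IPR modulates the features
        return self.body(lq) * scale + lq

cpen, dm, dirformer = ToyCPEN(), ToyDM(), ToyDIRformer()
params = list(cpen.parameters()) + list(dm.parameters()) + list(dirformer.parameters())
opt = torch.optim.Adam(params, lr=1e-4)

lq, hq = torch.rand(2, 3, 32, 32), torch.rand(2, 3, 32, 32)
ipr_gt = cpen(hq, lq)               # target IPR from the paired data
ipr_pred = dm(lq)                   # IPR predicted by the diffusion model
restored = dirformer(lq, ipr_pred)  # restoration conditioned on that IPR

# One joint loss trains everything end to end: the reconstruction term makes
# DIRformer accept whatever the DM produces, and the IPR term keeps the DM's
# output close to the target IPR -- no KL constraint required.
loss = F.l1_loss(restored, hq) + F.l1_loss(ipr_pred, ipr_gt)
opt.zero_grad()
loss.backward()
opt.step()
```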

Got it. Thank you for your reply!