Rayhane-mamah/Efficient-VDVAE

Any plan to compute FID&IS scores?

Opened this issue · 5 comments

Hi!
It is a great pleasure to come across such incredible work!
I appreciate your dedication to this community.

I have checked the paper and the GitHub repository, but neither seems to report sample generation performance, such as Fréchet Inception Distance (FID) or Inception Score (IS).
Is there any plan to compute FID or IS in the near future?

Thank you,
Dongjun Kim.

Hi @Kim-Dongjun!

Thank you for your kind words!
We also hold great respect for your work on UDM.

The main goal of our work is to make very deep VAEs, which are log-likelihood models in essence, more efficient and accessible in general. We did consider adding sample quality metrics (such as FID or IS) as a means to compare VDVAE performance with other generative models, but we had one concern: FID and IS are biased metrics by design and are sensitive to numerous subtle choices in low-level image processing (reference).
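To make the bias concern concrete, here is a minimal sketch that is not taken from the paper or the repository. It uses the 1-D special case of the Fréchet distance between two fitted Gaussians, $(\mu_x - \mu_y)^2 + (\sigma_x - \sigma_y)^2$, in place of the real FID, which applies the multivariate version of this distance to Inception features. The function names `frechet_distance_1d` and `mean_distance_same_source` are hypothetical, chosen only for this illustration.

```python
import random
import statistics


def frechet_distance_1d(xs, ys):
    """Squared Frechet distance between 1-D Gaussians fitted to two samples.

    This is a simplified stand-in for FID, which uses the multivariate
    form on Inception features rather than raw scalars.
    """
    mu_x, mu_y = statistics.fmean(xs), statistics.fmean(ys)
    sd_x, sd_y = statistics.pstdev(xs), statistics.pstdev(ys)
    return (mu_x - mu_y) ** 2 + (sd_x - sd_y) ** 2


def mean_distance_same_source(n, trials=200, seed=0):
    """Average estimated distance between two samples of size n
    drawn from the SAME standard normal distribution.

    The true distance is 0, so any positive average is estimator bias.
    """
    rng = random.Random(seed)
    total = 0.0
    for _ in range(trials):
        xs = [rng.gauss(0.0, 1.0) for _ in range(n)]
        ys = [rng.gauss(0.0, 1.0) for _ in range(n)]
        total += frechet_distance_1d(xs, ys)
    return total / trials


if __name__ == "__main__":
    # The average shrinks as n grows but stays above zero.
    for n in (10, 100, 1000):
        print(n, mean_distance_same_source(n))
```

Because the estimate depends on the sample size even when both sets come from the same distribution, scores computed with different numbers of samples are not directly comparable in this toy setting.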

Since claiming better generative (or log-likelihood) performance is not the primary goal of this work, we decided not to report FID/IS in the paper and instead to focus on them in possible future work that tries to improve VAEs' generative quality.

Generally speaking, however, one should expect Efficient-VDVAE's FID to be similar to VDVAE's FID as reported in this work. As you well know, it is common for VAEs to have worse sample quality than GANs or diffusion models (on average), and some recent work is trying to describe these problems and solve them (dpVAE, NCP-VAE).

With that said, if there is a strong desire to measure the sample generation performance of our models, we can add the utility scripts to do so and add the FID/IS metrics to the README. However, our priority remains to provide pre-trained model checkpoints first.

Thank you,
Rayhane Mama.

Dear @Rayhane-mamah,

Thank you for your fast and kind reply :)
I completely understand your priorities!
I hope your work becomes a cornerstone that lets VAEs surpass GANs' generation performance sooner or later.
It was an honor to come across your work.

Thank you,
Dongjun Kim.

Dear @Kim-Dongjun,

Thank you for your very nice words.
We're very happy to have your feedback on FID scores. We're adding this to our potential TO-DOs, as we deem it important for future work. The issue of the generative power of VAEs is likely going to keep this community busy for a little while.
Looking forward to your next exciting research!

Thank you,
Louay Hazami

I think adding FID or any other quality metric is necessary, even though such metrics are somewhat biased. It is a good way to know where VAE-based methods stand among the generative model families.

Hello @MultiPath,

Thank you for your comment. I do agree that adding the FID would be helpful. Unfortunately, however, we have moved on from this project, so we simply don't have the bandwidth to do this at the moment.

P.S.: A recent work that I found quite interesting has delved into this, and its results seem to improve the FID scores of SotA VAEs: https://arxiv.org/abs/2210.10205