ChenWu98/cycle-diffusion

Encoding two images to obtain third image


Hi, thanks for the wonderful work.
Can we interpolate two image latents and feed the modified latent to the generator to obtain a third image, like in the DDIM model?

Thanks for your question! Some time ago, I tried interpolating two Gaussian latents (randomly sampled, not inferred from real images) with the Slerp interpolation method. The blending effect is similar to that of DDIM, but more experiments are needed to understand this effect better. However, I did not explore it further since it is not our main focus.
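
For reference, here is a minimal sketch of the Slerp interpolation mentioned above. This is plain PyTorch, not code from this repo; `sample_with_latent` is a hypothetical stand-in for whatever sampler maps a fixed latent code to an image, and the latent shape is only illustrative.

```python
import torch

def slerp(z1, z2, t):
    """Spherical linear interpolation between two latent tensors.

    Treats the flattened latents as points on a hypersphere and interpolates
    along the great circle between them, which keeps the blended latent
    close to the Gaussian prior (unlike plain linear interpolation).
    """
    z1_flat, z2_flat = z1.flatten(), z2.flatten()
    # Angle between the two latents.
    cos_theta = torch.dot(z1_flat, z2_flat) / (z1_flat.norm() * z2_flat.norm())
    theta = torch.acos(cos_theta.clamp(-1 + 1e-7, 1 - 1e-7))
    sin_theta = torch.sin(theta)
    w1 = torch.sin((1 - t) * theta) / sin_theta
    w2 = torch.sin(t * theta) / sin_theta
    return w1 * z1 + w2 * z2

# Example: blend two randomly sampled Gaussian latents at several ratios.
z_a = torch.randn(1, 3, 256, 256)
z_b = torch.randn(1, 3, 256, 256)
for t in torch.linspace(0.0, 1.0, 5):
    z_t = slerp(z_a, z_b, t.item())
    # image_t = sample_with_latent(model, z_t)  # hypothetical sampler call
```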

For interpolation, especially with face images, which model is preferred: ddim_ddpm, latent diffusion, or stable diffusion?

Latent diffusion models trained on CelebAHQ and FFHQ seem to work well; DDIM_DDPM outputs sometimes contain visible noise (for both DPM-Encoder interpolation and DDIM interpolation). I have not tried Stable Diffusion. I guess the performance of an SD model fine-tuned on human faces should be similar to that of the LDMs trained on CelebAHQ and FFHQ.

Thanks for the answer.
I will try the latent diffusion models as suggested. I am closing the issue.