latent-diffusion

A text-to-image implementation of "High-Resolution Image Synthesis with Latent Diffusion Models" by Rombach et al., with a focus on training on a TPU-v3-8 VM. Includes full training code for the VAE and the diffusion model (DM), with minimal changes from the original paper.
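For orientation, the diffusion model in the paper is trained with a noise-prediction objective on VAE latents rather than on pixels. Below is a minimal NumPy sketch of that objective, not this repository's training code. The function names, the linear beta schedule values, the latent shape (32x32x4), and the text-embedding shape are all illustrative assumptions, and `denoiser` stands in for whatever conditional network is actually used.

```python
import numpy as np

def make_alpha_bars(num_steps=1000, beta_start=1e-4, beta_end=0.02):
    # Cumulative product of (1 - beta_t) for a linear beta schedule.
    # The schedule endpoints here are assumed values, not taken from this repo.
    betas = np.linspace(beta_start, beta_end, num_steps)
    return np.cumprod(1.0 - betas)

def noise_latents(z0, t, eps, alpha_bars):
    # Forward process: z_t = sqrt(alpha_bar_t) * z_0 + sqrt(1 - alpha_bar_t) * eps
    a = alpha_bars[t].reshape(-1, 1, 1, 1)
    return np.sqrt(a) * z0 + np.sqrt(1.0 - a) * eps

def diffusion_loss(denoiser, z0, text_emb, alpha_bars, rng):
    # Sample one timestep and one noise tensor per example, noise the latents,
    # and score the denoiser's noise prediction with mean squared error.
    batch = z0.shape[0]
    t = rng.integers(0, len(alpha_bars), size=batch)
    eps = rng.standard_normal(z0.shape)
    zt = noise_latents(z0, t, eps, alpha_bars)
    eps_pred = denoiser(zt, t, text_emb)
    return float(np.mean((eps_pred - eps) ** 2))

# Usage with a placeholder denoiser that always predicts zero noise.
rng = np.random.default_rng(0)
alpha_bars = make_alpha_bars()
z0 = rng.standard_normal((4, 32, 32, 4))        # hypothetical VAE latents
text_emb = rng.standard_normal((4, 77, 768))    # hypothetical text embeddings
zero_denoiser = lambda zt, t, cond: np.zeros_like(zt)
loss = diffusion_loss(zero_denoiser, z0, text_emb, alpha_bars, rng)
```

In a real training run the latents would come from the trained VAE encoder and the loss would be differentiated with respect to the denoiser's parameters; the sketch only shows how the noised input and the regression target are constructed.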

Non-cherry-picked generated outputs after 90k steps at batch size 1024, lr=1e-4, trained on a subset of roughly 25 million images from laion2B-en:

Loss curve:
