lucidrains/transfusion-pytorch

Example of using a VAE for images

Closed · 3 comments

Has anyone succeeded in using a VAE as the image encoder/decoder? How should it be set up?

from diffusers.models import AutoencoderKL
from transfusion_pytorch import Transfusion

vae = AutoencoderKL.from_pretrained(
    "CompVis/stable-diffusion-v1-4",
    subfolder = "vae",
)

model = Transfusion(
    num_text_tokens = 1,
    dim_latent = 4,                     # the SD v1.4 VAE has 4 latent channels, so 384 will not match
    channel_first_latent = True,        # VAE latents come as (batch, channels, height, width)
    modality_default_shape = (4, 4),
    modality_encoder = vae.encoder,     # note: the bare encoder outputs 8 channels (mean and logvar), see the sketch below
    modality_decoder = vae.decoder,
    transformer = dict(
        dim = 512,
        depth = 8
    )
)
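
For reference, a minimal sketch of one way the shapes could line up, assuming the diffusers AutoencoderKL API and the Transfusion constructor above. The bare vae.encoder returns the distribution moments (2 x 4 = 8 channels) and skips the quant convolutions, so thin wrappers around vae.encode / vae.decode that expose the 4-channel latent are one option; the wrapper class names here are made up for illustration.

from torch import nn
from diffusers.models import AutoencoderKL
from transfusion_pytorch import Transfusion

vae = AutoencoderKL.from_pretrained(
    "CompVis/stable-diffusion-v1-4",
    subfolder = "vae",
)

class VaeEncoder(nn.Module):
    # hypothetical wrapper: vae.encode returns a posterior over a
    # (batch, 4, h / 8, w / 8) latent; sampling it yields the 4 channels
    # that dim_latent must match
    def __init__(self, vae):
        super().__init__()
        self.vae = vae

    def forward(self, image):
        return self.vae.encode(image).latent_dist.sample()

class VaeDecoder(nn.Module):
    # hypothetical wrapper: maps the 4-channel latent back to pixel space
    def __init__(self, vae):
        super().__init__()
        self.vae = vae

    def forward(self, latent):
        return self.vae.decode(latent).sample

model = Transfusion(
    num_text_tokens = 1,
    dim_latent = 4,                     # matches the VAE's latent channels
    channel_first_latent = True,
    modality_default_shape = (4, 4),    # a 4 x 4 latent decodes to 32 x 32 pixels at the VAE's 8x downsampling
    modality_encoder = VaeEncoder(vae),
    modality_decoder = VaeDecoder(vae),
    transformer = dict(
        dim = 512,
        depth = 8
    )
)

With this wiring, dim_latent, the encoder output, and the decoder input all agree on 4 channels; how Transfusion invokes the encoder may vary across versions of the repo, so treat the wrappers as a starting point rather than a drop-in.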

@dingkwang i'll work on a simplified autoencoder training wrapper over at vq-pytorch, then allow for easy integration here

let's keep this open until i finish that (and as a reminder)

@dingkwang hey Dingkang

do you want to read this and see if it is self-explanatory? i'll embark on that autoencoder wrapper this week as well

@dingkwang think it is working given #22