Example of using VAE for image.
Closed this issue · 3 comments
dingkwang commented
Has anyone succeeded in using a VAE as the image encoder/decoder? How should it be set up? This is what I have so far:
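from diffusers import AutoencoderKL
from transfusion_pytorch import Transfusion

# load the pretrained Stable Diffusion VAE
vae = AutoencoderKL.from_pretrained(
    "CompVis/stable-diffusion-v1-4",
    subfolder = "vae",
)

# hand the raw VAE encoder / decoder modules to Transfusion
model = Transfusion(
    num_text_tokens = 1,
    dim_latent = 384,
    channel_first_latent = True,
    modality_default_shape = (4, 4),
    modality_encoder = vae.encoder,
    modality_decoder = vae.decoder,
    transformer = dict(
        dim = 512,
        depth = 8
    )
)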
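One thing that may help when wiring this up is checking the latent shape the VAE actually produces. The sketch below is only shape bookkeeping, under these assumptions: the Stable Diffusion v1 VAE has 4 latent channels and downsamples each spatial side by 8, and `dim_latent` is meant to match the latent channel count when `channel_first_latent = True`. `LatentSpec` and `latent_spec` are hypothetical helper names, not part of diffusers or transfusion-pytorch.

```python
from typing import NamedTuple, Tuple


class LatentSpec(NamedTuple):
    """Values one would expect to pass to Transfusion (hypothetical helper)."""
    dim_latent: int
    modality_default_shape: Tuple[int, int]


def latent_spec(
    image_hw: Tuple[int, int],
    latent_channels: int = 4,    # assumption: SD v1 VAE latent channels
    downsample_factor: int = 8,  # assumption: SD v1 VAE spatial reduction
) -> LatentSpec:
    """Derive the latent channel count and spatial shape for an image size."""
    height, width = image_hw
    if height % downsample_factor or width % downsample_factor:
        raise ValueError(
            f"image size {image_hw} is not divisible by {downsample_factor}"
        )
    return LatentSpec(
        dim_latent=latent_channels,
        modality_default_shape=(
            height // downsample_factor,
            width // downsample_factor,
        ),
    )


if __name__ == "__main__":
    # a 256 x 256 image maps to a 4-channel, 32 x 32 latent
    print(latent_spec((256, 256)))
```

Under these assumptions, `modality_default_shape = (4, 4)` would correspond to 32 x 32 images, and `dim_latent = 384` does not line up with a 4-channel latent. Also, as far as I can tell, `vae.encoder` on its own returns the mean and log-variance stacked along the channel axis (8 channels here), so a thin wrapper around `vae.encode(...)` and `vae.decode(...)` may be closer to what is wanted than the raw submodules.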
lucidrains commented
@dingkwang i'll work on a simplified autoencoder training wrapper over at vq-pytorch, then allow for easy integration here
let's keep this open until i finish that (and as a reminder)
lucidrains commented
@dingkwang hey Dingkang
do you want to read this and see if it is self-explanatory? i'll embark on that autoencoder wrapper this week as well
lucidrains commented
@dingkwang think it is working given #22