Feature Request: Legacy & New Encoder
soten355 opened this issue · 1 comment
soten355 commented
Stable Diffusion 2.0 uses a new text encoder, so the existing PyTorch weight mapping won't work for that model or for any future models built on it. This is beyond my expertise, but could clip_encoder.py be extended so it can build the text encoder used by SD 2.0?
Choosing between the legacy and the new encoder only needs a bool argument passed in, but I have no idea how to build the new text encoder itself.
soten355 commented
I believe I got SD 2.x (512) working. I had to rework the UNet model parameters and completely convert the CLIP encoder to OpenCLIP. In my repo, the user can choose between legacy versions of SD (pre-2.0) and the current versions (2.x), because the two model families are entirely different.
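For anyone wondering what such a legacy/new switch might look like, here is a rough sketch, not the code from my repo. The names `TextEncoderConfig`, `select_text_encoder_config` and `unet_context_dim` are made up for illustration. The hyperparameters are the commonly published ones for the CLIP ViT-L/14 text encoder (SD 1.x) and the OpenCLIP ViT-H/14 text encoder (SD 2.x); check them against the checkpoint you are actually loading.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TextEncoderConfig:
    """Hypothetical container for text encoder hyperparameters."""
    name: str
    width: int            # embedding size, also the UNet cross-attention context size
    layers: int           # number of transformer blocks
    heads: int            # attention heads per block
    activation: str       # MLP activation used inside each block
    context_length: int = 77
    vocab_size: int = 49408


# SD 1.x: OpenAI CLIP ViT-L/14 text encoder.
LEGACY_CLIP = TextEncoderConfig("CLIP ViT-L/14", 768, 12, 12, "quick_gelu")

# SD 2.x: OpenCLIP ViT-H/14 text encoder. The reference SD 2.x config
# conditions on the penultimate layer's output rather than the last one.
OPEN_CLIP = TextEncoderConfig("OpenCLIP ViT-H/14", 1024, 24, 16, "gelu")


def select_text_encoder_config(legacy: bool) -> TextEncoderConfig:
    """Return the encoder hyperparameters for the chosen model family."""
    return LEGACY_CLIP if legacy else OPEN_CLIP


def unet_context_dim(legacy: bool) -> int:
    """The UNet cross-attention context size must match the encoder width."""
    return select_text_encoder_config(legacy).width


if __name__ == "__main__":
    for legacy in (True, False):
        cfg = select_text_encoder_config(legacy)
        print(cfg.name, cfg.width, cfg.layers, cfg.heads, cfg.activation)
```

The bool only picks which set of numbers to build the model from. The weight-name mapping still has to be written separately for each encoder, since the checkpoints use different key layouts.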