Error while Training
H41907 opened this issue · 5 comments
Hi all
Update: I'm trying to train SD 1.5 (the inpainting version).
I'm getting the following error message:
RuntimeError: Error(s) in loading state_dict for LatentDiffusion:
size mismatch for model.diffusion_model.input_blocks.0.0.weight: copying a param with shape torch.Size([320, 9, 3, 3]) from checkpoint, the shape in current model is torch.Size([320, 4, 3, 3]).
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
  File "/workspace/Dreambooth-Stable-Diffusion/main.py", line 896, in <module>
    if trainer.global_rank == 0:
NameError: name 'trainer' is not defined. Did you mean: 'Trainer'?
I get the same error on SD 1.4.
I went to line 896 of main.py and changed "trainer" to "Trainer"; now it's working.
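A note on the NameError: the traceback says it was raised "during handling of the above exception", which suggests the cleanup code at line 896 refers to `trainer` before the variable was ever assigned, because model loading failed first. Renaming the variable would presumably not address the underlying size mismatch. Below is a minimal sketch of a guard that keeps the original error visible. The names `run` and `load_model` and the dict standing in for a trainer are hypothetical and not taken from main.py.

```python
def run(load_model):
    """Load a model and 'train' it, keeping the original error visible on failure."""
    trainer = None  # defined up front so the handler can always refer to it
    try:
        model = load_model()  # may raise, e.g. on a state_dict size mismatch
        # Hypothetical stand-in for constructing a real Trainer object.
        trainer = {"global_rank": 0, "model": model}
        return "trained"
    except Exception:
        # Only touch the trainer if it was actually created.
        if trainer is not None and trainer["global_rank"] == 0:
            pass  # e.g. save an emergency checkpoint here
        raise  # re-raise the original error instead of masking it
```

With this pattern, a failed load surfaces as the original RuntimeError and not as a NameError about `trainer`.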
@H41907 were you able to get 1.5 to work? I'm using the model from runwayml/stable-diffusion-v1-5 and it keeps erroring out. Could you point out how you got it to work?
No, not resolved. I also stopped looking into it: at the time I thought that the inpainting model is different from the standard SD model (1.4/1.5) and therefore not usable for Dreambooth.
size mismatch for model.diffusion_model.input_blocks.0.0.weight: copying a param with shape torch.Size([320, 9, 3, 3]) from checkpoint, the shape in current model is torch.Size([320, 4, 3, 3]).
I did not continue looking into it, maybe nowadays there is a solution or tool which solves the issue.
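For context on the shapes in that message: a Conv2d weight is laid out as (out_channels, in_channels, kernel_h, kernel_w), so the checkpoint's first UNet layer expects 9 input channels while the model built from the training config expects 4. The inpainting checkpoint takes extra channels for the encoded masked image and the mask. The sketch below checks this before training starts. The key name comes from the error message; the helper names and the fake checkpoint are hypothetical, and a real checkpoint would need to be loaded with PyTorch first.

```python
from types import SimpleNamespace

# Key taken from the error message above.
FIRST_CONV = "model.diffusion_model.input_blocks.0.0.weight"


def unet_in_channels(state_dict):
    """Return the input-channel count of the UNet's first conv layer."""
    # Conv2d weights have shape (out_channels, in_channels, kH, kW).
    return state_dict[FIRST_CONV].shape[1]


def describe(state_dict, expected=4):
    """Compare the checkpoint's input channels with what the config expects."""
    found = unet_in_channels(state_dict)
    if found == expected:
        return f"ok: {found} input channels"
    return f"mismatch: checkpoint has {found} input channels, config expects {expected}"


# Stand-in for a real checkpoint (assumption: a real one would be a dict of
# tensors, typically found under a "state_dict" key after torch.load).
fake_ckpt = {FIRST_CONV: SimpleNamespace(shape=(320, 9, 3, 3))}
print(describe(fake_ckpt))
```

If the check reports 9 channels, the options are presumably either to use a standard (non-inpainting) checkpoint or a model config whose UNet is set up for 9 input channels.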
@H41907 same here. I moved to the Diffusers way of doing things and got it to work well. The Hugging Face "accelerate" library actually makes training much faster with 1.5.
Here's my simple notebook on this: https://github.com/sandys/SimpleDiffuserDreambooth