neonbjb/tts-scores

RuntimeError in loading state_dict when calling CLVPMetric(device='cuda')


Error details:

RuntimeError: Error(s) in loading state_dict for CLVP:
   Missing key(s) in state_dict: "text_pos_emb.weight", "text_transformer.layers.layers.0.0.scale" ....
   .....  mismatch for to_speech_latent.weight: copying a param with shape torch.Size([1024, 1024]) from checkpoint, the shape in current model is torch.Size([512, 512]).

Ah apologies - this repo uses the original CLVP: https://huggingface.co/jbetker/tortoise-tts-v2/blob/main/.models/clvp.pth

It would be better if the repo were updated to use the larger CLVP model.

Thanks for getting back to me. I am still getting an error with the older CLVP checkpoint.

Yeah, you are right, the current error isn't exactly the same as the previous one. I meant that it is still a size mismatch:
RuntimeError: Error(s) in loading state_dict for CLVP:
   Missing key(s) in state_dict: "text_pos_emb.weight", "text_transformer.layers.layers.0.0.scale", "text_transformer.layers.layers.0.0.fn.norm.weight",  .......
   ...... size mismatch for text_emb.weight: copying a param with shape torch.Size([256, 512]) from checkpoint, the shape in current model is torch.Size([148, 512]).
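
For anyone trying to narrow this down, here is a minimal sketch of how the checkpoint and the model could be compared before calling load_state_dict. It is not part of tts-scores: the helper name `diff_shapes` is made up for illustration, and it works on plain name-to-shape dictionaries so it runs without PyTorch. The `text_emb.weight` shapes are the ones quoted in the error above; the shape given for `text_pos_emb.weight` is a placeholder, since the traceback does not show it.

```python
def diff_shapes(checkpoint_shapes, model_shapes):
    """Compare two {param_name: shape} mappings.

    Returns (missing, unexpected, mismatched), mirroring the three kinds of
    problems that a strict load_state_dict reports.
    """
    # Keys the model expects but the checkpoint does not provide
    missing = sorted(k for k in model_shapes if k not in checkpoint_shapes)
    # Keys the checkpoint provides but the model does not define
    unexpected = sorted(k for k in checkpoint_shapes if k not in model_shapes)
    # Keys present in both, but with different shapes
    mismatched = {
        k: (tuple(checkpoint_shapes[k]), tuple(model_shapes[k]))
        for k in sorted(model_shapes)
        if k in checkpoint_shapes
        and tuple(checkpoint_shapes[k]) != tuple(model_shapes[k])
    }
    return missing, unexpected, mismatched


# With PyTorch, the mappings could be built roughly like this (assumes the
# .pth file holds a plain state dict and `model` is the instantiated CLVP):
#   ckpt = torch.load("clvp.pth", map_location="cpu")
#   checkpoint_shapes = {k: tuple(v.shape) for k, v in ckpt.items()}
#   model_shapes = {k: tuple(v.shape) for k, v in model.state_dict().items()}

# Toy input based on the error message above.
checkpoint_shapes = {"text_emb.weight": (256, 512)}
model_shapes = {
    "text_emb.weight": (148, 512),
    "text_pos_emb.weight": (1, 512),  # placeholder shape, not from the log
}

missing, unexpected, mismatched = diff_shapes(checkpoint_shapes, model_shapes)
print("missing:", missing)
print("unexpected:", unexpected)
print("mismatched:", mismatched)
```

If the mismatched entries differ only in vocabulary or hidden size (as with 256 vs 148 here), that suggests the model is being constructed with different hyperparameters than the checkpoint was trained with, rather than the file being corrupt.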

Has this issue been solved? I get the same error as the previous commenter.