RuntimeError in loading state_dict when calling CLVPMetric(device='cuda')
Opened this issue · 5 comments
a1rishav commented
Error details:
RuntimeError: Error(s) in loading state_dict for CLVP:
Missing key(s) in state_dict: "text_pos_emb.weight", "text_transformer.layers.layers.0.0.scale" ....
..... mismatch for to_speech_latent.weight: copying a param with shape torch.Size([1024, 1024]) from checkpoint, the shape in current model is torch.Size([512, 512]).
neonbjb commented
Ah apologies - this repo uses the original CLVP: https://huggingface.co/jbetker/tortoise-tts-v2/blob/main/.models/clvp.pth
It would be better if it were updated to use the bigger one.
a1rishav commented
Thanks for getting back; I am still getting the same error with the older clvp.
neonbjb commented
That doesn't seem right; the error should at least be different (CLVP1 has d_model=512, CLVP2 has d_model=1024).
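Since the two releases differ in width, one quick way to check which checkpoint you actually downloaded is to inspect the tensor shapes in its state_dict. A minimal sketch, assuming the key name `to_speech_latent.weight` from the traceback above; the dummy tensors only stand in for real checkpoints:

```python
import torch

def clvp_version(state_dict):
    """Infer which CLVP release a checkpoint is, from its projection width."""
    width = state_dict["to_speech_latent.weight"].shape[0]
    # Per the comment above: CLVP1 uses d_model=512, CLVP2 uses d_model=1024.
    return {512: "CLVP1", 1024: "CLVP2"}.get(width, f"unknown (d_model={width})")

# Dummy state_dicts shaped like the two checkpoints
# (a real one would come from torch.load(path, map_location="cpu")):
old = {"to_speech_latent.weight": torch.zeros(512, 512)}
new = {"to_speech_latent.weight": torch.zeros(1024, 1024)}
print(clvp_version(old))  # CLVP1
print(clvp_version(new))  # CLVP2
```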
a1rishav commented
Yeah, you are right, the current error isn't exactly the same as the previous one; I meant that it is still about a size mismatch:
RuntimeError: Error(s) in loading state_dict for CLVP:
Missing key(s) in state_dict: "text_pos_emb.weight", "text_transformer.layers.layers.0.0.scale", "text_transformer.layers.layers.0.0.fn.norm.weight", .......
......size mismatch for text_emb.weight: copying a param with shape torch.Size([256, 512]) from checkpoint, the shape in current model is torch.Size([148, 512]).
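For diagnosing failures like this one, it can help to diff the checkpoint against the freshly constructed model before calling `load_state_dict`. A hypothetical helper, sketched here with dummy state_dicts mimicking the traceback above (only key names and shapes are compared):

```python
import torch

def diff_state_dicts(model_sd, ckpt_sd):
    """Report keys missing from the checkpoint, keys unexpected by the model,
    and keys present in both but with different tensor shapes."""
    missing = sorted(set(model_sd) - set(ckpt_sd))
    unexpected = sorted(set(ckpt_sd) - set(model_sd))
    mismatched = sorted(
        k for k in set(model_sd) & set(ckpt_sd)
        if model_sd[k].shape != ckpt_sd[k].shape
    )
    return missing, unexpected, mismatched

# Dummy shapes reproducing the traceback: the model expects a 148-token
# text embedding, the checkpoint was trained with 256 tokens.
model_sd = {"text_emb.weight": torch.zeros(148, 512),
            "text_pos_emb.weight": torch.zeros(120, 512)}
ckpt_sd = {"text_emb.weight": torch.zeros(256, 512)}
missing, unexpected, mismatched = diff_state_dicts(model_sd, ckpt_sd)
print(missing)     # ['text_pos_emb.weight']
print(mismatched)  # ['text_emb.weight']
```

With real objects, `model_sd` would be `model.state_dict()` and `ckpt_sd` would be `torch.load(path, map_location="cpu")`; a mismatch in both key sets and shapes usually means the model config and the checkpoint come from different releases.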
WhiteTeaDragon commented
Has this issue been solved? I get the same error as the previous commenter.