Running the model provided here raises this RuntimeError: Error(s) in loading state_dict for Tacotron: size mismatch for encoder.embedding.weight: copying a param with shape torch.Size([70, 512]) from checkpoint, the shape in current model is torch.Size([75, 512]).
wangkewk opened this issue · 79 comments
Can anyone solve this?
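This is an incompatibility caused by a recent fix of mine. In the file synthesizer/utils/symbols.py
change line 11 to:
_characters = 'ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz12340!\'(),-.:;? '
That is all. Please don't close this issue for now. If too many people run into it, I'll add a compatibility fix.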
Don't worry.
Same problem!
Thanks, this works. After the change, what used to be pure noise became normal speech.
Thanks, the problem is solved.
Thanks, solved.
Same problem.
Thanks! The problem is solved.
Same here!
Works perfectly after the change, thanks~
+1
Works after the change, thanks.
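The problem is indeed solved, but the voice quality is not as good as in the Bilibili demo. I deliberately used a recording of a novel being read aloud, and I don't know what is wrong.
If I want the output to sound very much like a particular person, how can I improve it?
Same problem.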
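If I use a model I trained myself, do I need to change this line back for it to work, or is it fine to leave it? My newly trained model produces no sound (that could also be a problem with my training), but the provided model works normally.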
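Changing it back gives slightly better results, but it also works without changing it back.
Finally got it working. I struggled with this for a long time and thought it was a problem with my local environment.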
+1
The output sounds like a robot. Is that because different computer environments give different results? Do I have to retrain the model myself?
No. It is probably caused by a different vocoder or different input audio.
+1
Sigh, still not as good as in the video. It sounds like the broken Chinese of a foreigner who has just arrived in **.
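I am also using the model from the Bilibili uploader, but I don't get the quality shown on Bilibili. Mine sounds like 伏拉夫's accent and not like Chinese at all.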
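If the recording is clear, timbre cloning works reasonably well for speech with flat intonation. Could something have failed to run properly?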
I have modified synthesizer/utils/symbols.py, but the error still occurs:
Synthesizer using device: cuda
Trainable Parameters: 32.735M
Traceback (most recent call last):
File "D:\AI\sv2tts_china\MockingBird\toolbox\__init__.py", line 123, in <lambda>
func = lambda: self.synthesize() or self.vocode()
File "D:\AI\sv2tts_china\MockingBird\toolbox\__init__.py", line 238, in synthesize
specs = self.synthesizer.synthesize_spectrograms(texts, embeds, style_idx=int(self.ui.style_slider.value()), min_stop_token=min_token)
File "D:\AI\sv2tts_china\MockingBird\synthesizer\inference.py", line 87, in synthesize_spectrograms
self.load()
File "D:\AI\sv2tts_china\MockingBird\synthesizer\inference.py", line 65, in load
self._model.load(self.model_fpath)
File "D:\AI\sv2tts_china\MockingBird\synthesizer\models\tacotron.py", line 525, in load
self.load_state_dict(checkpoint["model_state"], strict=False)
File "D:\ProgramData\Anaconda3\envs\Real-Time-Voice-Cloning\lib\site-packages\torch\nn\modules\module.py", line 1483, in load_state_dict
self.__class__.__name__, "\n\t".join(error_msgs)))
RuntimeError: Error(s) in loading state_dict for Tacotron:
size mismatch for encoder_proj.weight: copying a param with shape torch.Size([128, 512]) from checkpoint, the shape in current model is torch.Size([128, 1024]).
size mismatch for decoder.attn_rnn.weight_ih: copying a param with shape torch.Size([384, 768]) from checkpoint, the shape in current model is torch.Size([384, 1280]).
size mismatch for decoder.rnn_input.weight: copying a param with shape torch.Size([1024, 640]) from checkpoint, the shape in current model is torch.Size([1024, 1152]).
size mismatch for decoder.stop_proj.weight: copying a param with shape torch.Size([1, 1536]) from checkpoint, the shape in current model is torch.Size([1, 2048]).
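A quick way to read a traceback like this one is to tabulate how each mismatched shape differs. The sketch below is plain Python over shapes copied from the error above; the helper names `shape_mismatches` and `per_dim_delta` are hypothetical. With a real checkpoint, the same dictionaries could be built with `{k: tuple(v.shape) for k, v in state_dict.items()}`.

```python
def shape_mismatches(checkpoint_shapes, model_shapes):
    """Return {name: (checkpoint_shape, model_shape)} for parameters whose shapes differ."""
    return {
        name: (tuple(ckpt_shape), tuple(model_shapes[name]))
        for name, ckpt_shape in checkpoint_shapes.items()
        if name in model_shapes and tuple(model_shapes[name]) != tuple(ckpt_shape)
    }


def per_dim_delta(ckpt_shape, model_shape):
    """Per-dimension difference: model size minus checkpoint size."""
    return tuple(m - c for c, m in zip(ckpt_shape, model_shape))


# Shapes copied from the traceback above.
checkpoint = {
    "encoder_proj.weight": (128, 512),
    "decoder.attn_rnn.weight_ih": (384, 768),
    "decoder.rnn_input.weight": (1024, 640),
    "decoder.stop_proj.weight": (1, 1536),
}
model = {
    "encoder_proj.weight": (128, 1024),
    "decoder.attn_rnn.weight_ih": (384, 1280),
    "decoder.rnn_input.weight": (1024, 1152),
    "decoder.stop_proj.weight": (1, 2048),
}

for name, (ckpt_shape, model_shape) in shape_mismatches(checkpoint, model).items():
    print(name, per_dim_delta(ckpt_shape, model_shape))
```

All four parameters differ by exactly 512 in their second dimension, and encoder.embedding.weight is not among them, so this error is a different one from the 70-versus-75 symbol mismatch and the symbols.py edit would not be expected to change it. Which setting accounts for the extra 512 inputs is not stated in this thread. Note also that `strict=False` in `load_state_dict` tolerates missing or unexpected keys but not shape mismatches, which is why the load still raises.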
Me too
Try this model:
Link: https://pan.baidu.com/s/1fMh9IlgKJlL2PIiRTYDUvw
Extraction code: om7f
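It runs now, but only the first half of each generated sentence is actually spoken and the second half is all noise. Generating several times sometimes improves it and sometimes makes it worse again, and the generated voice does not resemble the source audio, by quite a wide margin, haha.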
This model is fine. Just change _characters = 'ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz12340!\'(),-.:;? ' back to the original line.
This fixes it, but when I test with the demo audio, the output is far worse, emmm.
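It was trained for very few steps; you can continue training it to 100k+.
The ceshi model works once you switch the code to a commit from around October 20 and then apply the change from issue #37.
The author's model needs the code switched to a commit from around October 20.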
Does the output keep getting better the more times I click "synthesize only"?
It still doesn't work after the change. Please take another look.
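I think you probably copied a model into the program directory at the start. Re-extract the program archive and start over, and it should work.
Why is my source audio section black? Does anyone know?
The Dataset and Speaker fields for the source audio are all black and cannot be selected?
No dataset was detected. If you are not training, you can ignore it.
To clone my own voice, do I need to train on my own audio instead of directly using the models provided by the community? Yesterday I used the provided models (including the synthesizer and vector) to clone my own recording, and the resulting mel spectrogram was a mess, with nothing but electrical hum and noise. Please point out what I did wrong.
You can switch to this model directly by following the quickstart (https://github.com/babysor/MockingBird/wiki/Quick-Start-(Newbie)); the code needs no modification. Environment: 3.7.11
I have read the whole thread. I get raise RuntimeError('Error(s) in loading state_dict for {}:\n\t{}'.format(. Everything generated is noise. I changed the code as described and it still doesn't work. Switching models doesn't help either...
Same error, copying a param with shape torch.Size([128, 512]), and the output is all noise.
Complete newbie here. How do I switch to tag 0.01? I don't understand it at all.
I trained on the target voice for a while starting from the 75k model, but the imitation still doesn't sound alike. I want to switch to this model and try training again.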
Same error. Did you get yours fixed?
It only outputs noise. I made the changes from the comments and it is still the same.
Switch the version first, then apply #37.
How do I switch versions?
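For readers with the same question, switching versions here means checking out an older commit or tag in the local git clone. A minimal sketch, assuming the repository was cloned with git and that a tag named v0.0.1 exists in it (the tag name is taken from other comments in this thread; `git tag --list` shows what a given clone actually has). The helper names are hypothetical.

```python
import subprocess


def checkout_command(ref, repo_dir="."):
    """Build the git command that switches the working tree in repo_dir to ref."""
    return ["git", "-C", repo_dir, "checkout", ref]


def switch_version(ref, repo_dir="."):
    """Run the checkout; raises subprocess.CalledProcessError if git reports failure."""
    subprocess.run(checkout_command(ref, repo_dir), check=True)


# Print the equivalent shell command instead of running it.
print(" ".join(checkout_command("v0.0.1")))
```

If symbols.py has already been edited locally, git may refuse the checkout until that change is committed, stashed, or discarded.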
Same issue
Same error after the change. It looks like there is a new problem...
File "E:\语音克隆\MockingBird\synthesizer\models\tacotron.py", line 564, in load
self.load_state_dict(checkpoint["model_state"], strict=False)
File "E:\anaconda\envs\torch\lib\site-packages\torch\nn\modules\module.py", line 1497, in load_state_dict
raise RuntimeError('Error(s) in loading state_dict for {}:\n\t{}'.format(
RuntimeError: Error(s) in loading state_dict for Tacotron2:
size mismatch for embedding.weight: copying a param with shape torch.Size([148, 512]) from checkpoint, the shape in current model is torch.Size([70, 512]).
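How do I fix this? After my change, the 70 at the end changed.
Please help.
Mine is a slightly modified version of the NVIDIA one. Why is the first number 148, and how do I change that value?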
I changed it and it didn't help.
Same error. Did you get yours fixed?
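Same question. Switching to v0.0.1 still doesn't work (with the fix applied). PyTorch is the latest version, CUDA 11.7.
Still doesn't work after modifying it as described; same error. Do I have to train it myself?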
It has been almost two years. Will compatibility still be added?
RuntimeError: Error(s) in loading state_dict for Tacotron: size mismatch for gst.stl.attention.W_query.weight: copying a param with shape torch.Size([512, 256]) from checkpoint, the shape in current model is torch.Size([512, 512]).
Traceback:
File "D:\ProgramData\Anaconda3\envs\voiceClone\lib\site-packages\streamlit\runtime\scriptrunner\script_runner.py", line 565, in _run_script
exec(code, module.dict)
File "C:\Users\zhanglong\AppData\Local\Temp\tmp342r_iv9.py", line 13, in
render_streamlit_ui()
File "H:\MockingBird\MockingBird\control\mkgui\base\ui\streamlit_ui.py", line 909, in render_streamlit_ui
session_state.output_data = opyrator(input=input_data_obj)
File "H:\MockingBird\MockingBird\control\mkgui\base\core.py", line 203, in call
return self.function(input_obj, **kwargs)
File "H:\MockingBird\MockingBird\control\mkgui\app.py", line 140, in synthesize
specs = current_synt.synthesize_spectrograms(texts, embeds)
File "H:\MockingBird\MockingBird\models\synthesizer\inference.py", line 91, in synthesize_spectrograms
self.load()
File "H:\MockingBird\MockingBird\models\synthesizer\inference.py", line 69, in load
self._model.load(self.model_fpath, self.device)
File "H:\MockingBird\MockingBird\models\synthesizer\models\base.py", line 55, in load
self.load_state_dict(state, strict=False)
File "D:\ProgramData\Anaconda3\envs\voiceClone\lib\site-packages\torch\nn\modules\module.py", line 2041, in load_state_dict
raise RuntimeError('Error(s) in loading state_dict for {}:\n\t{}'.format(
Traceback (most recent call last):
File "D:\codeinstall\Python310\lib\site-packages\streamlit\runtime\scriptrunner\script_runner.py", line 552, in _run_script
exec(code, module.dict)
File "C:\Users\zy820\AppData\Local\Temp\tmpegto92vs.py", line 13, in
render_streamlit_ui()
File "D:\develop\workspace-project\MockingBird\control\mkgui\base\ui\streamlit_ui.py", line 909, in render_streamlit_ui
session_state.output_data = opyrator(input=input_data_obj)
File "D:\develop\workspace-project\MockingBird\control\mkgui\base\core.py", line 203, in call
return self.function(input_obj, **kwargs)
self._model.load(self.model_fpath, self.device)
File "D:\develop\workspace-project\MockingBird\models\synthesizer\models\base.py", line 55, in load
self.load_state_dict(state, strict=False)
File "D:\codeinstall\Python310\lib\site-packages\torch\nn\modules\module.py", line 2041, in load_state_dict
raise RuntimeError('Error(s) in loading state_dict for {}:\n\t{}'.format(
RuntimeError: Error(s) in loading state_dict for Tacotron:
size mismatch for encoder.embedding.weight: copying a param with shape torch.Size([75, 512]) from checkpoint, the shape in current model is torch.Size([70, 512]).
size mismatch for gst.stl.attention.W_query.weight: copying a param with shape torch.Size([512, 256]) from checkpoint, the shape in current model is torch.Size([512, 512]).
I applied the fix above and it still errors. Is there a new solution?
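In this traceback the direction is reversed compared with the original report: the checkpoint has 75 embedding rows and the running model has 70. That is what would be expected if the 70-symbol edit is in place while the checkpoint was trained with the larger table. The sketch below picks the character set whose size matches a checkpoint. It assumes two special symbols are prepended to the characters, and it assumes the larger set is the quoted one plus the digits 5 to 9; that second string is a guess that fits the count of 75 and is not quoted anywhere in this thread.

```python
SPECIAL_SYMBOLS = 2  # assumption: padding and end-of-sequence symbols

# Character set quoted earlier in this thread.
CHARS_LEGACY = 'ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz12340!\'(),-.:;? '
# Assumption, not quoted in this thread: the newer set adds the digits 5-9.
CHARS_CURRENT = 'ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz1234567890!\'(),-.:;? '

CANDIDATES = {"legacy": CHARS_LEGACY, "current": CHARS_CURRENT}


def matching_charset(checkpoint_rows, candidates=CANDIDATES):
    """Return the name of the character set whose symbol count equals checkpoint_rows, or None."""
    for name, chars in candidates.items():
        if len(chars) + SPECIAL_SYMBOLS == checkpoint_rows:
            return name
    return None


print(matching_charset(70))   # legacy
print(matching_charset(75))   # current
print(matching_charset(148))  # None
```

Under these assumptions, a checkpoint reporting 75 rows needs the unedited line 11, so applying the 70-symbol edit to it would produce exactly this error. The gst.stl.attention.W_query.weight mismatch in the same traceback is separate and is not affected by symbols.py.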
Traceback (most recent call last):
File "G:\PycharmProjects\MockingBird\control\toolbox_init_.py", line 260, in synthesize
specs = self.synthesizer.synthesize_spectrograms(texts, embeds, style_idx=int(self.ui.style_slider.value()), min_stop_token=min_token, steps=int(self.ui.length_slider.value())*200)
File "G:\PycharmProjects\MockingBird\models\synthesizer\inference.py", line 91, in synthesize_spectrograms
self.load()
File "G:\PycharmProjects\MockingBird\models\synthesizer\inference.py", line 69, in load
self._model.load(self.model_fpath, self.device)
File "G:\PycharmProjects\MockingBird\models\synthesizer\models\base.py", line 55, in load
self.load_state_dict(state, strict=False)
File "C:\Users\Admin.conda\envs\pytorch\lib\site-packages\torch\nn\modules\module.py", line 2041, in load_state_dict
raise RuntimeError('Error(s) in loading state_dict for {}:\n\t{}'.format(
RuntimeError: Error(s) in loading state_dict for Tacotron:
size mismatch for encoder.embedding.weight: copying a param with shape torch.Size([70, 512]) from checkpoint, the shape in current model is torch.Size([75, 512]).
size mismatch for encoder_proj.weight: copying a param with shape torch.Size([128, 512]) from checkpoint, the shape in current model is torch.Size([128, 1024]).
size mismatch for decoder.attn_rnn.weight_ih: copying a param with shape torch.Size([384, 768]) from checkpoint, the shape in current model is torch.Size([384, 1280]).
size mismatch for decoder.rnn_input.weight: copying a param with shape torch.Size([1024, 640]) from checkpoint, the shape in current model is torch.Size([1024, 1152]).
size mismatch for decoder.stop_proj.weight: copying a param with shape torch.Size([1, 1536]) from checkpoint, the shape in current model is torch.Size([1, 2048]).
QWindowsWindow::setGeometry: Unable to set geometry 1992x1001+0+29 (frame: 2010x1048-9-9) on QWidgetWindow/"UIClassWindow" on "\.\DISPLAY1". Resulting geometry: 1920x1001+0+29 (frame: 1938x1048-9-9) margins: 9, 38, 9, 9 minimum size: 1992x583 MINMAXINFO maxSize=0,0 maxpos=0,0 mintrack=2010,630 maxtrack=0,0)
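Hello. Apart from the 75k-step synthesizer, which runs normally, the 25k, 150k, and 200k ones all produce similar errors. This is the error from loading the mandarin_200k.pt synthesizer. Is there a solution by now? Thanks.
I have tried every fix suggested above and none of them worked. I don't know where in the runtime environment the real cause of the mismatch lies.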
RuntimeError: Error(s) in loading state_dict for Tacotron: size mismatch for encoder.embedding.weight: copying a param with shape torch.Size([75, 512]) from checkpoint, the shape in current model is torch.Size([70, 512]). size mismatch for gst.stl.attention.W_query.weight: copying a param with shape torch.Size([512, 256]) from checkpoint, the shape in current model is torch.Size([512, 512]).
Traceback:
File "C:\Users\Admin\AppData\Local\Programs\Python\Python310\lib\site-packages\streamlit\runtime\scriptrunner\script_runner.py", line 552, in _run_script
exec(code, module.dict)
File "C:\Users\Admin\AppData\Local\Temp\tmpsj6156uv.py", line 13, in
render_streamlit_ui()
File "E:\GithubProjects\MockingBird-main\control\mkgui\base\ui\streamlit_ui.py", line 909, in render_streamlit_ui
session_state.output_data = opyrator(input=input_data_obj)
File "E:\GithubProjects\MockingBird-main\control\mkgui\base\core.py", line 203, in call
return self.function(input_obj, **kwargs)
File "E:\GithubProjects\MockingBird-main\control\mkgui\app.py", line 140, in synthesize
specs = current_synt.synthesize_spectrograms(texts, embeds)
File "E:\GithubProjects\MockingBird-main\models\synthesizer\inference.py", line 91, in synthesize_spectrograms
self.load()
File "E:\GithubProjects\MockingBird-main\models\synthesizer\inference.py", line 69, in load
self._model.load(self.model_fpath, self.device)
File "E:\GithubProjects\MockingBird-main\models\synthesizer\models\base.py", line 55, in load
self.load_state_dict(state, strict=False)
File "C:\Users\Admin\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\nn\modules\module.py", line 2041, in load_state_dict
raise RuntimeError('Error(s) in loading state_dict for {}:\n\t{}'.format(
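Following this, I modified line 11 of symbols.py under MockingBird-main\models\synthesizer\utils, but it still errors. I don't know why.
In my own use I found that when the size mismatch occurs, some say it is caused by how the text in the input box is split. The original Real-Time-Voice-Cloning repository has this problem too.
After clicking a few more times the error seems to stop appearing, but the output audio is still mostly noise.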
Same here. Has this been solved?
Still errors after the change:
RuntimeError: Error(s) in loading state_dict for Tacotron: size mismatch for encoder.embedding.weight: copying a param with shape torch.Size([75, 512]) from checkpoint, the shape in current model is torch.Size([70, 512]). size mismatch for gst.stl.attention.W_query.weight: copying a param with shape torch.Size([512, 256]) from checkpoint, the shape in current model is torch.Size([512, 512]).
Traceback:
File "/Users/ywy/Library/Python/3.11/lib/python/site-packages/streamlit/runtime/scriptrunner/script_runner.py", line 535, in _run_script
exec(code, module.dict)
File "/private/var/folders/53/3r03mt7d4v9bsvhljvnd_zs80000gn/T/tmpo53ek00n.py", line 13, in
render_streamlit_ui()
File "/Users/ywy/MockingBird/control/mkgui/base/ui/streamlit_ui.py", line 909, in render_streamlit_ui
session_state.output_data = opyrator(input=input_data_obj)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Users/ywy/MockingBird/control/mkgui/base/core.py", line 203, in call
return self.function(input_obj, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Users/ywy/MockingBird/control/mkgui/app.py", line 140, in synthesize
specs = current_synt.synthesize_spectrograms(texts, embeds)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Users/ywy/MockingBird/models/synthesizer/inference.py", line 91, in synthesize_spectrograms
self.load()
File "/Users/ywy/MockingBird/models/synthesizer/inference.py", line 69, in load
self._model.load(self.model_fpath, self.device)
File "/Users/ywy/MockingBird/models/synthesizer/models/base.py", line 55, in load
self.load_state_dict(state, strict=False)
File "/Library/Frameworks/Python.framework/Versions/3.11/lib/python3.11/site-packages/torch/nn/modules/module.py", line 2041, in load_state_dict
raise RuntimeError('Error(s) in loading state_dict for {}:\n\t{}'.format(
Traceback (most recent call last):
File "D:\python\tts\chatgpt\tts\MockingBird-main\control\toolbox_init_.py", line 144, in
func = lambda: self.synthesize() or self.vocode()
^^^^^^^^^^^^^^^^^
File "D:\python\tts\chatgpt\tts\MockingBird-main\control\toolbox_init_.py", line 260, in synthesize
specs = self.synthesizer.synthesize_spectrograms(texts, embeds, style_idx=int(self.ui.style_slider.value()), min_stop_token=min_token, steps=int(self.ui.length_slider.value())*200)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "D:\python\tts\chatgpt\tts\MockingBird-main\models\synthesizer\inference.py", line 91, in synthesize_spectrograms
self.load()
File "D:\python\tts\chatgpt\tts\MockingBird-main\models\synthesizer\inference.py", line 69, in load
self._model.load(self.model_fpath, self.device)
File "D:\python\tts\chatgpt\tts\MockingBird-main\models\synthesizer\models\base.py", line 55, in load
self.load_state_dict(state, strict=False)
File "C:\ProgramData\anaconda3\envs\pytorch\Lib\site-packages\torch\nn\modules\module.py", line 2152, in load_state_dict
raise RuntimeError('Error(s) in loading state_dict for {}:\n\t{}'.format(
RuntimeError: Error(s) in loading state_dict for Tacotron:
size mismatch for encoder.embedding.weight: copying a param with shape torch.Size([75, 512]) from checkpoint, the shape in current model is torch.Size([70, 512]).
size mismatch for gst.stl.attention.W_query.weight: copying a param with shape torch.Size([512, 256]) from checkpoint, the shape in current model is torch.Size([512, 512]).
Same here; still errors after the change.
RuntimeError: Error(s) in loading state_dict for Tacotron: size mismatch for encoder.embedding.weight: copying a param with shape torch.Size([75, 512]) from checkpoint, the shape in current model is torch.Size([70, 512]). size mismatch for gst.stl.attention.W_query.weight: copying a param with shape torch.Size([512, 256]) from checkpoint, the shape in current model is torch.Size([512, 512]).
RuntimeError: Error(s) in loading state_dict for Tacotron: size mismatch for encoder_proj.weight: copying a param with shape torch.Size([128, 512]) from checkpoint, the shape in current model is torch.Size([128, 1024]). size mismatch for decoder.attn_rnn.weight_ih: copying a param with shape torch.Size([384, 768]) from checkpoint, the shape in current model is torch.Size([384, 1280]). size mismatch for decoder.rnn_input.weight: copying a param with shape torch.Size([1024, 640]) from checkpoint, the shape in current model is torch.Size([1024, 1152]). size mismatch for decoder.stop_proj.weight: copying a param with shape torch.Size([1, 1536]) from checkpoint, the shape in current model is torch.Size([1, 2048]).
Same problem.
: Error(s) in loading state_dict for Tacotron: size mismatch for encoder.embedding.weight: copying a param with shape torch.Size([75, 512]) from checkpoint, the shape in current model is torch.Size([70, 512]). size mismatch for gst.stl.attention.W_query.weight: copying a param with shape torch.Size([512, 256]) from checkpoint, the shape in current model is torch.Size([512, 512]). Still errors after the change. Has anyone solved this?
Is there a newer solution? It still doesn't work after the change; only 0.0.1 works.
Same problem. It still doesn't work after the change. Is there any solution?
RuntimeError: Error(s) in loading state_dict for Tacotron: size mismatch for gst.stl.attention.W_query.weight: copying a param with shape torch.Size([512, 256]) from checkpoint, the shape in current model is torch.Size([512, 512]).