TTS synthesis via pip module throws torch exception for short phrases
mischief opened this issue · 1 comment
mischief commented
I've installed the module via pip:
pip3 install --force-reinstall --user --upgrade TTS
However, TTS synthesis fails for the short phrases and single words I have tried:
$ tts --text "whatever" --model_name tts_models/en/ljspeech/speedy-speech 2>&1 | xclip
> tts_models/en/ljspeech/speedy-speech is already downloaded.
> vocoder_models/en/ljspeech/hifigan_v2 is already downloaded.
> Using model: speedy_speech
> Vocoder Model: hifigan
> Generator Model: hifigan_generator
> Discriminator Model: hifigan_discriminator
Removing weight norm...
> Text: whatever
> Text splitted to sentences.
['whatever']
Traceback (most recent call last):
  File "<my home dir>/.local/bin/tts", line 8, in <module>
    sys.exit(main())
  File "<my home dir>/.local/lib/python3.8/site-packages/TTS/bin/synthesize.py", line 257, in main
    wav = synthesizer.tts(args.text, args.speaker_idx, args.speaker_wav)
  File "<my home dir>/.local/lib/python3.8/site-packages/TTS/utils/synthesizer.py", line 228, in tts
    outputs = synthesis(
  File "<my home dir>/.local/lib/python3.8/site-packages/TTS/tts/utils/synthesis.py", line 281, in synthesis
    outputs = run_model_torch(model, text_inputs, speaker_id, style_mel, d_vector=d_vector)
  File "<my home dir>/.local/lib/python3.8/site-packages/TTS/tts/utils/synthesis.py", line 92, in run_model_torch
    outputs = _func(
  File "<my home dir>/.local/lib/python3.8/site-packages/torch/autograd/grad_mode.py", line 28, in decorate_context
    return func(*args, **kwargs)
  File "<my home dir>/.local/lib/python3.8/site-packages/TTS/tts/models/forward_tts.py", line 590, in inference
    o_en, x_mask, g, _ = self._forward_encoder(x, x_mask, g)
  File "<my home dir>/.local/lib/python3.8/site-packages/TTS/tts/models/forward_tts.py", line 357, in _forward_encoder
    o_en = self.encoder(torch.transpose(x_emb, 1, -1), x_mask)
  File "<my home dir>/.local/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1102, in _call_impl
    return forward_call(*input, **kwargs)
  File "<my home dir>/.local/lib/python3.8/site-packages/TTS/tts/layers/feed_forward/encoder.py", line 161, in forward
    o = self.encoder(x, x_mask)
  File "<my home dir>/.local/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1102, in _call_impl
    return forward_call(*input, **kwargs)
  File "<my home dir>/.local/lib/python3.8/site-packages/TTS/tts/layers/feed_forward/encoder.py", line 71, in forward
    o = self.res_conv_block(o, x_mask)
  File "<my home dir>/.local/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1102, in _call_impl
    return forward_call(*input, **kwargs)
  File "<my home dir>/.local/lib/python3.8/site-packages/TTS/tts/layers/generic/res_conv_bn.py", line 124, in forward
    o = block(o)
  File "<my home dir>/.local/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1102, in _call_impl
    return forward_call(*input, **kwargs)
  File "<my home dir>/.local/lib/python3.8/site-packages/TTS/tts/layers/generic/res_conv_bn.py", line 79, in forward
    return self.conv_bn_blocks(x)
  File "<my home dir>/.local/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1102, in _call_impl
    return forward_call(*input, **kwargs)
  File "<my home dir>/.local/lib/python3.8/site-packages/torch/nn/modules/container.py", line 141, in forward
    input = module(input)
  File "<my home dir>/.local/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1102, in _call_impl
    return forward_call(*input, **kwargs)
  File "<my home dir>/.local/lib/python3.8/site-packages/TTS/tts/layers/generic/res_conv_bn.py", line 42, in forward
    o = self.conv1d(x)
  File "<my home dir>/.local/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1102, in _call_impl
    return forward_call(*input, **kwargs)
  File "<my home dir>/.local/lib/python3.8/site-packages/torch/nn/modules/conv.py", line 301, in forward
    return self._conv_forward(input, self.weight, self.bias)
  File "<my home dir>/.local/lib/python3.8/site-packages/torch/nn/modules/conv.py", line 297, in _conv_forward
    return F.conv1d(input, weight, bias, self.stride,
RuntimeError: Calculated padded input size per channel: (6). Kernel size: (7). Kernel size can't be greater than actual input size
This also occurs for other short phrases I have tried.
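For anyone reading the RuntimeError above: as far as I understand it, PyTorch checks the shape before running a convolution, comparing the padded input length (input length plus padding on both sides) with the effective kernel size (which grows with dilation). The sketch below reproduces that arithmetic in plain Python, without torch. The helper names are made up for illustration. The padding and dilation of the actual speedy-speech encoder layer are not shown in the log, so the only numbers taken from the issue are the 6 and the 7 in the error message.

```python
# Plain-Python sketch of the size check behind the error message.
# All function names here are hypothetical helpers, not part of torch or TTS.

def effective_kernel_size(kernel_size: int, dilation: int = 1) -> int:
    """Span of input positions covered by one (possibly dilated) kernel."""
    return dilation * (kernel_size - 1) + 1


def padded_input_size(input_len: int, padding: int = 0) -> int:
    """Input length after symmetric zero padding on both sides."""
    return input_len + 2 * padding


def conv1d_shape_ok(input_len: int, kernel_size: int,
                    padding: int = 0, dilation: int = 1) -> bool:
    """True if the padded input is at least as long as the effective kernel."""
    return padded_input_size(input_len, padding) >= effective_kernel_size(
        kernel_size, dilation
    )


def min_input_len(kernel_size: int, padding: int = 0, dilation: int = 1) -> int:
    """Shortest input length (at least 1) that passes the check."""
    return max(1, effective_kernel_size(kernel_size, dilation) - 2 * padding)


if __name__ == "__main__":
    # Numbers from the error message: padded size 6, kernel size 7.
    print(conv1d_shape_ok(input_len=6, kernel_size=7))
    # One more padded position would have been enough.
    print(conv1d_shape_ok(input_len=7, kernel_size=7))
    # Assumed example only: a kernel of 4 with dilation 2 also spans 7 positions.
    print(effective_kernel_size(kernel_size=4, dilation=2))
```

If this reading is right, the encoded form of "whatever" ends up one position too short for that layer once padding is counted, which would explain why only short inputs fail.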
stale commented
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions. You might also look at our Discourse page for further help: https://discourse.mozilla.org/c/tts