mozilla/TTS

TTS synthesis via pip module throws torch exception for short phrases

mischief opened this issue · 1 comment

I've installed the module via pip:

pip3 install --force-reinstall --user --upgrade TTS

However, TTS synthesis fails for the short phrases and single words I have tried, for example:

$ tts --text "whatever" --model_name tts_models/en/ljspeech/speedy-speech 2>&1 | xclip
 > tts_models/en/ljspeech/speedy-speech is already downloaded.
 > vocoder_models/en/ljspeech/hifigan_v2 is already downloaded.
 > Using model: speedy_speech
 > Vocoder Model: hifigan
 > Generator Model: hifigan_generator
 > Discriminator Model: hifigan_discriminator
Removing weight norm...
 > Text: whatever
 > Text splitted to sentences.
['whatever']
Traceback (most recent call last):
  File "<my home dir>/.local/bin/tts", line 8, in <module>
    sys.exit(main())
  File "<my home dir>/.local/lib/python3.8/site-packages/TTS/bin/synthesize.py", line 257, in main
    wav = synthesizer.tts(args.text, args.speaker_idx, args.speaker_wav)
  File "<my home dir>/.local/lib/python3.8/site-packages/TTS/utils/synthesizer.py", line 228, in tts
    outputs = synthesis(
  File "<my home dir>/.local/lib/python3.8/site-packages/TTS/tts/utils/synthesis.py", line 281, in synthesis
    outputs = run_model_torch(model, text_inputs, speaker_id, style_mel, d_vector=d_vector)
  File "<my home dir>/.local/lib/python3.8/site-packages/TTS/tts/utils/synthesis.py", line 92, in run_model_torch
    outputs = _func(
  File "<my home dir>/.local/lib/python3.8/site-packages/torch/autograd/grad_mode.py", line 28, in decorate_context
    return func(*args, **kwargs)
  File "<my home dir>/.local/lib/python3.8/site-packages/TTS/tts/models/forward_tts.py", line 590, in inference
    o_en, x_mask, g, _ = self._forward_encoder(x, x_mask, g)
  File "<my home dir>/.local/lib/python3.8/site-packages/TTS/tts/models/forward_tts.py", line 357, in _forward_encoder
    o_en = self.encoder(torch.transpose(x_emb, 1, -1), x_mask)
  File "<my home dir>/.local/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1102, in _call_impl
    return forward_call(*input, **kwargs)
  File "<my home dir>/.local/lib/python3.8/site-packages/TTS/tts/layers/feed_forward/encoder.py", line 161, in forward
    o = self.encoder(x, x_mask)
  File "<my home dir>/.local/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1102, in _call_impl
    return forward_call(*input, **kwargs)
  File "<my home dir>/.local/lib/python3.8/site-packages/TTS/tts/layers/feed_forward/encoder.py", line 71, in forward
    o = self.res_conv_block(o, x_mask)
  File "<my home dir>/.local/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1102, in _call_impl
    return forward_call(*input, **kwargs)
  File "<my home dir>/.local/lib/python3.8/site-packages/TTS/tts/layers/generic/res_conv_bn.py", line 124, in forward
    o = block(o)
  File "<my home dir>/.local/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1102, in _call_impl
    return forward_call(*input, **kwargs)
  File "<my home dir>/.local/lib/python3.8/site-packages/TTS/tts/layers/generic/res_conv_bn.py", line 79, in forward
    return self.conv_bn_blocks(x)
  File "<my home dir>/.local/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1102, in _call_impl
    return forward_call(*input, **kwargs)
  File "<my home dir>/.local/lib/python3.8/site-packages/torch/nn/modules/container.py", line 141, in forward
    input = module(input)
  File "<my home dir>/.local/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1102, in _call_impl
    return forward_call(*input, **kwargs)
  File "<my home dir>/.local/lib/python3.8/site-packages/TTS/tts/layers/generic/res_conv_bn.py", line 42, in forward
    o = self.conv1d(x)
  File "<my home dir>/.local/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1102, in _call_impl
    return forward_call(*input, **kwargs)
  File "<my home dir>/.local/lib/python3.8/site-packages/torch/nn/modules/conv.py", line 301, in forward
    return self._conv_forward(input, self.weight, self.bias)
  File "<my home dir>/.local/lib/python3.8/site-packages/torch/nn/modules/conv.py", line 297, in _conv_forward
    return F.conv1d(input, weight, bias, self.stride,
RuntimeError: Calculated padded input size per channel: (6). Kernel size: (7). Kernel size can't be greater than actual input size

The same error occurs for the other short phrases I have tried.
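
For context, the final RuntimeError reads as a shape check: at the failing Conv1d layer the padded input length (6) is smaller than the kernel width (7). Below is a minimal sketch of that check in plain Python. It assumes the reported kernel size is the dilated width, dilation * (kernel_size - 1) + 1. The function name conv1d_fits and its parameters are hypothetical and are not part of TTS or PyTorch.

```python
def conv1d_fits(input_len, kernel_size, padding=0, dilation=1):
    """Return True if a 1-D convolution can slide over an input of this length.

    Hypothetical helper for illustration only. It mirrors the condition the
    error message describes: the (dilated) kernel must not be wider than the
    input after padding has been added on both sides.
    """
    padded_len = input_len + 2 * padding
    # Assumption: the kernel width in the message includes dilation.
    effective_kernel = dilation * (kernel_size - 1) + 1
    return effective_kernel <= padded_len


# The sizes reported in the traceback: padded input 6, kernel 7.
print(conv1d_fits(6, 7))             # False: kernel wider than input
# One more input step, or symmetric padding, would satisfy the check.
print(conv1d_fits(7, 7))             # True
print(conv1d_fits(6, 7, padding=3))  # True
```

If this reading is right, any input whose encoded length falls below the widest kernel in the encoder would hit the same error, which is consistent with only short phrases failing.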

stale commented

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions. You might also look at our Discourse page for further help: https://discourse.mozilla.org/c/tts