Error while training
Closed this issue · 5 comments
Hello! I am trying to train VITS using exactly the dataset and recipes you provided; the only thing I changed is the dataset paths in the recipes. But for some reason I am getting this error:
```
/workspace/TTS/TTS/tts/models/vits.py:1454: UserWarning: The use of `x.T` on tensors of dimension other than 2 to reverse their shape is deprecated and it will throw an error in a future release. Consider `x.mT` to transpose batches of matrices or `x.permute(*torch.arange(x.ndim - 1, -1, -1))` to reverse the dimensions of a tensor. (Triggered internally at ../aten/src/ATen/native/TensorShape.cpp:3571.)
  test_figures["{}-alignment".format(idx)] = plot_alignment(alignment.T, output_fig=False)
 ! Run is removed from /workspace/Persian-tts-coqui/recepies/vits/vits_fa_female-April-24-2023_09+16AM-9a4e5f8
Traceback (most recent call last):
  File "/opt/conda/lib/python3.8/site-packages/trainer/trainer.py", line 1591, in fit
    self._fit()
  File "/opt/conda/lib/python3.8/site-packages/trainer/trainer.py", line 1548, in _fit
    self.test_run()
  File "/opt/conda/lib/python3.8/site-packages/trainer/trainer.py", line 1466, in test_run
    test_outputs = self.model.test_run(self.training_assets)
  File "/opt/conda/lib/python3.8/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context
    return func(*args, **kwargs)
  File "/workspace/TTS/TTS/tts/models/vits.py", line 1454, in test_run
    test_figures["{}-alignment".format(idx)] = plot_alignment(alignment.T, output_fig=False)
  File "/workspace/TTS/TTS/tts/utils/visual.py", line 18, in plot_alignment
    im = ax.imshow(
  File "/opt/conda/lib/python3.8/site-packages/matplotlib/__init__.py", line 1447, in inner
    return func(ax, *map(sanitize_sequence, args), **kwargs)
  File "/opt/conda/lib/python3.8/site-packages/matplotlib/axes/_axes.py", line 5523, in imshow
    im.set_data(X)
  File "/opt/conda/lib/python3.8/site-packages/matplotlib/image.py", line 711, in set_data
    raise TypeError("Invalid shape {} for image data"
TypeError: Invalid shape (10,) for image data
```
Could you help me figure out what the issue is? Thank you, and great work on the model!
Hi,
1- Could you please share more information about your environment? Please send the output of these commands:
wget https://raw.githubusercontent.com/coqui-ai/TTS/main/TTS/bin/collect_env_info.py
python collect_env_info.py
2- Are you using a Jupyter notebook to run the code? If yes, please save your notebook with all outputs and send it here.
Hello! I can tell you about the environment, of course, but the Python script "collect_env_info.py" does not run. I am running this in a Docker container on a Linux server, with multiple GPUs attached to the container. Thanks!
OK. Please send the full execution log.
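Since collect_env_info.py does not run in your container, a rough substitute might help. The sketch below is not the official script; it only prints the Python version, the platform, and the installed versions of a few packages. The package list is an assumption based on the traceback above, and it uses only the Python standard library (Python 3.8+).

```python
import platform
import sys
from importlib import metadata

# Assumed list of relevant packages, guessed from the traceback;
# "TTS" and "trainer" are the PyPI names of the Coqui packages.
PACKAGES = ["TTS", "trainer", "torch", "numpy", "matplotlib"]


def collect_versions(packages=PACKAGES):
    """Return a dict mapping each package name to its installed version."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            # Keep missing packages in the report instead of failing.
            versions[name] = "not installed"
    return versions


if __name__ == "__main__":
    print("python  :", sys.version.split()[0])
    print("platform:", platform.platform())
    for name, version in collect_versions().items():
        print(f"{name:<10}: {version}")
```

Running it inside the container and pasting the output here would make it easier to compare your versions against the ones the recipe was written for.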
It is probably caused by the versions of your installed packages. I don't know much about Docker, but these resources could help you:
It was in fact because of the package version! Thank you for the help. I have another question about phonemizing the dataset: how did you phonemize yours? Did you use any specific tools? Thank you.
No, I just used eSpeak as the phonemizer. As you can see in my config, use_phonemes=True uses eSpeak as the default phonemizer. You can change the phonemizer; see:
- https://tts.readthedocs.io/en/latest/models/vits.html?highlight=phonemizer#vitsconfig
- https://tts.readthedocs.io/en/latest/implementing_a_new_language_frontend.html?highlight=phonemizer#implementing-a-new-language-frontend
But there are also some good libraries for that, like persian_phonemizer. mimic-3 also has a method that handles the kasre ezafe (Ezāfe) problem; here is the code for that: _fix_words
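For reference, the eSpeak setup described above might look roughly like this in a recipe. This is only a sketch: the field names follow the VitsConfig documentation linked above, and use_phonemes=True comes from this thread, but the language code and cache path are assumptions for a Persian dataset and may differ from the actual recipe.

```python
# Phoneme-related settings, kept in a plain dict so they can be inspected
# without Coqui TTS installed.
phoneme_settings = {
    "use_phonemes": True,       # from the thread
    "phonemizer": "espeak",     # the default named in the thread, made explicit
    "phoneme_language": "fa",   # assumption: eSpeak language code for Persian
    "phoneme_cache_path": "phoneme_cache",  # hypothetical cache directory
}

# With Coqui TTS installed, these would be passed to the config, e.g.:
#   from TTS.tts.configs.vits_config import VitsConfig
#   config = VitsConfig(**phoneme_settings)
```

Switching to another phonemizer would mean changing the "phonemizer" value to one the library supports, as described in the language frontend guide linked above.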