shape '[1, 1, 108607]' is invalid for input of size 217214
azman-i opened this issue · 2 comments
azman-i commented
Hi, I am trying to run Flowtron on a Bangla TTS dataset. I have changed symbols.py (Bangla symbols included), cmudict.py (Bangla word and phoneme dictionary), and config.json (pointed to the Bangla dataset). Now I am facing this error. What could be the cause?
After running `python train.py -c config.json -p train_config.output_directory=outdir data_config.use_attn_prior=1`:
```
Epoch: 0
/home/azman/texttospeech/Flowtron/flowtron/data.py:56: UserWarning: The given NumPy array is not writeable, and PyTorch does not support non-writeable tensors. This means you can write to the underlying (supposedly non-writeable) NumPy array using the tensor. You may want to copy the array to protect its data or make it writeable before converting it to a tensor. This type of warning will be suppressed for the rest of this program. (Triggered internally at /pytorch/torch/csrc/utils/tensor_numpy.cpp:141.)
return torch.from_numpy(data).float(), sampling_rate
Traceback (most recent call last):
File "train.py", line 415, in <module>
train(n_gpus, rank, **train_config)
File "train.py", line 281, in train
for batch in train_loader:
File "/home/azman/miniconda3/envs/ming_tts/lib/python3.8/site-packages/torch/utils/data/dataloader.py", line 435, in __next__
data = self._next_data()
File "/home/azman/miniconda3/envs/ming_tts/lib/python3.8/site-packages/torch/utils/data/dataloader.py", line 1085, in _next_data
return self._process_data(data)
File "/home/azman/miniconda3/envs/ming_tts/lib/python3.8/site-packages/torch/utils/data/dataloader.py", line 1111, in _process_data
data.reraise()
File "/home/azman/miniconda3/envs/ming_tts/lib/python3.8/site-packages/torch/_utils.py", line 428, in reraise
raise self.exc_type(msg)
RuntimeError: Caught RuntimeError in DataLoader worker process 0.
Original Traceback (most recent call last):
File "/home/azman/miniconda3/envs/ming_tts/lib/python3.8/site-packages/torch/utils/data/_utils/worker.py", line 198, in _worker_loop
data = fetcher.fetch(index)
File "/home/azman/miniconda3/envs/ming_tts/lib/python3.8/site-packages/torch/utils/data/_utils/fetch.py", line 44, in fetch
data = [self.dataset[idx] for idx in possibly_batched_index]
File "/home/azman/miniconda3/envs/ming_tts/lib/python3.8/site-packages/torch/utils/data/_utils/fetch.py", line 44, in <listcomp>
data = [self.dataset[idx] for idx in possibly_batched_index]
File "/home/azman/texttospeech/Flowtron/flowtron/data.py", line 177, in __getitem__
mel = self.get_mel(audio)
File "/home/azman/texttospeech/Flowtron/flowtron/data.py", line 153, in get_mel
melspec = self.stft.mel_spectrogram(audio_norm)
File "/home/azman/texttospeech/Flowtron/flowtron/audio_processing.py", line 130, in mel_spectrogram
magnitudes, phases = self.stft_fn.transform(y)
File "/home/azman/texttospeech/Flowtron/flowtron/audio_processing.py", line 214, in transform
input_data = input_data.view(num_batches, 1, num_samples)
RuntimeError: shape '[1, 1, 108607]' is invalid for input of size 217214
```
letrongan commented
Can you share your config file? @azman63
azman-i commented
I found the solution. If some of your audio files are stereo, you have to convert them from stereo to mono.
https://stackoverflow.com/questions/5120555/how-can-i-convert-a-wav-from-stereo-to-mono-in-python
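For anyone landing here with the same error: the input size in the message, 217214, is exactly 2 × 108607, which is what you would expect when a two-channel file reaches a `view` call that assumes one channel. Below is a minimal sketch of the conversion using only the Python standard library. The function name `stereo_to_mono` is hypothetical (it is not part of Flowtron), and the sketch assumes 16-bit PCM WAV files; other sample widths would need different handling.

```python
import array
import sys
import wave


def stereo_to_mono(src_path, dst_path):
    """Average the channels of a 16-bit PCM WAV file and write a mono copy.

    Hypothetical helper, not part of Flowtron.
    Returns the number of channels found in the source file.
    """
    with wave.open(src_path, "rb") as src:
        n_channels = src.getnchannels()
        sample_width = src.getsampwidth()
        frame_rate = src.getframerate()
        frames = src.readframes(src.getnframes())

    if sample_width != 2:
        raise ValueError("this sketch only handles 16-bit PCM WAV files")

    samples = array.array("h")
    samples.frombytes(frames)
    if sys.byteorder == "big":
        samples.byteswap()  # WAV sample data is little-endian

    # Samples are interleaved (L0 R0 L1 R1 ...); average each frame's channels.
    mono = array.array("h", (
        sum(samples[i:i + n_channels]) // n_channels
        for i in range(0, len(samples), n_channels)
    ))
    if sys.byteorder == "big":
        mono.byteswap()

    with wave.open(dst_path, "wb") as dst:
        dst.setnchannels(1)
        dst.setsampwidth(sample_width)
        dst.setframerate(frame_rate)
        dst.writeframes(mono.tobytes())
    return n_channels
```

The return value can be used to find which files in a dataset were affected: any file for which it is greater than 1 was not mono. Writing the result to a new path rather than overwriting the source keeps the original recordings intact.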