Adding another speaker

Question

JakubReha opened this issue 5 years ago · 5 comments

I am trying to train the pre-trained LibriTTS model with one more speaker. I've added around 15 minutes of audio from this speaker to the train-clean-100 dataset, added the transcriptions in an 85:15 ratio (train:validation), and increased the number of speakers to 124 in hparams.py. All the audio files were also resampled to 22,050 Hz, 16-bit. But when I run inference on the checkpoints, I get only noise for all the speakers.

Answer 1 · 2020-05-08T11:07:29.000Z

Check that the files are definitely 16-bit and have a similar volume to the other speakers.
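A quick way to check both points is a small script along these lines. This is only a sketch using the Python standard library: the helper name `inspect_wav` is made up for illustration and is not part of the repo, and it assumes the files are uncompressed PCM WAV (the `wave` module raises an error on other encodings, which is itself a useful signal).

```python
import math
import struct
import sys
import wave


def inspect_wav(path):
    """Report sample rate, bit depth, channels and RMS level (dBFS) of a PCM WAV file.

    Hypothetical helper for illustration; not part of the training code.
    """
    # wave.open raises wave.Error if the file is not uncompressed PCM.
    with wave.open(path, "rb") as wav:
        sample_rate = wav.getframerate()
        sample_width = wav.getsampwidth()  # bytes per sample; 2 means 16-bit
        channels = wav.getnchannels()
        frames = wav.readframes(wav.getnframes())

    rms_dbfs = None  # only computed for 16-bit audio
    count = len(frames) // 2
    if sample_width == 2 and count > 0:
        # Interpret the raw bytes as little-endian signed 16-bit samples.
        samples = struct.unpack("<%dh" % count, frames[: count * 2])
        rms = math.sqrt(sum(s * s for s in samples) / count)
        # Level relative to 16-bit full scale (32768); silence maps to -inf.
        rms_dbfs = 20 * math.log10(rms / 32768) if rms > 0 else float("-inf")

    return {
        "sample_rate": sample_rate,
        "bit_depth": sample_width * 8,
        "channels": channels,
        "rms_dbfs": rms_dbfs,
    }


if __name__ == "__main__":
    # Usage: python inspect_wav.py file1.wav file2.wav ...
    for wav_path in sys.argv[1:]:
        print(wav_path, inspect_wav(wav_path))
```

Running it over a few files from the new speaker and a few from the existing speakers makes it easy to compare `bit_depth`, `sample_rate` and `rms_dbfs` side by side. A large gap in `rms_dbfs` would suggest the new recordings need level adjustment.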
The Source Mel should have more detail. Like this

Answer 2 · 2020-05-08T12:52:29.000Z

They are 16-bit, and the volume of the extra speaker is slightly higher, but the real problem is that the source mel stays the same regardless of the speaker.

Answer 3 · 2020-05-09T00:05:04.000Z

@JakubReha
The audio file path for the source mel is printed here. Would you be able to upload and/or check that audio file?

Answer 4 · 2020-05-09T01:39:19.000Z

@JakubReha The mel-spectrogram looks suspicious. Can you share an audio file?

Answer 5 · 2020-07-09T21:37:49.000Z

Closing due to inactivity.