TTS

Question

yukiarimo opened this issue 5 months ago · 8 comments

Hello. Do you know how to turn this: https://github.com/nivibilla/build-nanogpt into TTS instead of audio-to-audio?

Answer 1 · 2024-08-15T11:45:50.000Z

Hey @yukiarimo, I am trying to do that too. Is there any progress on your side? I have made some progress on audio-to-audio:

at first it was just noise
then reduced noise
now no noise, but it sounds like bird voices, I guess.
I'm working on the next upgrade, so I might post about it here.

If you are interested in working on it with me, let me know.

Thanks

Answer 2 · 2024-08-15T11:50:33.000Z

So, I also found 2 things

Enjoy!! :)

Answer 3 · 2024-08-15T13:58:11.000Z

Gonna try it out! But how is that “without tokenizer”?

Answer 4 · 2024-08-15T14:01:36.000Z

I think you are talking about audio-to-audio. For that I built my own tokenizer hehe :'D

Answer 5 · 2024-08-15T14:20:36.000Z

So, the concept behind the tokenizer is batches of data. Take the combined audio (say 50 MB for now), convert it to a mel spectrogram, encode the mel spectrogram into a sequence of integers, and decode the sequence of integers back into a mel spectrogram. The mel spectrogram values are scaled and quantized to a range of integers, and the encoding and decoding steps map between those integers and the mel spectrogram values.

In more general terms, at second 1 of the audio, for example, we have some mel spectrogram data encoded as integers, the same way we had for text:

input: print(encode("hii there"))
output: [46, 47, 47, 1, 58, 46, 43, 56, 43]
input: print(decode(encode("hii there")))
output: hii there
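
For the audio case, the scale-and-quantize round trip described above can be sketched roughly as follows. This is a minimal illustration and not the tokenizer from this thread: the 256-level vocabulary, the frame-by-frame token order, and the tiny hand-written `mel` matrix (standing in for a real log-mel spectrogram in dB) are all assumptions, and the helper names `fit_range`, `encode`, and `decode` are hypothetical.

```python
# Hypothetical sketch: quantize mel spectrogram values into integer tokens and back.
# Assumptions: 256 quantization levels; tokens are emitted frame by frame (time-major).

N_LEVELS = 256  # assumed vocabulary size


def fit_range(mel):
    """Return the (min, max) value over all frames, used to scale the data."""
    flat = [v for frame in mel for v in frame]
    return min(flat), max(flat)


def encode(mel, lo, hi, n_levels=N_LEVELS):
    """Scale each value from [lo, hi] to [0, n_levels - 1] and round to an integer."""
    scale = (n_levels - 1) / (hi - lo) if hi > lo else 0.0
    tokens = []
    for frame in mel:
        for v in frame:
            clipped = min(max(v, lo), hi)  # keep out-of-range values inside [lo, hi]
            tokens.append(int(round((clipped - lo) * scale)))
    return tokens


def decode(tokens, lo, hi, n_mels, n_levels=N_LEVELS):
    """Map integer tokens back to approximate mel values and regroup into frames."""
    step = (hi - lo) / (n_levels - 1)
    values = [lo + t * step for t in tokens]
    return [values[i:i + n_mels] for i in range(0, len(values), n_mels)]


# Made-up example: 2 frames x 3 mel bands, values in dB.
mel = [[-80.0, -42.5, -10.0],
       [-63.2, -20.1, 0.0]]

lo, hi = fit_range(mel)
tokens = encode(mel, lo, hi)
restored = decode(tokens, lo, hi, n_mels=3)

print(tokens)
print(restored)
```

Unlike the text example, this round trip is lossy: `restored` matches `mel` only to within half a quantization step, so more levels give a closer reconstruction at the cost of a larger vocabulary for the model.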

Let me know if you can build on top of this, thanks.

Answer 6 · 2024-08-17T10:05:33.000Z

@Momnadar1 https://github.com/tttzof351/SimpleTransfromerTTS doesn't work

Answer 7 · 2024-08-17T10:13:16.000Z

I will send you the Colab link for this, where it's working for me. Thanks

Answer 8 · 2024-08-17T11:45:47.000Z

Hi, @yukiarimo here is the link: https://colab.research.google.com/drive/1NHFi8y1GCIUR4Nv0yguGVwOk2q0-JOEu?usp=sharing.

But take a look at the attached images of train and test loss etc. in https://github.com/tttzof351/SimpleTransfromerTTS. They show that it takes nearly 400K iterations to generate good results.

If you still have issues, just let me know.

Thanks,