Unstable audio generation towards the end of longer sentences using inference.py

Question

Using inference.py, I get unstable audio towards the end of longer sentences. Sample audio: LongSentence

Answer 1 · 2020-10-25T16:17:47.000Z

Thanks. Could you also post the text of the audio?

Answer 2 · 2020-10-26T10:47:47.000Z

Sure, the text of the audio is as follows: "His father, Alexander Sascha Schapiro (also known as Alexander Tanaroff), had Hasidic Jewish roots and had been imprisoned in Russia before moving to Germany in 1922, while his mother, Johanna Hanka Grothendieck, came from a Protestant family in Hamburg and worked as a journalist."

Answer 3 · 2021-03-24T08:24:55.000Z

Thanks for the reply, and sorry for my late response. Other people have reported this problem as well.
I will probably not fix this for SpeedySpeech 1, but I will address it in the upcoming version. :)
If you want to synthesize longer sentences, you can simply split the sentence by replacing a comma near the middle with a period. The result should then be reasonable.
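
For anyone who wants to automate that workaround, here is a minimal sketch of the comma-to-period split as a text preprocessing step. It is not part of SpeedySpeech: the function name `split_long_sentence` and the `max_chars` threshold of 120 are assumptions made for illustration, and the right threshold for your model would need to be found by listening.

```python
def split_long_sentence(text: str, max_chars: int = 120) -> str:
    """Replace the comma closest to the middle with a period, repeating
    until every piece is at most max_chars long or has no commas left.

    Hypothetical helper, not part of SpeedySpeech. max_chars is a guess.
    Note: this also splits at commas inside numbers such as "1,000".
    """
    text = text.strip()
    if len(text) <= max_chars:
        return text

    comma_positions = [i for i, ch in enumerate(text) if ch == ","]
    if not comma_positions:
        return text  # nothing to split on, leave the text unchanged

    # Pick the comma nearest to the middle of the text.
    middle = len(text) // 2
    cut = min(comma_positions, key=lambda i: abs(i - middle))

    left = text[:cut].strip()
    right = text[cut + 1:].strip()
    if not left or not right:
        return text  # comma at the very start or end, splitting would not help

    # Start the second piece with a capital letter, like a new sentence.
    right = right[0].upper() + right[1:]

    return (
        split_long_sentence(left + ".", max_chars)
        + " "
        + split_long_sentence(right, max_chars)
    )


if __name__ == "__main__":
    sentence = (
        "His father, Alexander Sascha Schapiro (also known as Alexander "
        "Tanaroff), had Hasidic Jewish roots and had been imprisoned in "
        "Russia before moving to Germany in 1922, while his mother, Johanna "
        "Hanka Grothendieck, came from a Protestant family in Hamburg and "
        "worked as a journalist."
    )
    print(split_long_sentence(sentence))
```

The output can then be passed to the synthesis script in place of the original text. Splitting at the comma nearest the middle keeps the pieces roughly balanced, which matters here because the instability shows up towards the end of long inputs.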