auspicious3000/SpeechSplit

Could you please describe the details of rhythm-only conversion?

dbkest opened this issue · 1 comment

I don't understand how to get the alignment when the input utterance to the rhythm encoder is different from the input utterance to the pitch and content encoders. P.S. I also don't understand the implementation details of the variant in Appendix B.3. Thank you, sincerely.

You can understand it by reading the code for pitch conversion.