xcmyz/FastSpeech

Duration loss calculated in log domain or linear domain

MorganCZY opened this issue · 0 comments

I notice that the original implementation of FastSpeech (integrated in ESPNet) calculates the duration loss in the log domain, which means the logarithm of the target duration is taken first. In your version, the duration loss is calculated directly in the linear domain. Do you have any thoughts on the two approaches?
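To make the difference concrete, here is a minimal sketch of the two variants in plain Python. It is not taken from either repository: the function names, the example durations, and the `offset` of 1.0 (added before the logarithm so that zero-length durations stay finite) are my assumptions for illustration only.

```python
import math


def linear_duration_loss(pred, target):
    """MSE between predicted and target durations, both in frames."""
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(target)


def log_duration_loss(pred_log, target, offset=1.0):
    """MSE between predicted log-durations and log(target + offset).

    `offset` is an assumed constant that keeps log() finite for
    durations of zero frames.
    """
    log_target = [math.log(t + offset) for t in target]
    return sum((p - lt) ** 2 for p, lt in zip(pred_log, log_target)) / len(target)


# Hypothetical example: one short token (2 frames) and one long token (40 frames),
# each predicted 2 frames too long.
target = [2, 40]
pred = [4.0, 42.0]
pred_log = [math.log(p + 1.0) for p in pred]

# Linear domain: both tokens contribute the same squared error.
linear_loss = linear_duration_loss(pred, target)

# Log domain: compare the per-token squared errors.
short_err = (pred_log[0] - math.log(target[0] + 1.0)) ** 2
long_err = (pred_log[1] - math.log(target[1] + 1.0)) ** 2
log_loss = log_duration_loss(pred_log, target)

print(linear_loss, short_err, long_err, log_loss)
```

In this sketch the linear-domain loss treats a 2-frame error identically for both tokens, while the log-domain loss penalizes the same absolute error far more on the short token than on the long one, so it behaves like a relative error. A model trained in the log domain would also need its predictions mapped back to frames (for example with `exp(x) - offset` and rounding) at inference time.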