soobinseo/Transformer-TTS

About Guided Attention loss

Opened this issue · 2 comments

Hello. I am trying to use your model for a mel-spectrogram-to-mel-spectrogram conversion task.

I tried to add a guided attention loss like this:

dal_g = guided_attention(Nt, Ns)  # This function is included in your code but is not used.
dal_loss = (hp.lamda_attn_dal * dal_g * attn_probs).abs().mean()
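
For reference, here is a minimal sketch of how a guided attention penalty is typically formulated (following the DC-TTS style weight matrix, not necessarily the repo's own `guided_attention` implementation; `Nt`, `Ns`, and the `lamda_attn_dal` weight below are placeholders standing in for the names in the snippet above):

```python
import torch

def guided_attention(Nt, Ns, g=0.2):
    """DC-TTS style guided-attention weights:
    W[t, s] = 1 - exp(-((s/Ns - t/Nt)**2) / (2 * g**2)).
    Close to 0 near the diagonal, approaching 1 far from it."""
    t = torch.arange(Nt, dtype=torch.float32).unsqueeze(1) / Nt  # (Nt, 1)
    s = torch.arange(Ns, dtype=torch.float32).unsqueeze(0) / Ns  # (1, Ns)
    return 1.0 - torch.exp(-((s - t) ** 2) / (2.0 * g ** 2))     # (Nt, Ns)

# Toy usage: attn_probs stands in for one decoder-to-encoder attention map.
Nt, Ns = 80, 50                                    # decoder / encoder lengths
attn_probs = torch.softmax(torch.randn(Nt, Ns), dim=-1)
dal_g = guided_attention(Nt, Ns)
lamda_attn_dal = 1.0                               # placeholder for hp.lamda_attn_dal
dal_loss = lamda_attn_dal * (dal_g * attn_probs).mean()
print(dal_loss.item())  # small when attention mass stays near the diagonal
```

The penalty only pushes attention toward the diagonal; it is usually added to the main reconstruction loss with a small weight so it does not dominate training.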

I found that this loss does not decay, while the seq loss declines quickly, and my training does not converge.

I would appreciate your help. Thank you!

Hi, did you get this working in the end? While training on the LJSpeech dataset, I noticed that the diagonal alignment does not appear in the decoder self-attention or the encoder-decoder attention, only in the encoder self-attention, even around 160K iterations. Please let me know if you had similar issues. Many thanks!

ywh-my commented

Sorry, I abandoned this project long ago. >_<