About Guided Attention loss
Opened this issue · 2 comments
Hello. I'm trying to use your model for a mel-spectrogram-to-mel-spectrogram conversion task.
I tried to add a guided attention loss like this:
dal_g = guided_attention(Nt, Ns)  # this function is included in your code but never used
dal_loss = (hp.lamda_attn_dal * dal_g * attn_probs).abs().mean()
I found that this loss does not decay, while the sequence loss declines quickly, and training fails to converge.
I would appreciate your help. Thank you!
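For reference, here is a minimal sketch of how a guided attention loss is typically computed (following the soft-diagonal penalty from the DC-TTS paper). This is an assumption about how `guided_attention` and `attn_probs` are shaped, not the repository's actual implementation; `g` and `lam` are hypothetical hyperparameters:

```python
import torch

def guided_attention_mask(Nt, Ns, g=0.2):
    # Soft diagonal penalty: W[n, s] = 1 - exp(-((n/Nt - s/Ns)^2) / (2 g^2)).
    # Near-diagonal entries are ~0, off-diagonal entries approach 1.
    n = torch.arange(Nt, dtype=torch.float32).unsqueeze(1) / Nt
    s = torch.arange(Ns, dtype=torch.float32).unsqueeze(0) / Ns
    return 1.0 - torch.exp(-((n - s) ** 2) / (2 * g ** 2))

def guided_attention_loss(attn_probs, g=0.2, lam=1.0):
    # attn_probs: attention weights of shape (batch, Nt, Ns), assumed non-negative.
    # Penalizes attention mass that falls far from the diagonal.
    Nt, Ns = attn_probs.shape[-2:]
    W = guided_attention_mask(Nt, Ns, g).to(attn_probs.device)
    return lam * (attn_probs * W).mean()
```

Note that softmax attention weights are already non-negative, so the `.abs()` in the snippet above is redundant; also, padded time steps are usually masked out of the mean so the penalty isn't diluted by padding.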
Hi, did you get this working in the end? While training on the LJSpeech dataset, I notice that the diagonal alignment doesn't appear in the decoder attention or the encoder-decoder attention, but only in the encoder attention, even around 160K iterations. Please let me know if you had similar issues. Many thanks!
Sorry, I abandoned this project long ago. >_<