About Guided Attention loss
Opened this issue · 2 comments
Hello. I'm trying to use your model for a mel-spectrogram-to-mel-spectrogram conversion task.
I tried to add a guided attention loss like this:
dal_g = guided_attention(Nt, Ns)  # this function is included in your code but never used
dal_loss = (hp.lamda_attn_dal * dal_g * attn_probs).abs().mean()
I found that this loss does not decay, while the sequence loss declines quickly, and training fails to converge.
I would appreciate your help. Thank you!
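For reference, here is a minimal sketch of how a guided attention loss is typically computed (following the soft-diagonal penalty from the DC-TTS paper). This is an assumption about how `guided_attention` and `attn_probs` are shaped, not the repository's actual implementation; `g` and `lam` are hypothetical hyperparameters:

```python
import torch

def guided_attention_mask(Nt, Ns, g=0.2):
    # Soft diagonal penalty: W[n, s] = 1 - exp(-((n/Nt - s/Ns)^2) / (2 g^2)).
    # Near-diagonal entries are ~0, off-diagonal entries approach 1.
    n = torch.arange(Nt, dtype=torch.float32).unsqueeze(1) / Nt
    s = torch.arange(Ns, dtype=torch.float32).unsqueeze(0) / Ns
    return 1.0 - torch.exp(-((n - s) ** 2) / (2 * g ** 2))

def guided_attention_loss(attn_probs, g=0.2, lam=1.0):
    # attn_probs: attention weights of shape (batch, Nt, Ns), assumed non-negative.
    # Penalizes attention mass that falls far from the diagonal.
    Nt, Ns = attn_probs.shape[-2:]
    W = guided_attention_mask(Nt, Ns, g).to(attn_probs.device)
    return lam * (attn_probs * W).mean()
```

Note that softmax attention weights are already non-negative, so the `.abs()` in the snippet above is redundant; also, padded time steps are usually masked out of the mean so the penalty isn't diluted by padding.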
Hi, did you get this working in the end? While training on the LJSpeech dataset, I notice that the diagonal alignment doesn't appear in the decoder attention or the encoder-decoder attention, but only in the encoder attention, even around 160K iterations. Please let me know if you had similar issues. Many thanks!
Sorry, I abandoned this project long ago. >_<