Speech

Question

deepconsc opened this issue 3 years ago · 4 comments

deepconsc commented 3 years ago

Great work!

I'm planning to run a feed-forward Transformer network in the speech domain overnight with MADGRAD.
Any heads-up?

Answer 1 · 2021-03-31T18:27:26.000Z

My main suggestion would be to make sure you try less weight decay than you would normally use (if any).
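
As a rough illustration of that suggestion, here is a minimal sketch, assuming the `madgrad` package is installed and exposes `MADGRAD(params, lr, momentum, weight_decay, eps)`. The helper names `weight_decay_candidates` and `build_madgrad`, and the scaling factors, are hypothetical and only there for illustration; they are not recommended values.

```python
def weight_decay_candidates(usual_wd, factors=(0.0, 0.01, 0.1)):
    """Return weight-decay values to try, all below the usual setting.

    The factors are illustrative assumptions, not tuned recommendations.
    Starting at 0.0 covers the "no weight decay" case.
    """
    return [usual_wd * f for f in factors]


def build_madgrad(params, weight_decay=0.0, lr=1e-2, momentum=0.9):
    """Build a MADGRAD optimizer with weight decay off unless requested."""
    # Imported lazily so the helper above works without the package installed.
    from madgrad import MADGRAD  # pip install madgrad

    return MADGRAD(
        params,
        lr=lr,
        momentum=momentum,
        weight_decay=weight_decay,
        eps=1e-6,
    )


# Usage (assumes an existing torch.nn.Module called `model`):
#   for wd in weight_decay_candidates(usual_wd=1e-2):
#       optimizer = build_madgrad(model.parameters(), weight_decay=wd)
#       ...train and compare validation loss...
```

Starting the sweep at zero and only then moving to small fractions of the usual value keeps the comparison against an Adam baseline simple.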

Answer 2 · 2021-03-31T18:40:43.000Z

No weight decay it is :) Thank you!
I'll provide feedback after the initial run.

Answer 3 · 2021-04-01T19:37:30.000Z

@adefazio It worked really well during pretraining. After the discriminator was enabled, it didn't progress at the same rate as pure Adam. I should mention that during pretraining MADGRAD actually did a better job than Adam: it helped model pitch, energy, and duration really well.
GAN-based training would clearly need better tuning of MADGRAD, but it looks promising!

Answer 4 · 2021-04-01T19:49:23.000Z

Thanks for the info! Interesting result.