twidddj/tf-wavenet_vocoder

Parallel Wavenet-Vocoder

twidddj opened this issue · 14 comments

Planned TODO

  • KL + Power - Single speaker

Properties not specified in the paper

  • Number of samples for the loss (we may be limited by GPU memory)
  • Number of mixture components for the IAF layers
  • Averaging method for the power loss
    • e.g. a plain reduce_mean over the time axis, a moving average, or ..
  • .. (please let us know of any others)
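To make the third open question concrete, here is a small NumPy sketch of two plausible readings of "averaging method for the power loss": a plain mean over the time (frame) axis versus an exponential moving average of frame powers. The frame length, hop, and decay are arbitrary stand-ins, not values from the paper or this repo.

```python
import numpy as np

def frame_power(x, frame_len=256, hop=128):
    """Per-frame signal power (a crude proxy for |STFT|^2 averaged over bins)."""
    n_frames = 1 + (len(x) - frame_len) // hop
    frames = np.stack([x[i * hop:i * hop + frame_len] for i in range(n_frames)])
    return np.mean(frames ** 2, axis=1)  # shape: (n_frames,)

def power_loss_mean(student, teacher, **kw):
    """Option 1: plain reduce_mean over the time (frame) axis."""
    return np.mean((frame_power(student, **kw) - frame_power(teacher, **kw)) ** 2)

def power_loss_ema(student, teacher, decay=0.9, **kw):
    """Option 2: exponentially smooth frame powers before comparing them."""
    def ema(p):
        out, acc = [], p[0]
        for v in p:
            acc = decay * acc + (1 - decay) * v
            out.append(acc)
        return np.array(out)
    ps = ema(frame_power(student, **kw))
    pt = ema(frame_power(teacher, **kw))
    return np.mean((ps - pt) ** 2)

t = np.linspace(0, 1, 4096)
teacher = np.sin(2 * np.pi * 220 * t)
student = 0.8 * np.sin(2 * np.pi * 220 * t)  # quieter student -> nonzero loss
print(power_loss_mean(student, teacher), power_loss_ema(student, teacher))
```

Both options are zero when student and teacher powers match; the EMA variant mainly changes how sharply local power mismatches are penalized.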

Other implementations

Sadly, many details are missing from the paper; as far as I can tell, nobody has been able to reproduce the results.

Above all, I'm not sure mel-spectrograms are a good substitute for the linguistic features used in the paper. We may need additional constraints to make up for the weaknesses of mel features.

I don't think mel features are the key problem; in my view, the IAF and probability density distillation are critical for quality.
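For readers unfamiliar with the IAF student mentioned above: it turns white noise into audio through a stack of affine flows, so every sample can be produced in parallel. The sketch below uses random stand-ins for the shift/scale networks (in the real model they are WaveNets conditioned on earlier noise samples and the mel features); shapes and flow count are illustrative only.

```python
import numpy as np

rng = np.random.default_rng(0)
T = 1000          # number of audio samples
n_flows = 4       # stacked IAF flows, as in parallel WaveNet

# Start from noise; each flow shifts and scales it.
z = rng.standard_normal(T)
log_det = np.zeros(T)  # running log|dx/dz|, needed for the student's entropy term

for _ in range(n_flows):
    # Stand-ins: in the real model, mu and log_s come from an autoregressive
    # network over z[<t] plus conditioning features, not from random draws.
    mu = 0.1 * rng.standard_normal(T)
    log_s = 0.05 * rng.standard_normal(T)
    z = z * np.exp(log_s) + mu      # affine transform x = z * s + mu
    log_det += log_s                # log-determinant accumulates across flows

x = z  # final waveform sample; no ancestral (sample-by-sample) loop needed
```

The key point is the last line: unlike the teacher WaveNet, generation requires no sequential loop over time, which is what makes the student fast.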

@neverjoe could you tell me how to connect the teacher model to evaluate the student? Are they both trained in one session? Thanks.

Hi! How do you keep the teacher's network parameters from being updated during training? Thanks.

xuerq commented

@maozhiqiang
Use "tf.stop_gradient" in TensorFlow.
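To illustrate the idea behind that answer: the teacher's output is treated as a constant inside the distillation loss, so gradients only flow into the student. Below is a NumPy sketch of a per-step KL term between categorical outputs; the names and toy logits are mine, and the comment marks where `tf.stop_gradient` would go in actual TensorFlow code.

```python
import numpy as np

def log_softmax(a):
    """Numerically stable log-softmax over the last axis."""
    a = a - a.max(axis=-1, keepdims=True)
    return a - np.log(np.exp(a).sum(axis=-1, keepdims=True))

def kl_categorical(student_logp, teacher_logp):
    """KL(student || teacher) per time step, summed over the class axis.

    In TensorFlow you would wrap the teacher's log-probs in
    tf.stop_gradient(...) so no gradient reaches the teacher; here the
    teacher term is just a constant array, which has the same effect.
    """
    p = np.exp(student_logp)
    return np.sum(p * (student_logp - teacher_logp), axis=-1)

# Toy distributions over 4 classes at 3 time steps (illustrative values).
logits_s = np.array([[1.0, 0.0, 0.0, 0.0]] * 3)
logits_t = np.array([[1.0, 0.0, 0.0, 0.0]] * 3)

kl = kl_categorical(log_softmax(logits_s), log_softmax(logits_t))
print(kl)  # zero here, since student and teacher match exactly
```

An alternative to `tf.stop_gradient` is to pass only the student's variables as `var_list` to `optimizer.minimize(...)`, so the teacher's weights are never updated even though gradients are defined for them.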

Thank you, @xuerq!

xuerq commented

@neverjoe I'm still working on it; the audio sampled from the student is not as good as the teacher's.

@xuerq @neverjoe @twidddj How do you use the teacher model to evaluate the student's output? Do you assess it through the training process or through the generative process? Thanks!

@twidddj have you gotten reasonable results with parallel WaveNet?

Hi @maozhiqiang, we are still trying to get better results. We have some results, but they're not good enough yet. I attached them here. Thanks for your interest in our project!

Hi @twidddj! Thank you very much for your reply! Did you use the KL loss?