L0SG/WaveFlow

Online speech synthesis and server code

lalimili6 opened this issue · 1 comment

Hi,
Can you share your test waveforms? Are they like the samples at https://waveflow-demo.github.io/?
Another question: is there a server-side synthesizer?
Have you compared the synthesis time with Tacotron (like Mozilla's)? Is it faster?
Best regards.

L0SG commented

Sorry for the late reply. Here's a zipped waveform sample of the trained model with 128 residual channels. I guess it's similar to their results.

Since I'm currently a graduate student, I don't have the resources to host a deployment server. You may need to clone the repository and try training the model with the released code.

Currently, the model (with height=8 and 64 channels) synthesizes at ~93 kHz on a V100, which I believe is around 2x slower than the results reported in the paper. I'm not sure why, but there may still be some redundant ops in the current implementation.
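To put the ~93 kHz figure in perspective, here is a rough sketch of how a synthesis rate translates into a real-time factor. The 22,050 Hz output sample rate is an assumption (it is not stated in this thread), and the function names are hypothetical helpers, not part of the repository.

```python
def real_time_factor(synthesis_rate_hz: float, audio_sample_rate_hz: float) -> float:
    """Return how many seconds of audio are generated per second of compute."""
    return synthesis_rate_hz / audio_sample_rate_hz


def synthesis_time_seconds(audio_seconds: float,
                           synthesis_rate_hz: float,
                           audio_sample_rate_hz: float) -> float:
    """Return the compute time needed to generate `audio_seconds` of audio."""
    total_samples = audio_seconds * audio_sample_rate_hz
    return total_samples / synthesis_rate_hz


# ~93 kHz is the rate reported above; 22,050 Hz is an assumed output sample rate.
SYNTHESIS_RATE_HZ = 93_000
ASSUMED_SAMPLE_RATE_HZ = 22_050

rtf = real_time_factor(SYNTHESIS_RATE_HZ, ASSUMED_SAMPLE_RATE_HZ)
ten_sec = synthesis_time_seconds(10.0, SYNTHESIS_RATE_HZ, ASSUMED_SAMPLE_RATE_HZ)
print(f"real-time factor: {rtf:.2f}x")
print(f"time to synthesize 10 s of audio: {ten_sec:.2f} s")
```

Under that assumption the vocoder runs a bit over 4x faster than real time on the GPU. A different output sample rate changes the factor proportionally.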

If sampling speed is your primary goal, you may be interested in LPCNet from Mozilla, which also provides a heavily optimized codebase.