gpt-omni/mini-omni

Audio Generation is Slow

MarcoFerreiraPerson opened this issue · 2 comments

Hello,

Love what you guys have done! However, when I run it on an RTX 4090 GPU or on an A100, I get similar inference speeds for audio streaming. Any ideas on why this is the case?

I think the speech portion might be running on the CPU, do you guys have any suggestions on how to move it to GPU or any other solutions?

Thank you

Yes, it's weird, but the speed on a 3090 is also similar. The device is set throughout the code, so the whole model should run on the GPU. Thank you!
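A quick way to double-check the claim above is to inspect where the model's parameters actually live. This is a minimal sketch using a stand-in `nn.Linear`; substitute the actual mini-omni model object for `model`:

```python
import torch
import torch.nn as nn

# Stand-in for the real model; replace with the loaded mini-omni model.
model = nn.Linear(8, 8)
if torch.cuda.is_available():
    model = model.to("cuda")

# Collect the device type of every parameter tensor.
devices = {p.device.type for p in model.parameters()}
print(devices)  # {'cuda'} if everything was moved, {'cpu'} otherwise
```

If the set contains `'cpu'`, some submodule was never moved to the GPU; note that even with all weights on the GPU, CPU-side pre/post-processing (e.g. audio encoding/decoding) can still dominate latency.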

Hi @MarcoFerreiraPerson, what do you mean by the speech portion? We may check on that later.
For the inference speed, the current model is the original fp32 version, and we haven't performed in-depth inference optimization, so the speed is not optimal at present.
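Since the released model runs in fp32, one common first optimization is mixed-precision inference with `torch.autocast`. This is only a sketch under the assumption of a standard PyTorch module, not the project's actual inference path; `model` and the dtype choice are illustrative:

```python
import torch
import torch.nn as nn

# Stand-in model and input; replace with the real model and audio features.
model = nn.Linear(512, 512)
x = torch.randn(4, 512)

device = "cuda" if torch.cuda.is_available() else "cpu"
model, x = model.to(device), x.to(device)

# float16 is typical on GPU; bfloat16 is the supported autocast dtype on CPU.
amp_dtype = torch.float16 if device == "cuda" else torch.bfloat16

with torch.inference_mode():
    with torch.autocast(device_type=device, dtype=amp_dtype):
        y = model(x)

print(y.dtype)  # reduced-precision dtype inside the autocast region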
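placeholder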