gpustack/vox-box

Distorted audio generated with the bark model

After selecting a speaker, the generated audio is incorrect.

For example, audio generated with 'v2/en_speaker_3' sometimes comes out as a female voice when it should be a male voice, and sometimes the audio is distorted.

Environment:

  • vox-box version: 0.0.2
  • model: Hugging Face/suno/bark