lucidrains/voicebox-pytorch

Audio samples?

blx0102 opened this issue · 2 comments

Great work here!
It seems you have already combined voicebox with spear-tts. Could you provide some resulting audio samples?

@blx0102 Lucas has already shared some early audio samples with me. Seems to work.

@blx0102 It's still early days here as we dial in the training and inference, but here's an early sample with the prompt that was used for the semantic tokens. This is a 93M param model that's done about 200k training steps at an effective batch size of 16 on LibriTTS-R.