VoiceBox Training

Question

yiwei0730 opened this issue a year ago · 3 comments

If I want to train the models in this package, do I need to run spear-tts first to obtain the text-to-semantic model before running voicebox, or can I train the voicebox semantic model and the main model together directly?

Answer 1 · 2023-10-11T13:49:01.000Z

yes, at the moment it requires three models across three repositories. so unless you are an exceptional engineer or scientist (like Lucas), you will have trouble getting it all working in concert. this isn't something that works by simply running a script yet

give me more time to think about how to weave this all together

Answer 2 · 2023-10-11T13:50:54.000Z

@yiwei0730 on the other hand, if you want to test out unconditional training, then you should be able to get it working quite easily with just the base model in this repository alone

Answer 3 · 2023-10-11T13:53:11.000Z

to answer your original question, you need a trained text-to-semantic model from spear-tts, which itself requires a separate three-step training process
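
The ordering described in the answers can be sketched as a small dependency graph. This is only an illustration: the stage names are hypothetical labels rather than identifiers from either repository, the three spear-tts steps are assumed to run one after another, and nothing here calls the actual training code. It only shows that text-conditioned training has prerequisites while unconditional training does not.

```python
from graphlib import TopologicalSorter

# Hypothetical stage names. Each stage maps to the stages that must be
# trained before it. The linear spear-tts chain is an assumption; the
# thread only says the process has three steps.
PREREQUISITES = {
    "spear_tts_step_1": set(),
    "spear_tts_step_2": {"spear_tts_step_1"},
    "spear_tts_step_3": {"spear_tts_step_2"},
    # Text-conditioned voicebox needs the trained text-to-semantic model.
    "voicebox_text_conditioned": {"spear_tts_step_3"},
    # Unconditional training uses only the base model, so no prerequisites.
    "voicebox_unconditional": set(),
}


def training_plan(target: str) -> list[str]:
    """Return the stages to train, in order, ending with `target`."""
    needed: dict[str, set[str]] = {}
    stack = [target]
    # Collect the target and everything it transitively depends on.
    while stack:
        stage = stack.pop()
        if stage not in needed:
            needed[stage] = PREREQUISITES[stage]
            stack.extend(PREREQUISITES[stage])
    # static_order() yields prerequisites before the stages that need them.
    return list(TopologicalSorter(needed).static_order())


if __name__ == "__main__":
    print(training_plan("voicebox_unconditional"))
    print(training_plan("voicebox_text_conditioned"))
```

Calling `training_plan("voicebox_unconditional")` returns a single stage, while the text-conditioned target pulls in all three assumed spear-tts steps first.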