VoiceBox Training
yiwei0730 opened this issue · 3 comments
If I want to train this package of models, do I need to run spear-tts first to obtain the text-to-semantic model before running voicebox, or can I directly run the voicebox semantic model and train the main model together?
yes, at the moment it requires three models across three repositories, so unless you are an exceptional engineer or scientist (like Lucas), you will have trouble getting it all working in concert. this isn't something that works by just running a script yet
give me more time to think about how to weave this all together
@yiwei0730 on the other hand, if you want to test out unconditional training, you should be able to get it working quite easily with just the base model in this repository
to answer your original question, you need a trained text-to-semantic model from spear-tts, which itself requires a three-step training process