Generic Text-to-Speech Inference
GreenGarnets opened this issue · 1 comment
GreenGarnets commented
As I understand it, Mellotron builds on Tacotron2 and conditions synthesis on a reference audio file or MusicXML to perform style transfer. If there is no reference file, is it possible to get plain TTS output instead? I looked through model.py but did not find anything that seemed relevant, which is why I'm asking.
rafaelvalle commented
Please rephrase your question; it is not clear to me what you are asking.